How to Build a Customer Support Bot in 100 Lines

A production-quality customer support bot needs four things: a knowledge base it can search, a fast LLM to generate responses, streaming so users don’t wait, and a way to escalate when it doesn’t know the answer. Here’s all four in under 100 lines of TypeScript, plus a small HTML widget.

What we’re building

Ingest your documentation into a vector index (one-time setup)
Accept user questions via API
Retrieve relevant chunks from the docs
Generate a grounded, streamed response
Escalate to a human when confidence is low

Setup

npm install @neureus/sdk

NEUREUS_API_KEY=nr_your_key_here

Step 1: Ingest your documentation (run once)

// scripts/ingest-docs.ts
import { NeureuAI } from '@neureus/sdk';

const client = new NeureuAI({ apiKey: process.env.NEUREUS_API_KEY! });

const docs = [
  'https://yourproduct.com/docs/getting-started',
  'https://yourproduct.com/docs/billing',
  'https://yourproduct.com/docs/integrations',
  'https://yourproduct.com/docs/troubleshooting',
];

for (const url of docs) {
  const { documentId, chunks } = await client.rag.ingest({ url, chunkSize: 400, overlap: 50 });
  console.log(`Ingested ${url}: ${chunks} chunks (id: ${documentId})`);
}

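Run once: npx ts-node scripts/ingest-docs.ts. The script uses top-level await, so it has to run as an ES module (or through a runner such as tsx). Re-run when docs change.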
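
To make the chunkSize and overlap parameters concrete, here is a minimal sketch of overlapping chunking. It is not the SDK's implementation: the chunkWords helper is hypothetical, and it assumes sizes are counted in whitespace-separated words, whereas the service may well count tokens or characters.

```typescript
// Hypothetical helper: splits text into chunks of `chunkSize` words, where
// each chunk repeats the last `overlap` words of the previous chunk.
// Assumption: sizes are measured in words; the real service may use tokens.
function chunkWords(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError('Require chunkSize > 0 and 0 <= overlap < chunkSize');
  }
  const words = text.split(/\s+/).filter(Boolean);
  const step = chunkSize - overlap; // distance between consecutive chunk starts
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= words.length) break; // this chunk reached the end
  }
  return chunks;
}
```

With chunkSize: 400 and overlap: 50, consecutive chunks start 350 units apart, so a passage no longer than the overlap is always contained whole in at least one chunk even when it straddles a boundary.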

Step 2: The support bot endpoint (~70 lines)

// api/support.ts
import { NeureuAI } from '@neureus/sdk';

const client = new NeureuAI({ apiKey: process.env.NEUREUS_API_KEY! });

const SYSTEM_PROMPT = `You are a helpful customer support assistant for YourProduct.

Rules:
1. Answer only from the provided documentation context. Do not make up features or policies.
2. If you cannot find the answer in the context, say "I don't have specific information about that" and suggest contacting support.
3. Be concise. Users are frustrated when they have to read long answers.
4. Include a direct answer in the first sentence.

If the user's question falls into any of these categories, say "let me connect you with a human agent":
- Billing disputes or refund requests
- Account security issues
- Complaints about specific team members
- Legal questions`;

interface SupportRequest {
  question: string;
  conversationHistory?: Array<{ role: 'user' | 'assistant'; content: string }>;
  stream?: boolean;
}

export async function handleSupportRequest(req: SupportRequest) {
  const { question, conversationHistory = [], stream = true } = req;

  // 1. Check if escalation is needed before spending a retrieval or generation call
  const needsHuman = /refund|billing dispute|charge|fraud|security breach|legal|lawsuit/i.test(question);
  if (needsHuman) {
    return {
      text: "I'd like to connect you with a human agent for this. Please hold on. Someone will be with you shortly.",
      escalated: true,
      sources: [],
    };
  }

  // 2. Retrieve relevant documentation
  const { results } = await client.rag.query({ query: question, topK: 5 });

  const context = results.length > 0
    ? results.map(r => `[Source: ${r.metadata?.url ?? 'docs'}]\n${r.content}`).join('\n\n---\n\n')
    : 'No specific documentation found for this question.';

  // 3. Build the message array
  const messages = [
    ...conversationHistory.slice(-6),  // keep last 3 turns (6 messages)
    { role: 'user' as const, content: question },
  ];

  const systemWithContext = `${SYSTEM_PROMPT}\n\n## Relevant documentation\n\n${context}`;

  // 4. Generate response (streaming or not)
  if (stream) {
    const responseStream = await client.ai.stream({
      model: 'claude-haiku-4-5',  // fast + cheap for support
      system: systemWithContext,
      messages,
    });
    return { stream: responseStream, sources: results.map(r => r.metadata?.url).filter(Boolean) };
  }

  const { text } = await client.ai.chat({
    model: 'claude-haiku-4-5',
    system: systemWithContext,
    messages,
  });

  // 5. Detect low-confidence answers
  const lowConfidence = text.includes("I don't have specific information") || 
                        text.includes("I'm not sure") ||
                        results.length === 0;

  return {
    text,
    escalated: false,
    lowConfidence,
    sources: results.map(r => r.metadata?.url).filter(Boolean),
  };
}
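
The single escalation regex in the handler is easy to outgrow: it matches the substring "charge" inside harmless questions such as "How do I recharge my balance?", and it does not cover every category listed in the system prompt. One possible refinement is a small table of rules that use word boundaries and report which category matched. This is a sketch under assumptions: the escalationCategory helper and its keyword lists are illustrative, not part of the SDK, and would need tuning against real tickets.

```typescript
type EscalationCategory = 'billing' | 'security' | 'complaint' | 'legal';

// Illustrative keyword rules. Each pattern is anchored with \b so that
// "recharge" or "illegal" do not trigger the billing or legal rules.
const ESCALATION_RULES: Array<{ category: EscalationCategory; pattern: RegExp }> = [
  { category: 'billing', pattern: /\b(refunds?|chargebacks?|billing dispute|charged twice|double[- ]charged|overcharged)\b/i },
  { category: 'security', pattern: /\b(fraud|hacked|security breach|unauthori[sz]ed (access|login))\b/i },
  { category: 'complaint', pattern: /\bcomplaints? about\b/i },
  { category: 'legal', pattern: /\b(legal|lawsuit|lawyer|attorney|subpoena)\b/i },
];

// Returns the first matching category, or null when the bot can answer itself.
function escalationCategory(question: string): EscalationCategory | null {
  for (const { category, pattern } of ESCALATION_RULES) {
    if (pattern.test(question)) return category;
  }
  return null;
}
```

Returning the category rather than a boolean lets the caller route a billing dispute and a security report to different queues.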

Step 3: Wire it up as an API route

Next.js App Router:

// app/api/support/route.ts
import { NextRequest } from 'next/server';
import { handleSupportRequest } from '@/api/support';

export const runtime = 'edge';

export async function POST(req: NextRequest) {
  const body = await req.json();
  const result = await handleSupportRequest({ ...body, stream: true });

  if ('stream' in result) {
    return new Response(result.stream, {
      headers: { 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache' },
    });
  }

  return Response.json(result);
}
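
The route above spreads the parsed JSON body straight into handleSupportRequest, so a malformed payload would reach the retrieval call unchecked. A minimal validation sketch follows. The parseSupportRequest helper and the 2,000-character limit are assumptions for illustration, not part of the original handler.

```typescript
type ChatTurn = { role: 'user' | 'assistant'; content: string };

interface ParsedSupportRequest {
  question: string;
  conversationHistory: ChatTurn[];
}

const MAX_QUESTION_LENGTH = 2000; // arbitrary illustrative limit

function isChatTurn(value: unknown): value is ChatTurn {
  if (typeof value !== 'object' || value === null) return false;
  const turn = value as Record<string, unknown>;
  return (turn.role === 'user' || turn.role === 'assistant') && typeof turn.content === 'string';
}

// Returns a cleaned request, or null when the body is not usable.
function parseSupportRequest(body: unknown): ParsedSupportRequest | null {
  if (typeof body !== 'object' || body === null) return null;
  const { question, conversationHistory } = body as Record<string, unknown>;
  if (typeof question !== 'string') return null;
  const trimmed = question.trim();
  if (trimmed.length === 0 || trimmed.length > MAX_QUESTION_LENGTH) return null;
  // Drop malformed turns rather than rejecting the whole request.
  const history: ChatTurn[] = Array.isArray(conversationHistory)
    ? conversationHistory.filter(isChatTurn)
    : [];
  return { question: trimmed, conversationHistory: history };
}
```

In the route, a null result could map to a 400 response before any SDK call is made. Because the history comes from the browser, filtering to the two expected roles also keeps a client from injecting turns with any other role.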

Hono (an Express handler would use req.body and res.json instead):

app.post('/api/support', async (c) => {
  const body = await c.req.json();
  const result = await handleSupportRequest({ ...body, stream: false });
  return c.json(result);
});

Step 4: The HTML widget

<!-- Embeddable support widget (about 40 lines) -->
<div id="support-widget">
  <div id="messages"></div>
  <form id="support-form">
    <input id="question" placeholder="How can I help you?" />
    <button type="submit">Send</button>
  </form>
</div>

<script>
const form = document.getElementById('support-form');
const messages = document.getElementById('messages');
const history = [];

form.addEventListener('submit', async (e) => {
  e.preventDefault();
  const question = document.getElementById('question').value;
  document.getElementById('question').value = '';
  
  addMessage('user', question);
  const assistantEl = addMessage('assistant', '');

  const res = await fetch('/api/support', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ question, conversationHistory: history }),
  });

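  let fullText = '';

  if ((res.headers.get('Content-Type') || '').includes('application/json')) {
    // Escalations and non-streaming routes return plain JSON, not an event stream
    const data = await res.json();
    fullText = data.text || '';
    assistantEl.textContent = fullText;
  } else {
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters split across chunks intact
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop();  // hold back a partial line until the next chunk arrives
      for (const line of lines) {
        if (!line.startsWith('data: ')) continue;
        try { const d = JSON.parse(line.slice(6)); if (d.text) { fullText += d.text; assistantEl.textContent = fullText; } } catch {}
      }
    }
  }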

  history.push({ role: 'user', content: question }, { role: 'assistant', content: fullText });
});

function addMessage(role, text) {
  const el = document.createElement('div');
  el.className = `message ${role}`;
  el.textContent = text;
  messages.appendChild(el);
  return el;
}
</script>

What you get

RAG-grounded answers: responses are generated from retrieved passages of your actual documentation, which reduces (but does not eliminate) hallucination
Streaming: users see the response appear word by word
Conversation history: the last 3 turns are included for context
Automatic escalation: questions matching billing, legal, or security keywords skip generation and come back with escalated: true, ready to hand off to a human agent
Source attribution: sources[] in the response shows which docs were used
Low-confidence detection: non-streaming responses carry a lowConfidence flag when the bot is uncertain or retrieval returns nothing, so you can trigger escalation
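
In handleSupportRequest the low-confidence check runs only on the non-streaming path, because the streaming branch returns before the full text exists. One way to cover both paths is to factor the check into a function and call it once the streamed text has been accumulated. A sketch, with isLowConfidence as a hypothetical helper name:

```typescript
// Phrases the model is expected to use when it lacks an answer,
// taken from the check in step 5 of the handler.
const UNCERTAIN_PHRASES = ["i don't have specific information", "i'm not sure"];

// Hypothetical helper mirroring the handler's check, made case-insensitive
// and tolerant of typographic apostrophes in the model output.
function isLowConfidence(text: string, retrievedCount: number): boolean {
  if (retrievedCount === 0) return true; // nothing retrieved, so nothing to ground on
  const normalized = text.toLowerCase().replace(/\u2019/g, "'");
  return UNCERTAIN_PHRASES.some((phrase) => normalized.includes(phrase));
}
```

For streamed responses, the server could accumulate chunks as they pass through and send the flag as a final event, though how that fits the SDK's stream format is not covered here.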

Extending it

Add handoff to Intercom/Zendesk:

if (result.escalated || result.lowConfidence) {
  await createZendeskTicket({ question, conversation: history });
}
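
createZendeskTicket is left undefined above. As a starting point, the sketch below builds the request body such a helper might send. It assumes the { ticket: { subject, comment: { body } } } shape used by Zendesk's ticket-creation endpoint. buildTicketPayload, the transcript format, and the support-bot tag are illustrative choices, and the payload should be checked against the Zendesk documentation before use.

```typescript
type ChatTurn = { role: 'user' | 'assistant'; content: string };

// Hypothetical helper: formats the conversation as a plain-text transcript
// and wraps it in a ticket payload. It only builds the body; sending it
// (URL, authentication, retries) is left to the caller.
function buildTicketPayload(question: string, conversation: ChatTurn[]) {
  const transcript = conversation
    .map((turn) => `${turn.role === 'user' ? 'Customer' : 'Bot'}: ${turn.content}`)
    .join('\n');
  // Keep the subject short; the full question is repeated in the body.
  const subject = question.length > 80 ? `${question.slice(0, 77)}...` : question;
  return {
    ticket: {
      subject,
      comment: { body: `Escalated from support bot.\n\nQuestion: ${question}\n\n${transcript}` },
      tags: ['support-bot'],
    },
  };
}
```

Including the transcript in the first comment means the agent who picks up the ticket does not have to ask the customer to repeat themselves.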

Use a smarter model for complex questions:

const model = isComplexTechnicalQuestion(question) 
  ? 'claude-sonnet-4-6' 
  : 'claude-haiku-4-5';
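
isComplexTechnicalQuestion is not defined in this post. A rough heuristic sketch follows. The signals and thresholds are assumptions to tune against real traffic, not something the SDK provides.

```typescript
// Illustrative list of terms that suggest an integration or debugging question.
const TECHNICAL_TERMS = /\b(api|webhook|oauth|sdk|timeout|stack trace|exception|cors|ssl|sso)\b/gi;

// Heuristic only: treats a question as complex when it is long, contains
// code or error fragments, or mentions at least two technical terms.
function isComplexTechnicalQuestion(question: string): boolean {
  const wordCount = question.split(/\s+/).filter(Boolean).length;
  // Code fence, a stack-trace frame like "at fn (file.js:10:5)", or an Error class name
  const hasCode = /```|\bat \S+ \(.*:\d+:\d+\)|\b[A-Za-z]+Error\b/.test(question);
  const termCount = (question.match(TECHNICAL_TERMS) ?? []).length;
  return hasCode || wordCount > 60 || termCount >= 2;
}
```

A keyword heuristic like this is cheap and runs before any model call. It will misroute some questions, so it is worth logging which model handled each request.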

Add feedback tracking:

// After the user rates the response:
await client.analytics.track({ event: 'support_rating', rating, question, wasEscalated: result.escalated });

Total line count

ingest-docs.ts: 12 lines
support.ts: 65 lines
API route: 15 lines
HTML widget: 40 lines
Total: ~132 lines (close enough for a blog post)

The key insight: the hard parts (embedding, retrieval, streaming, generation) are one API. The code you write is the product logic — escalation rules, conversation management, UI.

Try it yourself: the free tier includes 50 document ingestions and 500 Neurons/month. The support bot above uses 2–5 Neurons per question depending on context size, so the free tier covers roughly 100 to 250 questions a month.