API reference

Ingest a URL (cURL)

curl -X POST https://app.neureus.ai/rag/ingest \
  -H "Authorization: Bearer nr_your_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-docs.com/getting-started"}'

# Or raw content
curl -X POST https://app.neureus.ai/rag/ingest \
  -H "Authorization: Bearer nr_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Our return policy allows returns within 30 days...",
    "title": "Return Policy"
  }'

Query (cURL)

curl -X POST https://app.neureus.ai/rag/query \
  -H "Authorization: Bearer nr_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy?",
    "model": "gpt-4o-mini",
    "k": 5
  }'

# Response
{
  "answer": "Returns are accepted within 30 days of purchase...",
  "sources": [
    { "documentId": "doc_abc", "title": "Return Policy", "excerpt": "..." }
  ],
  "logId": "log_xyz"
}
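
For callers not using the SDK, the same query can be sketched in TypeScript with a typed response. The interfaces below mirror the example response above. The helper names (queryRag, isRagQueryResponse, FetchLike) are hypothetical, and the sketch assumes the endpoint signals failure with a non-2xx status, which the example above does not confirm.

```typescript
// Shapes taken from the example request and response above.
interface RagSource {
  documentId: string;
  title: string;
  excerpt: string;
}

interface RagQueryResponse {
  answer: string;
  sources: RagSource[];
  logId: string;
}

// Minimal fetch-like signature so the sketch does not depend on DOM or Node typings.
// The global fetch in modern runtimes can be passed in for this parameter.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Runtime check that an unknown JSON value has the documented response shape.
function isRagQueryResponse(value: unknown): value is RagQueryResponse {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.answer === 'string' &&
    typeof v.logId === 'string' &&
    Array.isArray(v.sources) &&
    v.sources.every((s) => {
      if (typeof s !== 'object' || s === null) return false;
      const src = s as Record<string, unknown>;
      return (
        typeof src.documentId === 'string' &&
        typeof src.title === 'string' &&
        typeof src.excerpt === 'string'
      );
    })
  );
}

// Hypothetical helper: POSTs a query and validates the reply before returning it.
async function queryRag(
  fetchImpl: FetchLike,
  apiKey: string,
  query: string,
  k = 5,
): Promise<RagQueryResponse> {
  const res = await fetchImpl('https://app.neureus.ai/rag/query', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ query, model: 'gpt-4o-mini', k }),
  });
  // Assumption: errors surface as non-2xx HTTP statuses.
  if (!res.ok) throw new Error(`RAG query failed with HTTP ${res.status}`);
  const data: unknown = await res.json();
  if (!isRagQueryResponse(data)) throw new Error('Unexpected response shape');
  return data;
}
```

Validating the shape at the boundary means a changed or partial response fails loudly instead of surfacing later as an undefined field.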

TypeScript SDK

import { NeureuClient } from '@neureus/sdk';

const client = new NeureuClient({
  apiKey: process.env.NEUREUS_API_KEY!,
});

// Ingest
await client.rag.ingest({
  url: 'https://your-docs.com/api-reference',
});

// Query — use any provider for the answer
const result = await client.rag.query({
  query: 'How do I authenticate?',
  model: 'claude-haiku-4-5',  // or @wai/llama-3.3-70b (free)
});

console.log(result.answer);
// → "Authentication uses API keys passed as Bearer tokens..."
console.log(result.sources);
// → [{ documentId, title, excerpt }, ...]

Document management

// List all ingested documents
const docs = await client.rag.listDocuments();
// → [{ documentId, title, chunkCount, createdAt }, ...]

// Delete a document (removes all its chunks from Vectorize)
await client.rag.deleteDocument('doc_abc');
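
The two calls above can be combined to prune old documents. The sketch below is one way to pick which IDs to delete. It assumes createdAt is an ISO 8601 string, which the listing above does not specify, and staleDocumentIds is a hypothetical helper, not part of the SDK.

```typescript
// Shape taken from the listDocuments() comment above.
// Assumption: createdAt is an ISO 8601 timestamp string.
interface DocumentSummary {
  documentId: string;
  title: string;
  chunkCount: number;
  createdAt: string;
}

// Hypothetical helper: returns IDs of documents older than maxAgeDays.
function staleDocumentIds(
  docs: DocumentSummary[],
  maxAgeDays: number,
  now: Date = new Date(),
): string[] {
  const cutoff = now.getTime() - maxAgeDays * 24 * 60 * 60 * 1000;
  return docs
    // Unparseable dates yield NaN, which never compares below the cutoff,
    // so documents with unknown age are kept rather than deleted.
    .filter((d) => Date.parse(d.createdAt) < cutoff)
    .map((d) => d.documentId);
}

// Usage with the client from the SDK example:
//   const docs = await client.rag.listDocuments();
//   for (const id of staleDocumentIds(docs, 90)) {
//     await client.rag.deleteDocument(id);
//   }
```

Keeping the selection logic separate from the delete calls makes it easy to log or review the ID list before anything is removed.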

What's included

Zero setup

No vector DB to provision, no embedding model to deploy, no chunking library to configure. Two endpoints, /rag/ingest and /rag/query, cover ingestion and retrieval.

Auto-chunking

Documents are split at sentence boundaries automatically. There are no chunk_size parameters to tune.
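
To illustrate what splitting at sentence boundaries means, here is a minimal sketch of one possible approach. It is not the service's actual chunker, which is not documented here. The function name chunkBySentence and the maxChars limit are hypothetical and exist only for this illustration; they are not API parameters.

```typescript
// Illustrative only: groups whole sentences into chunks of at most maxChars
// characters, never cutting a sentence in the middle.
function chunkBySentence(text: string, maxChars = 200): string[] {
  // Naive sentence rule: a run of text ending in '.', '!' or '?', plus any
  // trailing text without a terminator. Abbreviations and decimals such as
  // "3.5" are split incorrectly by this rule.
  const sentences = (text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = '';
  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (candidate.length > maxChars && current) {
      // Adding this sentence would overflow: close the chunk and start a new one.
      chunks.push(current);
      current = sentence;
    } else {
      // Fits, or is a single oversized sentence that is kept whole.
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Because chunks end on sentence boundaries, each retrieved excerpt reads as complete text rather than a fragment cut at an arbitrary character offset.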

Semantic search

Dense vector similarity via Cloudflare Vectorize. Finds conceptually related content, not just keyword matches.
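
The idea behind dense vector similarity can be sketched in a few lines. In the hosted service this step runs inside Vectorize, so the code below is only an in-memory illustration. It assumes cosine similarity as the metric, and the names Chunk, cosineSimilarity, and topK are hypothetical.

```typescript
interface Chunk {
  id: string;
  vector: number[]; // embedding of the chunk text
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat it as unrelated to everything.
  return denom === 0 ? 0 : dot / denom;
}

// Scores every chunk against the query embedding and keeps the k best.
function topK(query: number[], chunks: Chunk[], k: number): { id: string; score: number }[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The k here plays the same role as the "k" field in the query request: it caps how many chunks are handed to the model as context.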

Any LLM for answers

Route to any of 10 providers for answer generation. Use free Workers AI or premium models — same endpoint.

Source attribution

Every answer includes a sources array with the document chunks used (documentId, title, excerpt). Cite sources, debug retrieval quality, audit answers.
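
One way to surface attribution to end users is to append a numbered source list to the answer. The sketch below assumes the sources shape from the query response example. formatWithCitations is a hypothetical helper, not part of the SDK.

```typescript
// Shape taken from the query response example.
interface Source {
  documentId: string;
  title: string;
  excerpt: string;
}

// Appends a numbered, de-duplicated source list to an answer.
function formatWithCitations(answer: string, sources: Source[]): string {
  // Several chunks can come from the same document; list each document once,
  // in the order it first appears.
  const seen = new Set<string>();
  const unique = sources.filter((s) => {
    if (seen.has(s.documentId)) return false;
    seen.add(s.documentId);
    return true;
  });
  if (unique.length === 0) return answer;
  const lines = unique.map((s, i) => `[${i + 1}] ${s.title} (${s.documentId})`);
  return `${answer}\n\nSources:\n${lines.join('\n')}`;
}
```

Passing result.answer and result.sources from a query call would yield text ready to display in a chat widget or log for auditing.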

Multi-tenant isolation

Each API key has its own document namespace. Serve multiple clients from one deployment.

Common use cases

Internal knowledge base

Ingest your company wiki, runbooks, and docs. Let employees ask natural-language questions and get answers with source links.

Customer support bot

Ingest your product documentation and FAQ, then connect the query endpoint to your support widget. Customers get instant answers grounded in your docs without waiting for a human agent.

Document Q&A

Ingest contracts, reports, or research papers. Extract specific information, summarize sections, or compare across multiple documents.

Research assistant

Ingest multiple sources on a topic. Query across all of them to synthesize findings, spot contradictions, and trace claims to sources.

RAG in Two API Calls

How it works