RAG API
Ingest a document or URL. Query with natural language. Get an LLM answer with source attribution. No vector DB, no embedding service, no chunking library — just two endpoints.
curl -X POST https://app.neureus.ai/rag/ingest \
-H "Authorization: Bearer nr_your_key" \
-H "Content-Type: application/json" \
-d '{"url": "https://your-docs.com/getting-started"}'
# Or raw content
curl -X POST https://app.neureus.ai/rag/ingest \
-H "Authorization: Bearer nr_your_key" \
-H "Content-Type: application/json" \
-d '{
"content": "Our return policy allows returns within 30 days...",
"title": "Return Policy"
}'
curl -X POST https://app.neureus.ai/rag/query \
-H "Authorization: Bearer nr_your_key" \
-H "Content-Type: application/json" \
-d '{
"query": "What is the return policy?",
"model": "gpt-4o-mini",
"k": 5
}'
# Response
{
"answer": "Returns are accepted within 30 days of purchase...",
"sources": [
{ "documentId": "doc_abc", "title": "Return Policy", "excerpt": "..." }
],
"logId": "log_xyz"
}
import { NeureuClient } from '@neureus/sdk';
const client = new NeureuClient({
apiKey: process.env.NEUREUS_API_KEY!,
});
// Ingest
await client.rag.ingest({
url: 'https://your-docs.com/api-reference',
});
// Query — use any provider for the answer
const result = await client.rag.query({
query: 'How do I authenticate?',
model: 'claude-haiku-4-5', // or @wai/llama-3.3-70b (free)
});
console.log(result.answer);
// → "Authentication uses API keys passed as Bearer tokens..."
console.log(result.sources);
// → [{ documentId, title, excerpt }, ...]
// List all ingested documents
const docs = await client.rag.listDocuments();
// → [{ documentId, title, chunkCount, createdAt }, ...]
// Delete a document (removes all its chunks from Vectorize)
await client.rag.deleteDocument('doc_abc');
No vector DB to provision, no embedding model to deploy, no chunking library to configure. Two endpoints.
Documents are split at sentence boundaries automatically. No chunk_size parameters to tune.
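To make "splitting at sentence boundaries" concrete, here is a minimal TypeScript sketch of one way it can be done. It is not the service's actual chunker: the function names `splitSentences` and `chunkBySentence` and the `maxChars` budget are hypothetical, and the punctuation-based regex is a simplification that would also split on decimals such as "3.5".

```typescript
// Hypothetical sketch: greedy sentence-boundary chunking.
// Not the service's real implementation; maxChars is an assumed internal budget.
function splitSentences(text: string): string[] {
  // Match runs of text ending in ., !, or ?, plus any trailing remainder.
  const matches = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) ?? [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

function chunkBySentence(text: string, maxChars = 200): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const sentence of splitSentences(text)) {
    // Start a new chunk when adding this sentence would exceed the budget.
    if (current && current.length + 1 + sentence.length > maxChars) {
      chunks.push(current);
      current = sentence;
    } else {
      current = current ? `${current} ${sentence}` : sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const policy =
  "Returns are accepted within 30 days. Items must be unused. Refunds take 5 business days.";
console.log(chunkBySentence(policy, 60));
// → two chunks: the first two sentences together, then the third
```

Because whole sentences are the unit, a chunk never ends mid-sentence; a single sentence longer than the budget simply becomes its own chunk.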
Dense vector similarity via Cloudflare Vectorize. Finds conceptually related content, not just keyword matches.
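As a rough illustration of how dense retrieval ranks content, the sketch below scores toy vectors by cosine similarity and keeps the top k. The search itself runs server-side, so none of this is code you need to write; the `Chunk` type, the `topK` helper, and the hand-made 3-dimensional vectors are assumptions for illustration (real embeddings have hundreds of dimensions).

```typescript
// Illustrative only: the platform performs this search server-side.
// Vectors here are hand-made 3-dimensional toys, not real embeddings.
interface Chunk {
  id: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  // Sort a copy in descending order of similarity to the query, then keep k.
  return [...chunks]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector)
    )
    .slice(0, k);
}

const chunks: Chunk[] = [
  { id: "returns", vector: [0.9, 0.1, 0.0] },
  { id: "shipping", vector: [0.1, 0.9, 0.0] },
  { id: "refunds", vector: [0.8, 0.2, 0.1] },
];
const queryVector = [1, 0, 0];
console.log(topK(queryVector, chunks, 2).map((c) => c.id));
// → [ 'returns', 'refunds' ]
```

Vectors that point in a similar direction score close to 1 even when the underlying texts share no keywords, which is why "refunds" ranks above "shipping" for a returns-style query here.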
Route to any of 10 providers for answer generation. Use free Workers AI or premium models — same endpoint.
Every answer includes the document chunks used. Cite sources, debug retrieval quality, audit answers.
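One way to surface that attribution to end users is to append a numbered source list to the answer. The sketch below assumes the response shape from the sample response earlier on this page (`answer`, `sources`, `logId`); `formatWithCitations` is a hypothetical helper, not part of the SDK, and the deduplication step assumes several returned chunks may belong to the same document.

```typescript
// Field names follow the sample response shown earlier on this page.
// formatWithCitations is a hypothetical helper, not an SDK function.
interface Source {
  documentId: string;
  title: string;
  excerpt: string;
}

interface RagAnswer {
  answer: string;
  sources: Source[];
  logId: string;
}

function formatWithCitations(result: RagAnswer): string {
  // Deduplicate by documentId (assumption: multiple chunks can share a document).
  const seen = new Set<string>();
  const lines: string[] = [];
  for (const s of result.sources) {
    if (seen.has(s.documentId)) continue;
    seen.add(s.documentId);
    lines.push(`[${lines.length + 1}] ${s.title} (${s.documentId})`);
  }
  return lines.length > 0
    ? `${result.answer}\n\nSources:\n${lines.join("\n")}`
    : result.answer;
}

const sample: RagAnswer = {
  answer: "Returns are accepted within 30 days of purchase.",
  sources: [
    { documentId: "doc_abc", title: "Return Policy", excerpt: "..." },
    { documentId: "doc_abc", title: "Return Policy", excerpt: "..." },
  ],
  logId: "log_xyz",
};
console.log(formatWithCitations(sample));
```

Logging `logId` alongside the formatted answer would make it easier to trace a questionable answer back to the chunks that produced it.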
Each API key has its own document namespace. Serve multiple clients from one deployment.
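A minimal sketch of what per-key isolation could look like in a multi-tenant backend: pick the API key for the calling tenant and build the query request with it. The endpoint and headers mirror the curl example on this page; the tenant names, placeholder keys, and the `buildQueryRequest` helper are hypothetical, and the assumption that you hold one key per tenant is ours.

```typescript
// Assumption: each tenant is issued its own API key.
// Tenant names and keys below are placeholders.
const tenantKeys: Record<string, string> = {
  acme: "nr_acme_key",
  globex: "nr_globex_key",
};

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildQueryRequest(tenant: string, query: string): PreparedRequest {
  const apiKey = tenantKeys[tenant];
  if (!apiKey) throw new Error(`Unknown tenant: ${tenant}`);
  return {
    url: "https://app.neureus.ai/rag/query",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // Same body fields as the curl query example on this page.
      body: JSON.stringify({ query, model: "gpt-4o-mini", k: 5 }),
    },
  };
}

const req = buildQueryRequest("acme", "What is the return policy?");
console.log(req.init.headers.Authorization);
// → "Bearer nr_acme_key"
// Pass req.url and req.init to fetch() to send the request.
```

Keeping key selection in one place means a query for one tenant cannot accidentally be sent with another tenant's key, and therefore cannot reach another tenant's documents.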
Ingest your company wiki, runbooks, and docs. Let employees ask natural-language questions and get answers with source links.
Ingest your product documentation and FAQ. Connect to your support widget — customers get instant, accurate answers without a human.
Ingest contracts, reports, or research papers. Extract specific information, summarize sections, or compare across multiple documents.
Ingest multiple sources on a topic. Query across all of them to synthesize findings, spot contradictions, and trace claims to sources.
Start free. 50 documents, no credit card required.