One endpoint, any provider

Change the model field to switch providers. The API, response format, and streaming behavior stay identical.

The 4-pass prompt preprocessor runs automatically before every request: it normalizes, structures, trims, and optionally compresses your messages before they reach the provider.

// TypeScript — works with any provider
const response = await fetch(
  'https://app.neureus.ai/ai/chat',
  {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer nr_your_api_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      // Switch providers by changing this string:
      model: 'claude-sonnet-4-6',
      // model: 'gpt-4o',
      // model: '@wai/llama-3.3-70b',  // free
      // model: 'gemini-2.5-flash',
      // model: 'deepseek-r1',
      messages: [
        { role: 'user', content: 'Explain RAG in plain English' }
      ],
      stream: true,  // SSE works across all providers
    }),
  }
);
// Response: OpenAI-compatible SSE stream

All supported providers

Workers AI models are always free. Paid models are priced 10% below OpenRouter.

Anthropic (top quality)

claude-opus-4-8
claude-sonnet-4-6
claude-haiku-4-5

OpenAI (most popular)

gpt-4o
gpt-4o-mini
o3
o4-mini

Google (long context)

gemini-2.5-pro
gemini-2.5-flash
gemini-2.0-flash

Meta Llama (free via Workers AI)

llama-3.3-70b
llama-3.1-8b

DeepSeek (reasoning)

deepseek-r1
deepseek-v3

Mistral (code + edge)

mistral-large-2411
mistral-small-3.1
codestral-2501

Cohere (RAG-optimized)

command-r-plus
command-r

Qwen (free via Workers AI)

qwen-2.5-72b
qwen-2.5-coder-32b

Perplexity (online search)

llama-3.1-sonar-large-128k-online

Cloudflare Workers AI (always free)

@wai/llama-3.3-70b
@wai/deepseek-r1-32b
@wai/nemotron-120b
@wai/qwq-32b

Bring your own provider keys

On the Scale plan and above, you can store your OpenAI or Anthropic API keys, encrypted per tenant with AES-GCM. Requests still route through Neureus's gateway but run at your own provider rate, and the preprocessor still cuts token usage by 10–30% on top of your direct pricing.

AES-GCM encryption per tenant — keys never stored in plaintext
Rotate keys via POST /ai/providers/:provider/rotate — zero downtime
BYOK keys take precedence over the global provider secrets: set them once and they apply across all your API calls

// Set your OpenAI key (Scale plan+)
await fetch(
  'https://app.neureus.ai/ai/providers/openai',
  {
    method: 'PUT',
    headers: {
      'Authorization': 'Bearer nr_key',
      'Content-Type': 'application/json',  // the body is JSON
    },
    body: JSON.stringify({ apiKey: 'sk-your-key' }),
  }
);

// Rotate to a new key (re-encrypts under the same data encryption key, DEK)
await fetch(
  'https://app.neureus.ai/ai/providers/openai/rotate',
  {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer nr_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ newApiKey: 'sk-new-key' }),
  }
);

What's included in the gateway

Multi-provider routing

Change providers by changing the model ID. claude-* → Anthropic, gpt-* → OpenAI, @wai/* → Workers AI (free). Same API call, automatic dispatch.
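The prefix-based dispatch described above can be pictured with a small lookup. This is only a sketch: the three prefixes come from the text, while the provider labels, the PREFIX_ROUTES table, and the providerFor function are hypothetical names, and the real gateway presumably has more rules (for example, o3 is an OpenAI model but does not start with gpt-).

```typescript
// Hypothetical provider labels; only the three prefixes below come from the page.
type Provider = 'anthropic' | 'openai' | 'workers-ai' | 'unknown';

const PREFIX_ROUTES: Array<[string, Provider]> = [
  ['claude-', 'anthropic'],
  ['gpt-', 'openai'],
  ['@wai/', 'workers-ai'],
];

// Returns the provider whose prefix matches the model ID, or 'unknown'.
function providerFor(model: string): Provider {
  for (const [prefix, provider] of PREFIX_ROUTES) {
    if (model.startsWith(prefix)) return provider;
  }
  return 'unknown';
}

console.log(providerFor('claude-sonnet-4-6'));
console.log(providerFor('@wai/llama-3.3-70b'));
```

From the caller's side none of this is visible: the request body stays the same and only the model string changes.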

SSE streaming

Set stream: true to get OpenAI-compatible Server-Sent Events from any provider — including Anthropic and Google — with a unified event format.
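Because the stream is OpenAI-compatible, a client can read text deltas the same way regardless of provider. The sketch below assumes the standard OpenAI chunk shape (choices[0].delta.content) and the data: [DONE] terminator; extractDeltas is a hypothetical helper name, and it expects complete lines, so a real reader would need to buffer partial lines across network chunks.

```typescript
// Extracts the concatenated text deltas from OpenAI-style SSE text.
function extractDeltas(sseText: string): { text: string; done: boolean } {
  let text = '';
  let done = false;
  for (const line of sseText.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue; // skip blank lines and other SSE fields
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') {
      done = true; // end-of-stream sentinel
      break;
    }
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? '';
  }
  return { text, done };
}

// Two content chunks followed by the terminator.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: [DONE]',
  '',
].join('\n');

console.log(extractDeltas(sample)); // text is "Hello", done is true
```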

Prompt preprocessor

A 4-pass pipeline runs before every request: normalize → structure → trim (for prompts over 6K tokens) → compress (opt-in, using Llama 3.1 8B). It cuts token count by 10–30%.
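To make the four passes concrete, here is a rough sketch of how such a pipeline could be composed. Everything in it is an assumption for illustration: the page does not say what each pass does internally, so the whitespace rules, the same-role merge used for "structure", the 4-characters-per-token estimate, and all function names are hypothetical. Only the pass order and the 6K-token threshold come from the text.

```typescript
interface Message { role: 'system' | 'user' | 'assistant'; content: string }

const TRIM_THRESHOLD_TOKENS = 6000; // the ">6K tokens" figure from the page

// Rough heuristic (about 4 characters per token), not the gateway's tokenizer.
const estimateTokens = (msgs: Message[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

// Pass 1: collapse runs of whitespace and drop empty messages.
function normalize(msgs: Message[]): Message[] {
  return msgs
    .map((m) => ({ ...m, content: m.content.replace(/\s+/g, ' ').trim() }))
    .filter((m) => m.content.length > 0);
}

// Pass 2 (a guess at "structure"): merge consecutive messages from the same role.
function structure(msgs: Message[]): Message[] {
  const out: Message[] = [];
  for (const m of msgs) {
    const last = out[out.length - 1];
    if (last && last.role === m.role) last.content += '\n' + m.content;
    else out.push({ ...m });
  }
  return out;
}

// Pass 3: above the threshold, drop the oldest non-system messages,
// always keeping the final message.
function trim(msgs: Message[], limit: number = TRIM_THRESHOLD_TOKENS): Message[] {
  const out = [...msgs];
  while (estimateTokens(out) > limit) {
    const i = out.findIndex((m, idx) => m.role !== 'system' && idx < out.length - 1);
    if (i === -1) break; // nothing left that is safe to drop
    out.splice(i, 1);
  }
  return out;
}

// Pass 4 is opt-in: the caller supplies the compressor.
type Compressor = (msgs: Message[]) => Message[];

function preprocess(msgs: Message[], compress?: Compressor): Message[] {
  const trimmed = trim(structure(normalize(msgs)));
  return compress ? compress(trimmed) : trimmed;
}
```

Passing the compressor as an optional argument mirrors the opt-in nature of the last pass: without it, the pipeline stops after trimming.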

Batch inference

Async batch jobs via OpenAI + Anthropic Batch APIs. 40% below realtime pricing. Webhook delivery on completion. Use for high-volume, non-urgent workloads.

BYOK encryption

Bring your own OpenAI or Anthropic keys on Scale plan+. Stored AES-GCM encrypted per tenant. Rotate keys via POST /ai/providers/:provider/rotate without downtime.

Edge delivery

300+ Cloudflare locations. Zero cold starts (Workers run on V8 isolates, not containers). p95 latency <80ms globally.

Always 10% below OpenRouter

Prices are in USD per 1M tokens.

Model	OpenRouter	Neureus (realtime)	Neureus (batch)
GPT-4o	$5.00	$4.50	$3.00
Claude Sonnet 4.6	$3.00	$2.70	$1.80
Gemini 2.5 Flash	$0.75	$0.675	~$0.45
Llama 3.3 70B (Workers AI)	$0.59	Free	Free
DeepSeek R1	$0.55	$0.50	~$0.33
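The paid rows in the table follow a simple pattern that can be checked with a few lines of arithmetic. The multipliers below are inferred from the listed figures, not from a published formula: the realtime column is 90% of the OpenRouter price, and the batch column works out to about 60% of the OpenRouter price in every paid row. The function names are hypothetical, and Workers AI models are left out because they are listed as free.

```typescript
// Multipliers inferred from the table rows (an assumption, not a documented formula).
const REALTIME_FACTOR = 0.9; // "10% below OpenRouter"
const BATCH_FACTOR = 0.6;    // batch column is about 60% of the OpenRouter column

// Round to a tenth of a cent to avoid floating-point noise.
const round3 = (usd: number): number => Math.round(usd * 1000) / 1000;

// Derives both Neureus prices from an OpenRouter price per 1M tokens.
function neureusPrices(openRouterPerMillion: number): { realtime: number; batch: number } {
  return {
    realtime: round3(openRouterPerMillion * REALTIME_FACTOR),
    batch: round3(openRouterPerMillion * BATCH_FACTOR),
  };
}

// Cost in USD of a job, given a price per 1M tokens.
function jobCost(tokens: number, pricePerMillion: number): number {
  return (tokens / 1_000_000) * pricePerMillion;
}

console.log(neureusPrices(5.0));  // GPT-4o row: realtime 4.5, batch 3
console.log(neureusPrices(0.55)); // DeepSeek R1 row: realtime 0.495 (listed as $0.50), batch 0.33
```

For example, a 2M-token GPT-4o job at the realtime rate comes to jobCost(2_000_000, 4.5), or $9.00.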

One API. Any Model.
Best Price.
