Monthly Token Estimate

Prices per million tokens. Neureus prices are 10% below OpenRouter published rates.

How LLM pricing works

Tokens, not characters.

LLM APIs charge per token, not per character or word. One token is roughly 4 characters of English text — a typical 1,000-word document is about 1,300 tokens.
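The two rules of thumb above can be turned into a quick estimator. This is a minimal sketch: the function names are hypothetical, and the ratios (4 characters per token, 1.3 tokens per word) are only the rough English-text averages quoted above. For exact counts, use the tokenizer of the model you plan to call.

```python
def estimate_tokens_from_chars(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (about 4 characters per token)."""
    return round(len(text) / chars_per_token)


def estimate_tokens_from_words(word_count: int, tokens_per_word: float = 1.3) -> int:
    """Rough token estimate from word count (about 1,300 tokens per 1,000 words)."""
    return round(word_count * tokens_per_word)


# A 1,000-word document, using the word-based heuristic.
print(estimate_tokens_from_words(1000))

# A short instruction, using the character-based heuristic.
print(estimate_tokens_from_chars("Summarize this document in 3 bullets"))
```

The two heuristics will not agree exactly on the same text, and both drift for code, non-English text, or unusual formatting.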

Input tokens

The prompt you send to the model: system instructions, conversation history, context, and the user's message. Input tokens are usually cheaper than output.

"Summarize this document in 3 bullets" + a 2,000-word doc ≈ 2,600 input tokens

Output tokens

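The model's response. Output tokens typically cost 3–5× as much as input tokens because they are generated one at a time, while the input is processed in a single batch. The ratio varies by model: in the pricing table below it ranges from 1× to almost 10×.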

A 200-word summary ≈ 260 output tokens
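Because input and output are billed at different rates, the cost of a single request is the sum of two products. The sketch below shows that arithmetic. The function name is hypothetical, the token counts are an assumed example request, and the prices are the OpenRouter rates listed for GPT-4o mini in the pricing table.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given prices in USD per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Assumed example: 500 input tokens and 200 output tokens on GPT-4o mini
# ($0.15 per million input tokens, $0.60 per million output tokens).
cost = request_cost(500, 200, 0.15, 0.60)
print(f"${cost:.6f} per request")
```

Even though this request has fewer output tokens than input tokens, the output side accounts for most of the cost because of the higher rate.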

Monthly estimate

Multiply your daily token counts by 30. For a chatbot handling 1,000 conversations/day, each averaging 500 input and 200 output tokens, that comes to 15M input + 6M output tokens per month.

1,000 chats/day × 500 in × 30 days = 15M input tokens/month
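The same multiplication, written as a small helper so it can be reused for other traffic levels. This is a sketch with a hypothetical function name, and it assumes a flat 30-day month with the same volume every day.

```python
def monthly_tokens(requests_per_day: int, tokens_per_request: int, days: int = 30) -> int:
    """Total tokens per month for a steady daily request volume."""
    return requests_per_day * tokens_per_request * days


# The chatbot example: 1,000 conversations/day, 500 input + 200 output tokens each.
input_m = monthly_tokens(1_000, 500) / 1_000_000
output_m = monthly_tokens(1_000, 200) / 1_000_000
print(f"{input_m:g}M input + {output_m:g}M output per month")
# prints "15M input + 6M output per month"
```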

The Neureus discount

Neureus routes every request through a response cache and prompt compression layer, returning savings to you as a permanent 10% discount off OpenRouter published rates.

$100/mo at OpenRouter → $90/mo via Neureus — same models, same quality

Full pricing table

Every model. Every price.

All 35 models available via one Neureus API key, priced 10% below OpenRouter.

Model	Context	OpenRouter Input /M	OpenRouter Output /M	Neureus Input /M	Neureus Output /M
Llama 3.2 1B Cloudflare AI	128K	$0.027	$0.20	$0.024	$0.18
Llama 3.1 8B Instruct Meta	128K	$0.050	$0.050	$0.045	$0.045
Llama 3.2 3B Cloudflare AI	128K	$0.051	$0.34	$0.046	$0.30
Qwen 2.5 Coder 32B Qwen	33K	$0.070	$0.16	$0.063	$0.14
Gemini 2.0 Flash Google	1M	$0.10	$0.40	$0.090	$0.36
Mistral Small 3.1 Mistral AI	128K	$0.10	$0.30	$0.090	$0.27
GPT-4o mini OpenAI	128K	$0.15	$0.60	$0.14	$0.54
Gemini 2.5 Flash Google	1M	$0.15	$0.60	$0.14	$0.54
Command R Cohere	128K	$0.15	$0.60	$0.14	$0.54
DeepSeek V3 DeepSeek	66K	$0.27	$1.10	$0.24	$0.99
Llama 4 Scout 17B Cloudflare AI	128K	$0.27	$0.85	$0.24	$0.77
Llama 3.3 70B Cloudflare AI	128K	$0.29	$2.25	$0.26	$2.03
Codestral Mistral AI	256K	$0.30	$0.90	$0.27	$0.81
Llama 3.1 70B Instruct Meta	128K	$0.35	$0.40	$0.32	$0.36
Qwen 2.5 72B Instruct Qwen	128K	$0.35	$0.40	$0.32	$0.36
Mistral Small 3.1 24B Cloudflare AI	128K	$0.35	$0.56	$0.32	$0.50
GPT-4.1 mini OpenAI	1M	$0.40	$1.60	$0.36	$1.44
DeepSeek R1 32B Cloudflare AI	32K	$0.50	$4.88	$0.45	$4.39
Nemotron 120B Cloudflare AI	128K	$0.50	$1.50	$0.45	$1.35
DeepSeek R1 DeepSeek	66K	$0.55	$2.19	$0.50	$1.97
Llama 3.3 70B Instruct Meta	128K	$0.59	$0.79	$0.53	$0.71
QwQ 32B Cloudflare AI	32K	$0.66	$1.00	$0.59	$0.90
Qwen Coder 32B Cloudflare AI	32K	$0.66	$1.00	$0.59	$0.90
Claude Haiku 4.5 Anthropic	200K	$0.80	$4.00	$0.72	$3.60
Kimi K2 Cloudflare AI	128K	$0.95	$4.00	$0.85	$3.60
Sonar Large (Online) Perplexity	127K	$1.00	$1.00	$0.90	$0.90
o4-mini OpenAI	200K	$1.10	$4.40	$0.99	$3.96
Gemini 2.5 Pro Google	1M	$1.25	$10.00	$1.13	$9.00
GPT-4.1 OpenAI	1M	$2.00	$8.00	$1.80	$7.20
Mistral Large Mistral AI	128K	$2.00	$6.00	$1.80	$5.40
GPT-4o OpenAI	128K	$2.50	$10.00	$2.25	$9.00
Command R+ Cohere	128K	$2.50	$10.00	$2.25	$9.00
Claude Sonnet 4.6 Anthropic	200K	$3.00	$15.00	$2.70	$13.50
o3 OpenAI	200K	$10.00	$40.00	$9.00	$36.00
Claude Opus 4 Anthropic	200K	$15.00	$75.00	$13.50	$67.50

Prices in USD per million tokens. Updated monthly from provider published rates. Neureus price = OpenRouter rate × 0.90.
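To turn the table into a monthly bill, multiply each rate by your token volume in millions and apply the 0.90 factor. The sketch below does this for three rows copied from the table. The workload of 1M input and 0.5M output tokens is an assumed example, and the names `RATES`, `DISCOUNT_FACTOR`, and `monthly_cost` are hypothetical.

```python
# (model, OpenRouter input $/M, OpenRouter output $/M), copied from the table above.
RATES = [
    ("GPT-4o mini", 0.15, 0.60),
    ("Claude Sonnet 4.6", 3.00, 15.00),
    ("o3", 10.00, 40.00),
]

# Neureus price = OpenRouter rate x 0.90
DISCOUNT_FACTOR = 0.90


def monthly_cost(input_m: float, output_m: float,
                 input_price: float, output_price: float) -> float:
    """Monthly cost in USD for token volumes given in millions of tokens."""
    return input_m * input_price + output_m * output_price


for model, in_price, out_price in RATES:
    standard = monthly_cost(1.0, 0.5, in_price, out_price)
    neureus = standard * DISCOUNT_FACTOR
    print(f"{model}: standard ${standard:.2f}, Neureus ${neureus:.2f}, "
          f"saves ${standard - neureus:.2f}")
```

The Neureus columns in the table are rounded for display, so a total computed from those rounded prices can differ by a cent or so from one computed by applying the 0.90 factor to the OpenRouter total.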

FAQ

Common questions.

How accurate is this calculator?

The calculator uses published API pricing from each provider (via OpenRouter's rate card), updated monthly. Actual costs may vary slightly if providers change pricing mid-month or if you hit rate-limit tiers.

What counts as an input token vs output token?

Input tokens include your system prompt, conversation history, any documents you send, and the user's message. Output tokens are the model's reply. Most LLMs charge more for output than input (usually 3–5× more).

How does Neureus price 10% below OpenRouter?

Neureus runs a semantic response cache and prompt compression layer in front of every model. Repeated or similar queries are served from cache at $0 provider cost, and compressed prompts cost fewer tokens. The savings are passed to you as a flat 10% discount.
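As a rough illustration of why caching and compression lower provider cost, the sketch below models both effects with two parameters. It is a simplified model and not Neureus's actual accounting: the function name, the cache hit rate, and the compression ratio are all hypothetical values chosen for the example, and it assumes cache hits are spread evenly across requests.

```python
def effective_provider_cost(input_cost: float, output_cost: float,
                            cache_hit_rate: float, compression_ratio: float) -> float:
    """Provider cost remaining after caching and prompt compression.

    cache_hit_rate: fraction of requests served from cache (no provider cost).
    compression_ratio: compressed prompt size / original size (input side only).
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    if not 0.0 < compression_ratio <= 1.0:
        raise ValueError("compression_ratio must be in (0, 1]")
    miss_rate = 1.0 - cache_hit_rate
    # Only cache misses reach the provider; those requests send a shorter prompt.
    return miss_rate * (input_cost * compression_ratio + output_cost)


# Hypothetical workload: $60 of input and $40 of output per month at list price,
# with 20% of requests served from cache and prompts compressed to 90% of their size.
print(f"${effective_provider_cost(60.0, 40.0, 0.20, 0.90):.2f}")
```

Under these assumed numbers the provider bill falls by more than 10%, which is the kind of margin that would let a flat 10% discount be passed on.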

Which model should I use for my use case?

For cost-sensitive production workloads, Llama 3.3 70B and Mistral Large are strong choices. For the highest quality, consider GPT-4.1 or Claude Opus 4. For reasoning tasks, DeepSeek R1 and o3 lead benchmarks. Browse the full model catalog for capability details.

Can I bring my own OpenAI or Anthropic API key?

Yes. Neureus supports BYOK (bring your own key) — you supply your provider key, Neureus routes through it, and you keep the direct billing relationship with the provider while still getting Neureus's caching and routing layer.

LLM Cost Calculator

Start for free.
Pay less on every model.