Cloudflare AI

Llama 4 Scout 17B

Meta's Llama 4 Scout MoE model at the edge. 128K context with native vision support.

edge, open source, multimodal


Pricing via Neureus

Save 10% vs OpenRouter

Context window: 128K tokens
Max output: 4K tokens

                        OpenRouter   Neureus
Input (per 1M tokens)   $0.27        $0.24
Output (per 1M tokens)  $0.85        $0.77

Neureus prices all models at 10% below published OpenRouter rates, updated monthly. Free tier includes 5M tokens.
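To make the per-million-token rates concrete, here is a minimal sketch that estimates the cost of a single request from its token counts. The helper name `estimateCostUSD` and the `RATES` table are illustrative, not part of any Neureus API; the figures are copied from the pricing listed on this page, and the estimate ignores the free tier and any cache hits.

```typescript
// Per-million-token rates in USD, copied from the pricing on this page.
const RATES = {
  openrouter: { input: 0.27, output: 0.85 },
  neureus: { input: 0.24, output: 0.77 },
} as const;

type Provider = keyof typeof RATES;

// Hypothetical helper: estimated USD cost of one request.
// Does not account for the free tier or cached responses.
function estimateCostUSD(
  provider: Provider,
  inputTokens: number,
  outputTokens: number,
): number {
  const rate = RATES[provider];
  return (inputTokens * rate.input + outputTokens * rate.output) / 1_000_000;
}

// Example: 100K input tokens (inside the 128K window) and 4K output tokens
// (the listed maximum output).
const viaNeureus = estimateCostUSD('neureus', 100_000, 4_000);
const viaOpenRouter = estimateCostUSD('openrouter', 100_000, 4_000);
console.log(`Neureus: $${viaNeureus.toFixed(5)}, OpenRouter: $${viaOpenRouter.toFixed(5)}`);
```

For that example the estimate comes to roughly $0.027 via Neureus against roughly $0.030 at the OpenRouter rates.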

Use this model

One API. Every model.

Swap any model ID in a single field. Neureus handles routing, caching, and tenant isolation automatically.

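chat.ts
// Assumes NEUREUS_API_KEY is already defined in scope (for example, read from an environment variable).
const res = await fetch('https://app.neureus.ai/ai/chat', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${NEUREUS_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    // Swap this model ID to target a different model.
    model: 'cloudflare/@wai/llama-4-scout-17b',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
// Fail loudly on non-2xx responses rather than parsing an error body as a reply.
if (!res.ok) throw new Error(`Neureus request failed with status ${res.status}`);
const { message } = await res.json();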
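Since the model ID is the only field that changes between models, the request can be wrapped in a small helper. The sketch below is one possible shape, not an official client: `buildChatRequest`, `chat`, `ChatMessage`, and `ChatResponse` are hypothetical names, the `assistant` and `system` roles are assumed (the page only shows `user`), and the response type is inferred from the `{ message }` destructuring in chat.ts.

```typescript
// Assumed message shape; only the 'user' role appears on this page.
interface ChatMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

// Assumed response shape, inferred from `const { message } = await res.json()`.
interface ChatResponse {
  message: unknown;
}

interface ChatRequestInit {
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

const NEUREUS_CHAT_URL = 'https://app.neureus.ai/ai/chat';

// Build the fetch options. The model ID is the single field to swap.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequestInit {
  return {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Send the request and surface HTTP errors instead of parsing them as replies.
async function chat(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): Promise<ChatResponse> {
  const res = await fetch(NEUREUS_CHAT_URL, buildChatRequest(apiKey, model, messages));
  if (!res.ok) {
    throw new Error(`Neureus request failed with status ${res.status}`);
  }
  return (await res.json()) as ChatResponse;
}
```

Calling `chat(apiKey, 'cloudflare/@wai/llama-4-scout-17b', [{ role: 'user', content: 'Hello!' }])` reproduces the request in chat.ts; passing a different model ID string is the only change needed to target another model.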

Semantic response caching — repeated queries served free from edge

Per-tenant spend caps — set a monthly limit per customer

PII guardrails — optional PHI/PII detection before each request

Automatic failover — falls back to an alternate provider on error

Start using Llama 4 Scout 17B for free.
