OpenRouter Provider

OpenRouter provides access to 400+ AI models from dozens of providers through a single API. This is ideal for:

  • Multi-model workflows - Switch between Claude, GPT, Llama, and more
  • Cost optimization - Route to the cheapest available provider
  • Fallback resilience - Automatic failover when models are unavailable
  • Unified billing - Single API key for all providers

Set your OpenRouter API key:

export OPENROUTER_API_KEY=sk-or-...

Optional analytics configuration:

export OPENROUTER_SITE_URL="https://myapp.com" # Your app URL
export OPENROUTER_APP_NAME="MyApp" # Your app name

OpenRouter models use the format provider/model-name:

openrouter:anthropic/claude-sonnet-4-5
openrouter:openai/gpt-4o
openrouter:deepseek/deepseek-chat
openrouter:meta-llama/llama-3.3-70b-instruct

Model               OpenRouter ID                       Best For
Claude Sonnet 4.5   anthropic/claude-sonnet-4-5         Balanced performance
GPT-4o              openai/gpt-4o                       General tasks, vision
DeepSeek Chat (V3)  deepseek/deepseek-chat              Cost-effective coding
DeepSeek R1         deepseek/deepseek-r1                Complex reasoning
Llama 3.3 70B       meta-llama/llama-3.3-70b-instruct   Open-source option
Gemini 2.5 Flash    google/gemini-2.5-flash             Fast, long context

llmist provides shortcuts for common OpenRouter models:

Alias        Full Model
or:sonnet    openrouter:anthropic/claude-sonnet-4-5
or:opus      openrouter:anthropic/claude-opus-4-5
or:haiku     openrouter:anthropic/claude-haiku-4-5
or:gpt4o     openrouter:openai/gpt-4o
or:gpt5      openrouter:openai/gpt-5.2
or:flash     openrouter:google/gemini-2.5-flash
or:llama     openrouter:meta-llama/llama-3.3-70b-instruct
or:deepseek  openrouter:deepseek/deepseek-r1
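
An alias can stand in anywhere a full model ID is expected. A minimal sketch (assuming the aliases resolve exactly as in the table above):

import { LLMist } from 'llmist';

// 'or:deepseek' expands to 'openrouter:deepseek/deepseek-r1'
const answer = await LLMist.createAgent()
  .withModel('or:deepseek')
  .askAndCollect('Compare BFS and DFS in two sentences.');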

Basic usage with a full model ID:

import { LLMist } from 'llmist';

const answer = await LLMist.createAgent()
  .withModel('openrouter:deepseek/deepseek-chat')
  .askAndCollect('Explain async/await in JavaScript');

OpenRouter supports intelligent routing to optimize for cost, speed, or quality:

import { LLMist } from 'llmist';

const answer = await LLMist.createAgent()
  .withModel('openrouter:deepseek/deepseek-chat')
  .withExtra({
    routing: {
      route: 'cheapest', // 'cheapest' | 'fastest' | 'quality'
    },
  })
  .askAndCollect('Hello!');

Option    Description
quality   Best quality provider (default)
cheapest  Lowest cost provider
fastest   Lowest latency provider

Configure automatic fallback to alternative models:

import { LLMist } from 'llmist';

const answer = await LLMist.createAgent()
  .withModel('openrouter:anthropic/claude-sonnet-4-5')
  .withExtra({
    routing: {
      // Try these models in order if primary is unavailable
      models: [
        'anthropic/claude-sonnet-4-5',
        'openai/gpt-4o',
        'deepseek/deepseek-chat',
      ],
    },
  })
  .askAndCollect('Complex analysis task...');

Route to a specific provider when a model is available from multiple sources:

import { LLMist } from 'llmist';

const answer = await LLMist.createAgent()
  .withModel('openrouter:meta-llama/llama-3.3-70b-instruct')
  .withExtra({
    routing: {
      provider: 'Together', // Route to Together AI
      // Or specify provider preference order:
      // order: ['Together', 'Fireworks', 'Anyscale'],
    },
  })
  .askAndCollect('Hello!');

For advanced setups, configure the provider manually:

import { LLMist, OpenRouterProvider } from 'llmist';
import OpenAI from 'openai';

const openRouterClient = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: 'https://openrouter.ai/api/v1',
});

const client = new LLMist({
  autoDiscoverProviders: false,
  adapters: [
    new OpenRouterProvider(openRouterClient, {
      siteUrl: 'https://myapp.com',
      appName: 'MyApp',
    }),
  ],
});

Use OpenRouter when you:

  • Need access to models not directly supported (Llama, Mistral, Qwen, etc.)
  • Need automatic failover between providers
  • Want unified billing across multiple models
  • Are optimizing for cost with dynamic routing
  • Want to test different models quickly (see the sketch after this list)
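
That last case is just a loop over model IDs. A minimal sketch using only the agent API shown above (the prompt and model list are illustrative):

import { LLMist } from 'llmist';

// Run the same prompt against several models and compare the answers.
const models = [
  'openrouter:anthropic/claude-sonnet-4-5',
  'openrouter:openai/gpt-4o',
  'openrouter:deepseek/deepseek-chat',
];

for (const model of models) {
  const answer = await LLMist.createAgent()
    .withModel(model)
    .askAndCollect('In one sentence, what is a monad?');
  console.log(`${model}: ${answer}`);
}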

llmist automatically enables prompt caching for OpenRouter requests. This works with both Anthropic Claude and Google Gemini models routed through OpenRouter, saving up to 75% on cached input tokens.

llmist adds cache_control breakpoints to the last system message and last user message in each request. OpenRouter uses sticky routing to maximize cache hits by sending subsequent requests to the same provider endpoint.

For models with implicit caching (Gemini 2.5+, OpenAI, DeepSeek), caching happens server-side automatically — no breakpoints needed. llmist extracts and reports cached token counts from the API response in both cases.

If needed, disable caching explicitly:

import { LLMist } from 'llmist';

const answer = await LLMist.createAgent()
  .withModel('openrouter:anthropic/claude-sonnet-4-5')
  .withoutCaching()
  .askAndCollect('One-off question');

OpenRouter passes through usage information, including cached token counts:

for await (const event of agent.run()) {
  if (event.type === 'llm_call_complete') {
    console.log('Input tokens:', event.usage?.promptTokens);
    console.log('Output tokens:', event.usage?.completionTokens);
    console.log('Estimated cost:', event.cost);
  }
}

The OpenRouter provider includes enhanced error messages:

Error  Meaning
401    Invalid API key - check OPENROUTER_API_KEY
402    Insufficient credits - add funds at openrouter.ai/credits
429    Rate limit exceeded - reduce request frequency
503    Model unavailable - try a different model or use fallbacks
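
These surface as thrown errors from the agent call. A minimal handling sketch, assuming llmist propagates the provider error as an exception (the exact error shape is not documented here):

import { LLMist } from 'llmist';

try {
  const answer = await LLMist.createAgent()
    .withModel('openrouter:anthropic/claude-sonnet-4-5')
    .askAndCollect('Hello!');
  console.log(answer);
} catch (err) {
  // A 402 means the OpenRouter account is out of credits;
  // a 503 is a good trigger for retrying with a fallback model.
  console.error('OpenRouter request failed:', err);
}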

OpenRouter provides access to reasoning-capable models like DeepSeek R1, which excels at math, logic, and coding with chain-of-thought reasoning:

import { LLMist } from 'llmist';

const answer = await LLMist.createAgent()
  .withModel('openrouter:deepseek/deepseek-r1')
  .askAndCollect('Prove there are infinitely many primes.');