19 models from 9 providers in one table, ordered by input price and cross-checked against vendor pricing pages. Compare price, context window, caching, batch discount, and latency tier side by side.
Need a workload estimate? Use the cost calculator. Want to count tokens for a specific prompt? Try the token estimator.
| Provider | Model | Input / 1M | Output / 1M | Cached input / 1M | Context | Caching | Batch discount | Latency tier |
|---|---|---|---|---|---|---|---|---|
| OpenAI | GPT-5 nano | $0.050 | $0.400 | $0.0050 | 400K | Yes | 50% | fast |
| Mistral | Mistral Small 3 | $0.200 | $0.600 | - | 128K | No | 50% | fast |
| OpenAI | GPT-5 mini | $0.250 | $2.00 | $0.025 | 400K | Yes | 50% | balanced |
| DeepSeek | DeepSeek V3 | $0.270 | $1.10 | $0.070 | 128K | Yes | No | balanced |
| Google | Gemini 2.5 Flash | $0.300 | $2.50 | $0.075 | 1M | Yes | 50% | fast |
| DeepSeek | DeepSeek R1 (reasoning model) | $0.550 | $2.19 | $0.140 | 128K | Yes | No | frontier |
| Groq | Llama 3.3 70B Versatile (sub-second TTFT typical) | $0.590 | $0.790 | - | 131K | No | No | fast |
| Cerebras | Llama 3.1 70B (wafer-scale inference, very high tok/s) | $0.600 | $0.600 | - | 128K | No | No | fast |
| Together AI | Llama 3.3 70B Instruct Turbo | $0.880 | $0.880 | - | 131K | No | No | balanced |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | $0.100 | 200K | Yes | 50% | fast |
| OpenAI | GPT-5 | $1.25 | $10.00 | $0.125 | 400K | Yes | 50% | frontier |
| OpenAI | GPT-5.3 | $1.25 | $10.00 | $0.125 | 400K | Yes | 50% | frontier |
| OpenAI | GPT-5-Codex (specialized for coding/agentic tasks) | $1.25 | $10.00 | $0.125 | 400K | Yes | 50% | frontier |
| Google | Gemini 2.5 Pro (higher tier above 200K tokens: $2.50 in / $15 out per 1M) | $1.25 | $10.00 | $0.310 | 2M | Yes | 50% | frontier |
| Mistral | Mistral Large 2 | $2.00 | $6.00 | - | 128K | No | 50% | frontier |
| Cohere | Command A | $2.50 | $10.00 | - | 256K | No | No | balanced |
| Anthropic | Claude Sonnet 4.7 | $3.00 | $15.00 | $0.300 | 1M | Yes | 50% | balanced |
| Together AI | Llama 3.1 405B Instruct Turbo | $3.50 | $3.50 | - | 131K | No | No | balanced |
| Anthropic | Claude Opus 4.7 | $15.00 | $75.00 | $1.50 | 1M | Yes | 50% | frontier |
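To make the per-1M rates in the table above concrete, here is a minimal Python sketch that turns them into a per-request and a monthly figure. The rates are copied from the table; the function names (`request_cost`, `monthly_cost`), the 30-day month, and the sample workload are illustrative assumptions, not part of any vendor API.

```python
# USD per 1M tokens (input, output), copied from the table above.
# Only a few rows are included to keep the sketch short.
RATES = {
    "GPT-5 nano": (0.05, 0.40),
    "GPT-5 mini": (0.25, 2.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (1.25, 10.00),
}

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at standard-tier rates, ignoring caching and batch."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 requests_per_second: float) -> float:
    """USD cost per 30-day month for a steady request rate."""
    per_request = request_cost(model, input_tokens, output_tokens)
    return per_request * requests_per_second * SECONDS_PER_MONTH


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input tokens and 500 output tokens at 1 request/second.
    for name in RATES:
        print(f"{name}: ${monthly_cost(name, 2_000, 500, 1.0):,.2f} per month")
```

Because output tokens usually cost several times more than input tokens, the ranking of models for a given workload can differ from the input-price ordering of the table.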
Prices reflect publicly listed standard-tier rates per 1M tokens as of 2026-04-29. Cached input pricing applies only when prompt caching is engaged and varies by provider. Batch discounts typically apply to async batch APIs with 24-hour SLAs.
Always verify against the vendor pricing page before procurement. Enterprise and committed-use pricing is not reflected here. Spot a stale number? Let us know.
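The effect of the cached-input and batch columns can be sketched in Python as follows. This is a rough model under stated assumptions: a fixed fraction of input tokens is served from cache, and the batch discount applies to the whole request cost, cached tokens included. Providers differ on whether and how these discounts stack, so treat the `Pricing` class and `effective_cost` function as hypothetical helpers rather than a description of any vendor's billing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pricing:
    """Rates in USD per 1M tokens, as listed in the table."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: Optional[float] = None  # None where the table shows "-"
    batch_discount: float = 0.0  # 0.5 for "50%", 0.0 for "No"


def effective_cost(p: Pricing, input_tokens: int, output_tokens: int,
                   cached_fraction: float = 0.0, use_batch: bool = False) -> float:
    """Estimated USD cost of one request with optional caching and batch discount."""
    if p.cached_input_per_m is None:
        cached_fraction = 0.0  # no cached rate listed, so bill all input at full price
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    cost = (fresh * p.input_per_m
            + cached * (p.cached_input_per_m or 0.0)
            + output_tokens * p.output_per_m) / 1_000_000
    if use_batch:
        # Assumption: the discount applies to the total, including cached tokens.
        cost *= 1.0 - p.batch_discount
    return cost


if __name__ == "__main__":
    # Rates copied from the Claude Haiku 4.5 row of the table.
    haiku = Pricing(1.00, 5.00, cached_input_per_m=0.10, batch_discount=0.5)
    print(effective_cost(haiku, 10_000, 1_000))
    print(effective_cost(haiku, 10_000, 1_000, cached_fraction=0.8))
    print(effective_cost(haiku, 10_000, 1_000, cached_fraction=0.8, use_batch=True))
```

The `cached_fraction` value depends heavily on how much of each prompt is a repeated prefix, so it is worth measuring on real traffic rather than guessing.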