Estimate cost per request, day, and month across Claude, GPT-5, Gemini, and friends. Toggle prompt caching and batch mode to see real-world savings. Pricing snapshot: 2026-04-29.
Get the techniques we use to cut AI bills by 50 to 80% - prompt caching patterns, batch pipelines, model routing, and eval-driven downsizing. Free, no fluff.

New tutorials, open-source projects, and deep dives on coding agents - delivered weekly.
Per-request cost is (input tokens / 1M) * input price + (output tokens / 1M) * output price; multiply by requests per day for the daily figure. Output tokens are typically 3 to 5x more expensive than input.
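The formula above can be sketched in a few lines. The workload numbers and prices below are placeholders for illustration, not any provider's real rates:

```python
def daily_cost(input_tokens, output_tokens, input_price_per_m,
               output_price_per_m, requests_per_day):
    """Daily spend given per-request token counts and per-million-token prices."""
    per_request = (input_tokens / 1_000_000) * input_price_per_m \
                + (output_tokens / 1_000_000) * output_price_per_m
    return per_request * requests_per_day

# Example: 2,000 input / 500 output tokens per request at $3 / $15 per million,
# 10,000 requests per day -> roughly $135/day.
print(round(daily_cost(2_000, 500, 3.0, 15.0, 10_000), 2))
```

Note how the output side dominates here: 500 output tokens cost more than 2,000 input tokens, which is why trimming verbose completions often saves more than trimming prompts.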
Anthropic gives a 90% discount on cached input tokens. Cache writes cost 25% more than base input, so caching pays off when context is reused enough times. On stable system prompts you can see 50 to 80% total savings.
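The break-even math behind that claim can be sketched with the two multipliers from the text: cache writes at 1.25x base input price, cached reads at 0.10x. The prompt size and price below are illustrative assumptions:

```python
def prompt_cost_with_cache(prompt_tokens, reuses, input_price_per_m):
    """One cache write at 1.25x, then each reuse reads the cache at 0.10x."""
    base = prompt_tokens / 1_000_000 * input_price_per_m
    return 1.25 * base + reuses * 0.10 * base

def prompt_cost_without_cache(prompt_tokens, reuses, input_price_per_m):
    """Full input price on every request that includes the prompt."""
    return (1 + reuses) * (prompt_tokens / 1_000_000 * input_price_per_m)

# A 50k-token system prompt sent 100 times at an assumed $3/M input price:
# uncached ~$15.00 vs. cached ~$1.67 for the prompt portion.
print(round(prompt_cost_without_cache(50_000, 99, 3.0), 2))
print(round(prompt_cost_with_cache(50_000, 99, 3.0), 2))
```

Because a read saves 0.90x base while the write only adds 0.25x, caching pays for itself after a single reuse; the savings then approach 90% of the prompt's input cost as reuse count grows.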
An async API that returns results within 24 hours at 50% off both input and output. Ideal for evals, backfills, summarization, and offline pipelines.
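Since batch mode discounts input and output equally, estimating it is a flat 50% multiplier on the standard cost. A minimal sketch, using a made-up $135/day eval pipeline as the baseline:

```python
def batch_cost(standard_cost):
    """Batch API: 50% off both input and output, so half the standard cost."""
    return 0.5 * standard_cost

# A hypothetical $135/day eval workload drops to $67.50 when moved to batch.
print(batch_cost(135.0))
```

The trade-off is latency: results arrive within 24 hours, which is fine for evals and backfills but rules out interactive traffic.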
Snapshot from 2026-04-29. Verify against provider pricing pages before committing to a contract.