The Fable 5 Moment
TL;DR
Claude Fable 5's $10/$50 per million token pricing can catch teams off guard - here is how to build a real cost model before you commit.
Claude Fable 5 launched on June 9, 2026, and the pricing hit fast. Within hours, Simon Willison had spent $110.42 in a single day working through coding and agentic tasks - all under his $100/month Max subscription. That number circulated widely because it made the sticker price concrete: $10 per million input tokens and $50 per million output tokens, exactly double Claude Opus 4.8.
The question is not whether Fable 5 is expensive. It obviously is. The question is whether your specific workload justifies that price - and whether you have the right tooling in place to find out before your bill does.
Last updated: June 10, 2026
Fable 5's list price is $10/million input tokens and $50/million output tokens, confirmed by Anthropic at launch and detailed in the model overview docs. That puts it at twice the cost of Claude Opus 4.8 ($5/$25) on every token dimension.
Most of Willison's $110 day came from a single large coding session: 78.2 million tokens through prod_datasette_agent, a Claude Code session that handled a human-in-the-loop feature for his Datasette project. The AgentsView treemap he published shows that one session accounted for 89.9% of his total daily spend.
That is not a freak case - it is what agentic coding looks like at Fable 5 prices. A model that resolves real GitHub issues at 80.3% success (SWE-Bench Pro, via Anthropic launch data) tends to take more turns, not fewer, and each turn costs output tokens at $50/million.
For comparison:
| Model | Input ($/M) | Output ($/M) |
|---|---|---|
| Claude Fable 5 | $10 | $50 |
| Claude Opus 4.8 | $5 | $25 |
| Claude Sonnet 4.6 | ~$3 | ~$15 |
| GPT-5.5 standard | $5 | $30 |
| GPT-5.5 batch/flex | $2.50 | $15 |
Source: Finout pricing breakdown, Anthropic docs.
One notable gap: Anthropic has no batch or async pricing tier for Fable 5. GPT-5.5's $2.50/$15 flex pricing for offline processing has no equivalent here, which matters for teams running large-scale background jobs.
The sticker price is only half the equation. The other half is how many tokens a task actually consumes.
One reported evaluation found that Fable 5 completed a frontier physics research task in 36 hours using one-third the reasoning tokens that GPT-5.5 needed to reach the same result over four days (cited in the Finout analysis). Fable 5 costs 2x GPT-5.5 on input ($10/M vs $5/M) and about 1.7x on output ($50/M vs $30/M), so using one-third the tokens makes Fable 5's effective cost on that task class lower, not higher.
The same logic applies to long-horizon coding. Stripe reported that Fable 5 completed a codebase-wide migration across a 50-million-line Ruby codebase in one day - work estimated at two months for a full team. If you are paying engineers $150/hour, the token cost is irrelevant; the throughput is the value. At $50/M output, even a large-context migration run is a rounding error compared to the labor alternative.
The efficiency advantage has a hard boundary though: it applies to complex, multi-step reasoning and long-horizon agentic work. For short-context, well-defined tasks - classification, summarization, structured extraction, RAG retrieval calls - Fable 5 will not use fewer tokens than Opus 4.8. It will use roughly the same tokens at twice the price. That is where the math breaks down, and where most high-volume API spend lives.
The rule: estimate whether your task is long-context and high-complexity before routing to Fable 5. If the answer is not clearly yes, the efficiency argument does not apply.
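The efficiency argument can be turned into a break-even check. The sketch below uses the list prices from the comparison table; the model keys are illustrative labels rather than real API identifiers, and the 1M vs 3M token counts are hypothetical numbers chosen to mirror the "one-third the tokens" report, not measured data.

```python
# List prices in USD per million tokens, from the comparison table.
# The dictionary keys are illustrative labels, not API model identifiers.
PRICES = {
    "fable-5": {"input": 10.00, "output": 50.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
    "gpt-5.5": {"input": 5.00, "output": 30.00},
}


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one task at list price, with no caching."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def breakeven_token_ratio(premium: str, baseline: str, kind: str = "output") -> float:
    """Fraction of the baseline's token count the premium model can use
    before it becomes the more expensive option for that token kind."""
    return PRICES[baseline][kind] / PRICES[premium][kind]


# Hypothetical task: the baseline needs 3M output tokens, Fable 5 needs 1M.
print(task_cost("fable-5", 0, 1_000_000))            # Fable 5 cost in USD
print(task_cost("gpt-5.5", 0, 3_000_000))            # GPT-5.5 cost in USD
print(breakeven_token_ratio("fable-5", "opus-4.8"))  # must use under half the tokens
```

Against Opus 4.8 the ratio is 0.5 on both input and output, so Fable 5 only wins on token cost when it finishes the task in fewer than half the tokens. For short, well-defined tasks where token counts are roughly equal, it cannot.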
Fable 5 supports prompt caching at the same discount rate as the rest of the Claude family: a 90% reduction on cached input tokens, bringing the effective input cost from $10/M to $1/M on cached prefixes (TrueFoundry pricing table).
Cache writes cost more than the base input rate: $12.50/M for a 5-minute cache and $20/M for a 1-hour cache.
The math makes caching attractive whenever you reuse a large system prompt or context window across many requests. If your agent setup uses a 100k-token system prompt across 1,000 requests per day, that is 100 million prompt tokens daily: about $1,000 at the $10/M base rate versus about $100 at the $1/M cached rate, before cache write costs. Each reuse saves $9 per million cached tokens. At scale, that adds up fast.
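Cache writes are worth the upfront cost when the same prefix is read back often enough within the cache window to recoup the write premium over the $10/M base rate.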
The practical implementation note: design your prompts with the stable, reusable content at the top of the context. Anthropic's caching mechanics reward prefix reuse - if the cacheable portion is buried after dynamic content, the cache cannot hit.
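To see when a cache write pays for itself, here is a minimal sketch using the prices quoted in this article: $10/M base input, $1/M cached reads, and cache writes at $12.50/M (5-minute) or $20/M (1-hour). The function names are hypothetical, and the daily example assumes a single write whose cache stays warm for all later requests, which real traffic may not guarantee.

```python
BASE_INPUT = 10.00                        # USD per million uncached input tokens
CACHE_READ = 1.00                         # USD per million cached input tokens
CACHE_WRITE = {"5m": 12.50, "1h": 20.00}  # USD per million tokens written to cache


def cached_cost(prefix_tokens: int, reads: int, ttl: str) -> float:
    """One cache write followed by `reads` cache hits on the same prefix."""
    millions = prefix_tokens / 1_000_000
    return millions * (CACHE_WRITE[ttl] + reads * CACHE_READ)


def uncached_cost(prefix_tokens: int, reads: int) -> float:
    """The same prefix sent at the base rate on the first request and every reuse."""
    millions = prefix_tokens / 1_000_000
    return millions * (1 + reads) * BASE_INPUT


def breakeven_reads(ttl: str) -> int:
    """Smallest number of reuses at which caching is strictly cheaper."""
    reads = 0
    while cached_cost(1_000_000, reads, ttl) >= uncached_cost(1_000_000, reads):
        reads += 1
    return reads


print(breakeven_reads("5m"), breakeven_reads("1h"))

# 100k-token prefix, 1,000 requests per day: one write plus 999 reads,
# assuming the cache never expires between requests.
print(cached_cost(100_000, 999, "5m"), uncached_cost(100_000, 999))
```

Under these prices a 5-minute cache write is recovered on the first reuse and a 1-hour write on the second, so the real question is whether your traffic reliably produces that reuse inside the cache window.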
Willison's $110 day is notable for a second reason: he knew it was $110 in near-real-time because he had AgentsView configured with custom Fable 5 pricing. Without that, the spend would have been invisible until the monthly invoice.
AgentsView provides per-project, per-session cost treemaps for local Claude Code usage. It requires adding the Fable 5 model price manually since the tool was not pre-configured at launch (Willison published a TIL on that setup step).
For API usage at team scale, the appropriate layer is a gateway. TrueFoundry's AI Gateway and similar tools provide per-team and per-application budget caps, virtual keys, and per-request cost logging.
The governance argument for a gateway strengthens at Fable 5's price point. An ungoverned rollout at $50/M output creates invoice surprises that can be hard to explain to finance. Setting budget caps per team before the first production request is the cleaner path.
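The idea of a per-team budget cap can be illustrated in a few lines. This is a toy in-process guard, not TrueFoundry's API or any real gateway's configuration; the `TeamBudget` class and its method names are hypothetical and exist only to show the check-before-spend pattern.

```python
from dataclasses import dataclass

OUTPUT_USD_PER_M = 50.00  # Fable 5 list price per million output tokens
INPUT_USD_PER_M = 10.00   # Fable 5 list price per million input tokens


def estimate_cost(input_tokens: int, max_output_tokens: int) -> float:
    """Worst-case cost of a request if it uses its full output allowance."""
    return (input_tokens * INPUT_USD_PER_M + max_output_tokens * OUTPUT_USD_PER_M) / 1_000_000


@dataclass
class TeamBudget:
    """Hypothetical daily spend cap for one team."""
    daily_cap_usd: float
    spent_usd: float = 0.0

    def try_spend(self, cost_usd: float) -> bool:
        """Record the spend and return True if it fits under the cap, else refuse."""
        if self.spent_usd + cost_usd > self.daily_cap_usd:
            return False
        self.spent_usd += cost_usd
        return True


budget = TeamBudget(daily_cap_usd=1.00)
cost = estimate_cost(input_tokens=20_000, max_output_tokens=8_000)  # 0.20 + 0.40
print(budget.try_spend(cost))  # first request fits under the cap
print(budget.try_spend(cost))  # second request would exceed it
```

Estimating with the maximum output allowance is deliberately pessimistic: at $50/M output, refusing a request that might have fit is usually cheaper than approving one that overshoots.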
The most cost-effective Fable 5 strategy is not "use Fable 5 for everything" or "avoid Fable 5 until it's cheaper." It is routing by task complexity.
The model selection guidance from Finout is practical and matches what early production teams are reporting:
| Task type | Recommended model | Rationale |
|---|---|---|
| Long agentic coding, migrations | Fable 5 | 22-point SWE-Bench Pro lead over GPT-5.5, compounding advantage over multi-step work |
| Complex financial or analytical research | Fable 5 | Documented wins in Hebbia and IMC evaluations |
| General coding, Q&A, document tasks | Opus 4.8 | Half the price, still ahead of GPT-5.5 on SWE-Bench Pro (69.2% vs 58.6%) |
| High-volume classification, summarization | Sonnet 4.6 | ~$3/$15; marginal quality loss on well-defined tasks |
| Large-scale offline batch processing | GPT-5.5 flex | $2.50/$15 with no Anthropic equivalent |
| Bio, chem, or security workflows | Opus 4.8 for now | Fable 5's classifier fallback intercepts many domain queries |
The routing principle: use the cheapest model that clears your quality bar on the specific task type. Do not treat Fable 5 as the default; treat it as a premium tier that specific task classes earn.
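The routing table translates directly into a lookup. In this sketch the task-type keys and model labels are illustrative names rather than real API identifiers, and falling back to Opus 4.8 for unknown task types is an assumption made here, not a recommendation from the source.

```python
# Task type -> model label, following the routing table.
# Keys and labels are illustrative; substitute your own taxonomy and model IDs.
ROUTES = {
    "agentic_coding": "fable-5",
    "migration": "fable-5",
    "analytical_research": "fable-5",
    "general_coding": "opus-4.8",
    "qa": "opus-4.8",
    "document": "opus-4.8",
    "classification": "sonnet-4.6",
    "summarization": "sonnet-4.6",
    "offline_batch": "gpt-5.5-flex",
    "bio_chem_security": "opus-4.8",
}

# Assumption: unknown task types go to the mid-priced tier, never to Fable 5.
DEFAULT_MODEL = "opus-4.8"


def route(task_type: str) -> str:
    """Return the model label for a task type, or the default for unknown types."""
    return ROUTES.get(task_type, DEFAULT_MODEL)


print(route("migration"))
print(route("classification"))
print(route("something_new"))
```

Keeping the premium tier out of the default path means a new, unclassified workload has to earn its way onto Fable 5 through an explicit table entry.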
One implementation detail worth flagging: Fable 5 has safety classifiers that route cybersecurity, biology/chemistry, and distillation queries to Claude Opus 4.8 automatically. According to Anthropic, this affects fewer than 5% of sessions in general use, but for teams with domain-adjacent workloads the rate will be higher. Fallback requests are billed at Opus 4.8 rates, not Fable 5 rates - but they also do not get Fable 5 quality. Build fallback detection into your routing layer to understand when it is happening.
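Fallback detection can be as simple as comparing the model you asked for with the model that answered. The sketch below assumes your logging layer records both values per request; whether and how a fallback is surfaced in the API response is an assumption here, so check it against the actual response format before relying on it. The model labels are illustrative.

```python
from typing import Iterable, Tuple


def is_fallback(requested_model: str, served_model: str) -> bool:
    """True when the response came from a different model than the one requested."""
    return served_model != requested_model


def fallback_rate(records: Iterable[Tuple[str, str]]) -> float:
    """Share of (requested, served) pairs where the served model differs."""
    records = list(records)
    if not records:
        return 0.0
    fallbacks = sum(1 for requested, served in records if is_fallback(requested, served))
    return fallbacks / len(records)


# Hypothetical log of (requested, served) model labels.
log = [
    ("fable-5", "fable-5"),
    ("fable-5", "opus-4.8"),
    ("fable-5", "fable-5"),
    ("fable-5", "fable-5"),
]
print(fallback_rate(log))
```

Tracking this rate per task type shows which workloads are paying routing overhead for Fable 5 without actually receiving Fable 5 output.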
Fable 5 is available on Claude.ai subscription plans through June 22, 2026 at no extra cost. From June 23, it moves to usage credits, billed separately from the flat subscription fee. Anthropic's stated intent is to restore it as a standard plan feature when capacity allows, with no committed date.
For Claude Code specifically, Fable 5 counts as 2x usage against the subscription allocation. A Pro or Max subscriber running long coding sessions with Fable 5 will exhaust their usage budget faster than with Opus 4.8.
The ROI question for agentic use: does a Fable 5 session that solves a problem in one pass justify the cost over two or three Opus 4.8 passes that might not fully solve it? For Willison, the answer was clearly yes - he described getting several days' worth of engineering work from a single day of Fable sessions. For simpler tasks, the math flips.
The session-level analysis matters more with Fable 5 than with cheaper models because the per-session cost is high enough that a failed or wasted session is visible on the bill. Invest in clear task framing and good system prompts before running long Fable 5 sessions. Garbage-in, expensive-garbage-out is a real pattern at $50/M output.
Use this framework to estimate Fable 5 costs for common task types before committing to production routing.
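Inputs you need: average input tokens per request, average output tokens per request, cache hit ratio, and requests per day.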
Formula:

    daily_cost = (input_tokens * (1 - cache_hit_ratio) * $0.000010
                  + input_tokens * cache_hit_ratio * $0.000001
                  + output_tokens * $0.000050)
                 * requests_per_day
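The formula maps directly onto a small function. This is a minimal sketch of the same calculation; like the formula, it ignores cache write costs, and the function and parameter names simply mirror the formula's variables.

```python
INPUT_USD = 10 / 1_000_000         # USD per uncached input token
CACHED_INPUT_USD = 1 / 1_000_000   # USD per cached input token
OUTPUT_USD = 50 / 1_000_000        # USD per output token


def daily_cost(input_tokens: int, output_tokens: int,
               requests_per_day: int, cache_hit_ratio: float = 0.0) -> float:
    """Estimated daily Fable 5 spend in USD for one workload.

    input_tokens and output_tokens are per-request averages;
    cache_hit_ratio is the share of input tokens served from cache.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    per_request = (input_tokens * (1 - cache_hit_ratio) * INPUT_USD
                   + input_tokens * cache_hit_ratio * CACHED_INPUT_USD
                   + output_tokens * OUTPUT_USD)
    return per_request * requests_per_day


# Single-file code review, 1,000 requests per day, with and without caching.
print(daily_cost(8_000, 1_500, 1_000))
print(daily_cost(8_000, 1_500, 1_000, cache_hit_ratio=0.5))
```

Because caching only discounts the input term, its effect shrinks as output grows: in this example a 50% hit ratio cuts the input cost nearly in half but leaves the output cost untouched.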
Benchmarks by task type:
| Task type | Typical input tokens | Typical output tokens | Est. cost/task (no cache) |
|---|---|---|---|
| Code review (single file) | 8,000 | 1,500 | $0.155 |
| Codebase migration (large context) | 80,000 | 12,000 | $1.40 |
| Chat / Q&A | 2,000 | 500 | $0.045 |
| Agentic loop (10 turns) | 50,000 total | 15,000 total | $1.25 |
| Document summarization | 15,000 | 2,000 | $0.25 |
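These are rough estimates. Actual token counts vary significantly with prompt design. The key variable is output token count: at $50/M, an output token costs five times as much as an uncached input token, so output becomes the larger share of the bill once a response exceeds about one-fifth of the prompt length.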
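To check how each row of the table splits between input and output, the sketch below recomputes the per-task cost from the table's token counts at list price with no caching. The dictionary keys are shorthand labels for the table rows.

```python
INPUT_USD_PER_M = 10.00
OUTPUT_USD_PER_M = 50.00

# (input tokens, output tokens) per task, copied from the benchmark table.
TASKS = {
    "code_review": (8_000, 1_500),
    "codebase_migration": (80_000, 12_000),
    "chat_qa": (2_000, 500),
    "agentic_loop_10_turns": (50_000, 15_000),
    "document_summarization": (15_000, 2_000),
}


def cost_breakdown(input_tokens: int, output_tokens: int) -> dict:
    """Total cost in USD and the fraction of it that comes from output tokens."""
    input_cost = input_tokens * INPUT_USD_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_M / 1_000_000
    total = input_cost + output_cost
    return {"total_usd": total, "output_share": output_cost / total}


for name, (inp, out) in TASKS.items():
    result = cost_breakdown(inp, out)
    print(f"{name}: ${result['total_usd']:.3f}, output share {result['output_share']:.0%}")
```

Running this against your own measured token counts, rather than the table's typical values, shows whether trimming prompts or capping response length is the better lever for a given workload.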
Where to validate: Run a 50-request sample through your actual prompts with token counting enabled before routing production volume. Anthropic's API returns token counts per response; capture them for a week before making model routing decisions.
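One way to summarize a validation sample is sketched below. It assumes each captured response has been stored as a dictionary with a `usage` object holding `input_tokens` and `output_tokens`, which matches the shape of the usage data the Anthropic Messages API returns; the sample records are made up for illustration, and the cost uses Fable 5 list prices with no caching.

```python
from statistics import mean

INPUT_USD_PER_M = 10.00
OUTPUT_USD_PER_M = 50.00


def summarize_usage(responses: list) -> dict:
    """Average token counts and list-price cost per request across a sample."""
    inputs = [r["usage"]["input_tokens"] for r in responses]
    outputs = [r["usage"]["output_tokens"] for r in responses]
    mean_input = mean(inputs)
    mean_output = mean(outputs)
    mean_cost = (mean_input * INPUT_USD_PER_M + mean_output * OUTPUT_USD_PER_M) / 1_000_000
    return {
        "requests": len(responses),
        "mean_input_tokens": mean_input,
        "mean_output_tokens": mean_output,
        "mean_cost_usd": mean_cost,
    }


# Made-up sample; in practice, load the responses you logged during the trial.
sample = [
    {"usage": {"input_tokens": 8_000, "output_tokens": 1_500}},
    {"usage": {"input_tokens": 2_000, "output_tokens": 500}},
]
print(summarize_usage(sample))
```

The resulting averages are the inputs the cost formula needs, so a logged sample replaces the table's typical values with numbers from your own prompts.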
$10 per million input tokens and $50 per million output tokens. Prompt caching reduces the effective input cost to $1 per million tokens on cached prefixes. This is double the price of Claude Opus 4.8 at $5/$25 per million tokens.
For long-horizon agentic tasks - codebase migrations, multi-step research, complex coding problems - yes. Fable 5 scores 80.3% on SWE-Bench Pro versus 69.2% for Opus 4.8, and early benchmarks suggest it uses fewer tokens on tasks it handles well, partially closing the cost gap. For high-volume, short-context, well-defined tasks, Opus 4.8 or Sonnet 4.6 are almost always the better economics.
For local Claude Code usage, AgentsView provides per-project cost treemaps with custom model pricing. For API usage at team scale, an AI gateway like TrueFoundry's provides per-team and per-application budget caps, virtual keys, and per-request cost logging. Set budget caps before routing production volume to Fable 5.
Yes. Fable 5 supports prompt caching with a 90% discount on cached input tokens, reducing the effective input rate from $10/M to $1/M on cache hits. Cache write costs are $12.50/M for a 5-minute cache and $20/M for a 1-hour cache. Caching is most valuable for large, stable system prompts reused across many requests.
The request is automatically routed to Claude Opus 4.8 and billed at Opus 4.8 rates. You are notified when this happens. Anthropic reports the fallback triggers in fewer than 5% of general sessions, but rates are higher for bio, chem, or cybersecurity-adjacent workloads. The Fallback API needs to be configured explicitly on the API; it is not fully automatic outside the Claude apps.
Fable 5 is included in Pro, Max, Team, and seat-based Enterprise plans at no extra charge through June 22, 2026. From June 23 it requires usage credits. Anthropic has stated intent to restore it as a standard plan feature when capacity allows, but has given no committed date.