TL;DR
Fable 5 is mostly a drop-in replacement for Opus 4.8, but 'mostly' is doing real work in that sentence. Here's every breaking change, what to delete from your code, and the prompt audit you should run before flipping the model ID.
Anthropic shipped Claude Fable 5 on June 9. It's a new tier above Opus: $10 per million input tokens, $50 per million output, 1M context window, 128K max output. The model ID is claude-fable-5.
If you're on Opus 4.8, migration is close to a one-line change. If you're on anything older, there's a stack of breaking changes between you and Fable 5, and one of them has a deadline this week: claude-sonnet-4-20250514 and claude-opus-4-20250514 retire on June 15, 2026.
Here's the full migration, in order.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-fable-5",  # was: claude-opus-4-8
    max_tokens=32000,
    messages=[{"role": "user", "content": "..."}],
)
This works for most Opus 4.8 codebases. Now here's everything that can break.
thinking: disabled is now a 400 error
This is the one new breaking change versus Opus 4.8. Fable 5 has exactly one thinking mode: adaptive, always on. There is no way to turn it off.
# Opus 4.8: valid, runs without thinking
thinking={"type": "disabled"}
# Fable 5: 400 error. Delete the parameter entirely.
If you omit thinking, you get adaptive thinking. That's the only option. Search your codebase for "disabled" near any thinking config and remove it.
Two related carryovers from the Opus 4.7/4.8 surface, in case you skipped those releases: budget_tokens returns a 400, and so do temperature, top_p, and top_k. The effort parameter is the only depth control now.
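A minimal sketch of that cleanup, assuming your request arguments are assembled as a plain dict before they reach client.messages.create. The helper name strip_unsupported_params is hypothetical, and dropping the whole thinking key (rather than rewriting it) is an assumption based on the statement above that omitting thinking yields adaptive thinking.

```python
# Hypothetical helper: not part of the Anthropic SDK.
REMOVED_SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def strip_unsupported_params(kwargs: dict) -> dict:
    """Return a copy of the request kwargs without parameters that now return a 400."""
    cleaned = {k: v for k, v in kwargs.items() if k not in REMOVED_SAMPLING_PARAMS}
    thinking = cleaned.get("thinking")
    if isinstance(thinking, dict):
        # Assumption: removing the key is enough, because an omitted
        # `thinking` parameter falls back to adaptive thinking.
        if thinking.get("type") == "disabled" or "budget_tokens" in thinking:
            del cleaned["thinking"]
    return cleaned


legacy = {
    "model": "claude-fable-5",
    "max_tokens": 32000,
    "temperature": 0.2,
    "thinking": {"type": "disabled"},
    "messages": [{"role": "user", "content": "..."}],
}
request = strip_unsupported_params(legacy)
print(sorted(request))
```

Running every outgoing request through one function like this gives you a single place to delete once the old call sites are cleaned up.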
Raise max_tokens on workloads that ran without thinking
Because thinking is always on, and thinking tokens count against max_tokens, any workload you previously ran with thinking disabled now needs headroom it didn't need before.
A classification task that comfortably ran at max_tokens: 500 on Opus 4.8 with thinking off can now hit the cap mid-thought. Budget for thinking plus response text, not response text alone.
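One rough way to size the new cap is to add an explicit thinking allowance to the response budget and to detect truncation from the stop reason. The function names and the 4,000-token default allowance are illustrative assumptions, not Anthropic guidance, and the sketch assumes the response has already been converted to a dict. The 128,000 ceiling is the max output figure quoted earlier in this article.

```python
MAX_OUTPUT_TOKENS = 128_000  # Fable 5 max output, per the figures quoted above


def max_tokens_with_headroom(response_tokens: int, thinking_allowance: int = 4_000) -> int:
    """Budget for thinking plus response text, capped at the model's max output."""
    if response_tokens <= 0 or thinking_allowance < 0:
        raise ValueError("response_tokens must be positive and thinking_allowance non-negative")
    return min(response_tokens + thinking_allowance, MAX_OUTPUT_TOKENS)


def was_truncated(response: dict) -> bool:
    """True when generation stopped because it hit the max_tokens cap."""
    return response.get("stop_reason") == "max_tokens"


print(max_tokens_with_headroom(500))
```

Treat the allowance as a starting guess: if was_truncated keeps firing on a workload, raise it for that workload rather than globally.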
Start at high, not xhigh
Effort levels are low, medium, high, xhigh, and max, set via output_config:
output_config={"effort": "high"}
Anthropic's own migration guidance is direct about this: even if you ran xhigh on Opus 4.8, start at high on Fable 5. Lower effort on Fable 5 often beats xhigh on prior models. Given that output costs $50 per million tokens, the difference between high and xhigh shows up on your bill fast. Reserve xhigh and max for work where capability genuinely matters more than cost.
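If effort is set in many places, a small wrapper can keep the default at high and reject typos before they reach the API. This is a sketch only: with_effort is a hypothetical helper, and it assumes request arguments are built as a plain dict.

```python
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")


def with_effort(kwargs: dict, effort: str = "high") -> dict:
    """Return a copy of the request kwargs with output_config.effort set."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    updated = dict(kwargs)
    # Preserve any other output_config keys the caller already set.
    updated["output_config"] = {**kwargs.get("output_config", {}), "effort": effort}
    return updated


request = with_effort({"model": "claude-fable-5", "max_tokens": 32000})
print(request["output_config"])
```

Defaulting to high in one place also makes it easy to benchmark xhigh on a single workload without touching the rest.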
Fable 5 runs safety classifiers on requests and during generation, targeting three categories: offensive cyber, biology and chemistry, and attempts to extract the model's raw reasoning. When a classifier fires, you get HTTP 200 with a new shape:
{
  "stop_reason": "refusal",
  "stop_details": {
    "type": "refusal",
    "category": "cyber",
    "explanation": "..."
  }
}
The categories are "cyber", "bio", "reasoning_extraction", or null. Anthropic says fewer than 5% of sessions trigger a fallback, but the classifiers are tuned conservatively, and benign work trips them. Day-one reports include a base64 implementation flagged as cybersecurity and genome-alignment work force-routed away from Fable.
If your code doesn't check stop_reason, a refusal looks like a short, useless completion. At minimum, log it. In production, you want the fallback pattern: retry on Opus 4.8 automatically. That's a big enough topic that we wrote a separate guide to the new Fallback API.
The billing rule worth knowing: a request refused before any output is generated is not billed and doesn't count against rate limits.
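Here is a minimal client-side sketch of the check-log-retry pattern. It is a manual retry, not the Fallback API mentioned above, and it makes several assumptions: responses are plain dicts with the shape shown earlier, send is whatever function you already use to call the API, and fake_send is a stand-in so the example runs offline.

```python
from typing import Callable, Optional

FALLBACK_MODEL = "claude-opus-4-8"  # the Opus 4.8 ID used earlier in this article


def refusal_category(response: dict) -> Optional[str]:
    """Return the refusal category, "unknown" if it is null, or None if not refused."""
    if response.get("stop_reason") != "refusal":
        return None
    details = response.get("stop_details") or {}
    return details.get("category") or "unknown"


def create_with_fallback(send: Callable[[dict], dict], request: dict) -> dict:
    """Send a request; on refusal, log it and retry once on the fallback model."""
    response = send(request)
    category = refusal_category(response)
    if category is None:
        return response
    print(f"refusal from {request['model']}: category={category}")
    return send({**request, "model": FALLBACK_MODEL})


def fake_send(request: dict) -> dict:
    """Stand-in for a real API call: Fable 5 always refuses, anything else succeeds."""
    if request["model"] == "claude-fable-5":
        return {
            "stop_reason": "refusal",
            "stop_details": {"type": "refusal", "category": "cyber", "explanation": "..."},
        }
    return {"stop_reason": "end_turn", "model": request["model"]}


result = create_with_fallback(fake_send, {"model": "claude-fable-5", "messages": []})
print(result["model"])
```

Passing send in as an argument keeps the retry logic testable without network access; in real code you would swap print for your logger and record the category as a metric.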
This one will bite teams with mature prompt libraries. The reasoning_extraction classifier targets attempts to pull out the model's chain of thought, and old prompts are full of phrases like "show your reasoning step by step" and "explain your thought process before answering."
On Fable 5, those instructions can trigger refusals or elevated fallback rates. The model's raw chain of thought is never returned anyway: thinking.display defaults to "omitted", and the most you can get is "summarized". So those instructions buy you nothing and cost you reliability.
Grep your prompts, skills, and CLAUDE.md files for reasoning-extraction language and cut it. If you need visibility into the model's process, read the summarized thinking blocks instead.
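The audit can be scripted. The sketch below flags lines that match the two phrases quoted above; the regular expressions are assumptions seeded from those phrases, not a list published by Anthropic, so extend them to match the wording in your own prompt library.

```python
import re
from typing import List, Tuple

# Assumed patterns, derived from the two example phrases in this article.
REASONING_EXTRACTION_PATTERNS = [
    re.compile(r"show\s+your\s+reasoning", re.IGNORECASE),
    re.compile(r"explain\s+your\s+thought\s+process", re.IGNORECASE),
]


def audit_prompt(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs that match any flagged pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in REASONING_EXTRACTION_PATTERNS):
            hits.append((number, line.strip()))
    return hits


sample = "You are a code reviewer.\nShow your reasoning step by step.\nReturn JSON only."
for number, line in audit_prompt(sample):
    print(f"line {number}: {line}")
```

To cover a repository, read each prompt, skill, or CLAUDE.md file and pass its contents to audit_prompt.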
Two facts to hold at once:
So your real cost delta is workload-dependent and probably less than 2x on long agentic tasks. Don't guess. Run a week of representative traffic and compare actual spend, not unit prices.
Two pricing notes that help: the full 1M context window bills at standard rates with no long-context premium, and the prompt-cache minimum drops to 512 tokens on the Claude API (it stays 1,024 on Bedrock). Short system prompts that never cached before now do.
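Comparing actual spend comes down to simple arithmetic over your usage logs. In the sketch below, the Fable 5 prices are the list prices quoted in this article; the Opus 4.8 prices are an assumption (half the Fable 5 list price), and the weekly token totals are made up purely to show the calculation. Cache discounts are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per million input tokens
    output_per_million: float  # USD per million output tokens


FABLE_5 = Pricing(10.0, 50.0)  # list prices quoted in this article
OPUS_4_8 = Pricing(5.0, 25.0)  # assumption: half of Fable 5's list price


def spend_usd(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Total spend in USD for the given token counts."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Hypothetical weekly totals; replace with numbers from your own usage logs.
opus_week = spend_usd(OPUS_4_8, input_tokens=400_000_000, output_tokens=60_000_000)
fable_week = spend_usd(FABLE_5, input_tokens=300_000_000, output_tokens=40_000_000)
ratio = fable_week / opus_week
print(f"Opus 4.8: ${opus_week:,.2f}  Fable 5: ${fable_week:,.2f}  ratio: {ratio:.2f}x")
```

The ratio depends entirely on how token counts differ between the two runs, which is why measured traffic matters more than the price sheet.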
Anthropic's Fable 5 prompting guide makes a point that's easy to skim past: old, prescriptive prompts can make Fable 5 worse. Instruction following is strong enough that brief instructions beat enumerated rule lists, and skill files written to keep weaker models on rails now read as constraints that degrade output.
The shift in one line: give it objectives, not task lists. If your CLAUDE.md spells out a 14-step procedure for something Fable 5 can figure out, the procedure is now the bottleneck.
While you're in there: single requests can run many minutes at high effort, so raise client timeouts, stream responses, and treat long runs as async jobs rather than blocking calls.
Apply the generations in order. Each one has its own breaking changes:
From 4.7 to 4.8: nothing breaks. Swap the ID and re-tune prompts.
From 4.6 to 4.7: remove temperature, top_p, top_k (all 400 now). Replace budget_tokens thinking with adaptive thinking plus effort. The 4.7 tokenizer produces up to 30-35% more tokens for the same text, so re-run count_tokens on your prompts and raise max_tokens and any compaction triggers accordingly.
From 4.5 or earlier to 4.6: assistant prefills return a 400 (replace with structured outputs via output_config.format). Remove beta headers that went GA (effort-2025-11-24 and friends). Stream anything above roughly 16K max_tokens. Handle the refusal and model_context_window_exceeded stop reasons.
If you're on the retiring claude-sonnet-4-20250514 or claude-opus-4-20250514, you have until June 15. That's not a Fable 5 decision, it's a "your API calls stop working" decision. Move to claude-sonnet-4-6 or claude-opus-4-8 first, then evaluate Fable 5 from stable ground.
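The generation-by-generation order above can be encoded as data so a script can print the steps that apply to a given starting point. This is an illustrative sketch: the structure and the migration_steps name are hypothetical, "4.5" stands for 4.5 or earlier, and the step text is condensed from this article.

```python
from typing import List

# Ordered oldest to newest: (from_version, to_version, steps).
GENERATIONS = [
    ("4.5", "4.6", [
        "replace assistant prefills with structured outputs",
        "remove beta headers that went GA",
        "stream anything above roughly 16K max_tokens",
        "handle refusal and model_context_window_exceeded stop reasons",
    ]),
    ("4.6", "4.7", [
        "remove temperature, top_p, top_k",
        "replace budget_tokens with adaptive thinking plus effort",
        "re-run count_tokens and raise max_tokens for the new tokenizer",
    ]),
    ("4.7", "4.8", ["swap the model ID and re-tune prompts"]),
    ("4.8", "fable-5", [
        "delete thinking disabled configs",
        "raise max_tokens on workloads that ran without thinking",
        "start at effort high",
        "check stop_details.category on refusals",
    ]),
]


def migration_steps(current: str) -> List[str]:
    """Return every step from `current` up to Fable 5, in the order to apply them."""
    versions = [src for src, _, _ in GENERATIONS]
    if current not in versions:
        raise ValueError(f"unknown starting version: {current!r}")
    steps = []
    for src, dst, items in GENERATIONS[versions.index(current):]:
        steps.extend(f"{src} -> {dst}: {item}" for item in items)
    return steps


for step in migration_steps("4.7"):
    print(step)
```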
Swap the model ID to claude-fable-5
Remove thinking: {type: "disabled"}
Raise max_tokens on previously non-thinking workloads
Start at effort high, benchmark before going higher
Handle stop_reason: "refusal" and stop_details.category
One more date for your calendar: Fable 5 is included free on Pro, Max, Team, and seat-based Enterprise plans only through June 22. From June 23 it requires usage credits. If you're evaluating on a subscription, this is the week to do it.
Sources: Anthropic's migration guide, models overview, Prompting Claude Fable 5, and model deprecations.