Migrating to Claude Fable 5: The Practical Guide

Developers Digest • June 10, 2026 • 9 min read


TL;DR

Fable 5 is mostly a drop-in replacement for Opus 4.8, but 'mostly' is doing real work in that sentence. Here's every breaking change, what to delete from your code, and the prompt audit you should run before flipping the model ID.

Anthropic shipped Claude Fable 5 on June 9. It's a new tier above Opus: $10 per million input tokens, $50 per million output, 1M context window, 128K max output. The model ID is claude-fable-5.

If you're on Opus 4.8, migration is close to a one-line change. If you're on anything older, there's a stack of breaking changes between you and Fable 5, and one of them has a deadline this week: claude-sonnet-4-20250514 and claude-opus-4-20250514 retire on June 15, 2026.

Here's the full migration, in order.

The one-line version

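```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-fable-5",  # was: claude-opus-4-8
    max_tokens=32000,
    messages=[{"role": "user", "content": "..."}],
)
```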

This works for most Opus 4.8 codebases. Now here's everything that can break.

1. `thinking: disabled` is now a 400 error

This is the one new breaking change versus Opus 4.8. Fable 5 has exactly one thinking mode: adaptive, always on. There is no way to turn it off.

```python
# Opus 4.8: valid, runs without thinking
thinking={"type": "disabled"}

# Fable 5: 400 error. Delete the parameter entirely.
```

If you omit thinking, you get adaptive thinking. That's the only option. Search your codebase for "disabled" near any thinking config and remove it.

Two related carryovers from the Opus 4.7/4.8 surface, in case you skipped those releases: budget_tokens returns a 400, and so do temperature, top_p, and top_k. The effort parameter is the only depth control now.
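If request parameters are built in one place, a small filter can strip the rejected settings before they reach the API. This is a minimal sketch, assuming requests are assembled as plain dicts of keyword arguments; `clean_params_for_fable` is a hypothetical helper name, not part of any SDK.

```python
from copy import deepcopy

# Top-level sampling parameters the article says return a 400 on Fable 5.
REJECTED_TOP_LEVEL = ("temperature", "top_p", "top_k")


def clean_params_for_fable(params: dict) -> dict:
    """Return a copy of the request kwargs with Fable-5-rejected settings removed."""
    cleaned = deepcopy(params)
    for key in REJECTED_TOP_LEVEL:
        cleaned.pop(key, None)

    thinking = cleaned.get("thinking")
    if isinstance(thinking, dict):
        # Both "disabled" and budget_tokens are rejected. Omitting the
        # parameter entirely yields adaptive thinking.
        if thinking.get("type") == "disabled" or "budget_tokens" in thinking:
            del cleaned["thinking"]
    return cleaned


legacy = {
    "model": "claude-fable-5",
    "max_tokens": 4000,
    "temperature": 0.2,
    "thinking": {"type": "disabled"},
    "messages": [{"role": "user", "content": "..."}],
}
print(sorted(clean_params_for_fable(legacy)))
```

The helper copies instead of mutating, so the same legacy dict can still be sent to an older model unchanged.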

2. Raise `max_tokens` on workloads that ran without thinking

Because thinking is always on, and thinking tokens count against max_tokens, any workload you previously ran with thinking disabled now needs headroom it didn't need before.

A classification task that comfortably ran at max_tokens: 500 on Opus 4.8 with thinking off can now hit the cap mid-thought. Budget for thinking plus response text, not response text alone.
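One way to apply that rule consistently is to compute `max_tokens` from two separate budgets. The sketch below is illustrative only: the 4,000-token thinking headroom is an assumed starting point to tune against your own traffic, and the 128,000 cap is a reading of the "128K max output" figure above, not a confirmed exact limit.

```python
MAX_OUTPUT_TOKENS = 128_000  # assumption: "128K max output" taken as 128,000


def max_tokens_with_thinking(response_tokens: int, thinking_headroom: int = 4_000) -> int:
    """Budget for thinking plus response text, clamped to the output cap."""
    if response_tokens <= 0 or thinking_headroom < 0:
        raise ValueError("token budgets must be positive")
    return min(response_tokens + thinking_headroom, MAX_OUTPUT_TOKENS)


# The classification task that used to run at max_tokens=500:
print(max_tokens_with_thinking(500))  # 4500
```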

3. Start effort at `high`, not `xhigh`

See the full breakdown of effort levels if you need more than the summary below. Effort levels are low, medium, high, xhigh, and max, set via output_config:

```python
output_config={"effort": "high"}
```

Anthropic's own migration guidance is direct about this: even if you ran xhigh on Opus 4.8, start at high on Fable 5. Lower effort on Fable 5 often beats xhigh on prior models. Given that output costs $50 per million tokens, the difference between high and xhigh shows up on your bill fast. Reserve xhigh and max for work where capability genuinely matters more than cost.
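To make "start at high and benchmark before going higher" concrete, you can score each effort level on your own eval set and keep the lowest one that clears your quality bar. The function below is a sketch of that selection step only; the scores and the 0.90 threshold are made-up numbers, and `cheapest_passing_effort` is a hypothetical name.

```python
from typing import Dict, Optional

EFFORT_ORDER = ["low", "medium", "high", "xhigh", "max"]


def cheapest_passing_effort(
    scores: Dict[str, float], min_score: float, start: str = "high"
) -> Optional[str]:
    """Return the lowest effort at or above `start` whose score meets the bar."""
    for level in EFFORT_ORDER[EFFORT_ORDER.index(start):]:
        if level in scores and scores[level] >= min_score:
            return level
    return None  # nothing passed; revisit the prompt or the bar


# Hypothetical eval results from your own benchmark run.
measured = {"high": 0.91, "xhigh": 0.93, "max": 0.93}
print(cheapest_passing_effort(measured, min_score=0.90))  # high
```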

4. Handle the new refusal stop reason

Fable 5 runs safety classifiers on requests and during generation, targeting three categories: offensive cyber, biology and chemistry, and attempts to extract the model's raw reasoning. When a classifier fires, you get HTTP 200 with a new shape:

```json
{
  "stop_reason": "refusal",
  "stop_details": {
    "type": "refusal",
    "category": "cyber",
    "explanation": "..."
  }
}
```

The categories are "cyber", "bio", "reasoning_extraction", or null. Anthropic says fewer than 5% of sessions trigger a fallback, but the classifiers are tuned conservatively and benign work can trip them. Day-one reports include a base64 implementation flagged as cybersecurity and genome-alignment work force-routed away from Fable 5.

If your code doesn't check stop_reason, a refusal looks like a short, useless completion. At minimum, log it. In production, you want the fallback pattern: retry on Opus 4.8 automatically. That's a big enough topic that we wrote a separate guide to the new Fallback API. For the reasoning behind the classifiers themselves, see Fable 5's safeguards and refusal architecture; if you're running multiple agents, handling refusals across an agent fleet covers the retry patterns in more depth.
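As a rough illustration of the log-then-retry idea, here is a client-side sketch. It is not the Fallback API itself. It assumes a hypothetical `send` callable that wraps your client and returns the response as a dict shaped like the JSON above; the stand-in `fake_send` exists only so the example runs without network access.

```python
from typing import Callable

FALLBACK_MODEL = "claude-opus-4-8"


def create_with_fallback(send: Callable[..., dict], **params) -> dict:
    """Call `send`; on a refusal, log the category and retry on the fallback model."""
    response = send(**params)
    if response.get("stop_reason") != "refusal":
        return response
    details = response.get("stop_details") or {}
    print(f"refusal from {params.get('model')}: category={details.get('category')}")
    return send(**{**params, "model": FALLBACK_MODEL})


def fake_send(**params) -> dict:
    """Stand-in for a real API call: refuses on Fable 5, answers otherwise."""
    if params["model"] == "claude-fable-5":
        return {
            "stop_reason": "refusal",
            "stop_details": {"type": "refusal", "category": "cyber", "explanation": "..."},
        }
    return {"stop_reason": "end_turn", "model": params["model"]}


final = create_with_fallback(fake_send, model="claude-fable-5", max_tokens=1000, messages=[])
print(final["model"])  # claude-opus-4-8
```

The retry is a single attempt on purpose: if the fallback model also refuses, the caller gets that refusal back and can decide what to do with it.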

The billing rule worth knowing: a request refused before any output is generated is not billed and doesn't count against rate limits.

5. Audit your prompts for "show your reasoning"

This one will bite teams with mature prompt libraries. The reasoning_extraction classifier targets attempts to pull out the model's chain of thought, and old prompts are full of phrases like "show your reasoning step by step" and "explain your thought process before answering."

On Fable 5, those instructions can trigger refusals or elevated fallback rates. The model's raw chain of thought is never returned anyway: thinking.display defaults to "omitted", and the most you can get is "summarized". So those instructions buy you nothing and cost you reliability.

Grep your prompts, skills, and CLAUDE.md files for reasoning-extraction language and cut it. If you need visibility into the model's process, read the summarized thinking blocks instead.
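If you'd rather script the audit than grep by hand, something like the following can flag candidate lines. It is a sketch: the two patterns come from the phrases quoted above and are only a starting point, so extend the list with whatever wording your own prompt library uses.

```python
import re
from typing import List, Tuple

# Starting patterns taken from the phrases quoted in this section.
REASONING_PATTERNS = re.compile(
    r"show your reasoning|explain your thought process",
    re.IGNORECASE,
)


def find_reasoning_requests(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that ask the model to expose its reasoning."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if REASONING_PATTERNS.search(line):
            hits.append((lineno, line.strip()))
    return hits


prompt = "You are a code reviewer.\nShow your reasoning step by step.\nReturn JSON."
print(find_reasoning_requests(prompt))
```

To cover a repository, read each prompt, skill, or CLAUDE.md file and pass its contents through the same function.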

6. Re-baseline your costs, not just your code

Two facts to hold at once:

- Fable 5 is exactly 2x Opus 4.8 pricing across the board, including cache writes and batch.
- Token counts are roughly unchanged from Opus 4.8 (same tokenizer), and early production reports suggest Fable 5 uses substantially fewer tokens to finish the same agentic work. One pre-launch tester measured roughly half the tokens of Opus 4.8 in agentic harnesses.

So your real cost delta is workload-dependent and probably less than 2x on long agentic tasks. Don't guess. Run a week of representative traffic and compare actual spend, not unit prices. For a deeper walkthrough of the math, see production cost modeling for Fable 5.
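The arithmetic is simple enough to sketch. The Fable 5 prices are the ones quoted at the top of this guide; the Opus 4.8 prices are derived from the "exactly 2x" claim rather than taken from a price sheet, and the token counts are invented to show the break-even case. Caching and batch discounts are ignored here.

```python
# USD per million tokens: (input, output).
PRICES_PER_MTOK = {
    "claude-fable-5": (10.00, 50.00),
    "claude-opus-4-8": (5.00, 25.00),  # assumption: half of Fable 5, per "exactly 2x"
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload at list price, without caching or batch discounts."""
    in_price, out_price = PRICES_PER_MTOK[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical agentic task: Fable 5 finishes in half the tokens.
opus = cost_usd("claude-opus-4-8", input_tokens=2_000_000, output_tokens=400_000)
fable = cost_usd("claude-fable-5", input_tokens=1_000_000, output_tokens=200_000)
print(opus, fable)  # 20.0 20.0
```

At half the tokens the 2x unit price cancels out exactly. Any workload where Fable 5 saves fewer tokens than that costs more, which is why measured traffic matters more than the price sheet.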

Two pricing notes that help: the full 1M context window bills at standard rates with no long-context premium, and the prompt-cache minimum drops to 512 tokens on the Claude API (it stays 1,024 on Bedrock). Short system prompts that never cached before now do.

7. Trim your prompts. Seriously.

Anthropic's Fable 5 prompting guide makes a point that's easy to skim past: old, prescriptive prompts can make Fable 5 worse. Instruction following is strong enough that brief instructions beat enumerated rule lists, and skill files written to keep weaker models on rails now read as constraints that degrade output.

The shift in one line: give it objectives, not task lists. If your CLAUDE.md spells out a 14-step procedure for something Fable 5 can figure out, the procedure is now the bottleneck. There is a full walkthrough of this rewrite process in rewriting your prompts and skills for Fable 5.

While you're in there: single requests can run many minutes at high effort, so raise client timeouts, stream responses, and treat long runs as async jobs rather than blocking calls.
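Treating a long run as a job can be as light as handing the call to a worker pool and collecting the result later. This is a sketch using only the standard library; `slow_call` is a stand-in for a real request, and the 600-second ceiling is an arbitrary example value.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


def submit_long_run(pool: ThreadPoolExecutor, call: Callable[..., dict], **params) -> Future:
    """Queue a long-running request and return immediately with a Future."""
    return pool.submit(call, **params)


def slow_call(**params) -> dict:
    """Stand-in for a high-effort request that takes a while to finish."""
    time.sleep(0.1)
    return {"stop_reason": "end_turn", "model": params["model"]}


with ThreadPoolExecutor(max_workers=4) as pool:
    job = submit_long_run(pool, slow_call, model="claude-fable-5")
    # The caller is free to do other work here instead of blocking.
    result = job.result(timeout=600)  # generous ceiling for long runs
    print(result["stop_reason"])
```

In a real service the same shape usually moves to a task queue, but the principle holds: the request handler returns right away and something else waits on the model.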

Coming from Opus 4.7 or earlier

Apply the generations in order. Each one has its own breaking changes:

From 4.7 to 4.8: nothing breaks. Swap the ID and re-tune prompts.

From 4.6 to 4.7: remove temperature, top_p, top_k (all 400 now). Replace budget_tokens thinking with adaptive thinking plus effort. The 4.7 tokenizer produces up to 30-35% more tokens for the same text, so re-run count_tokens on your prompts and raise max_tokens and any compaction triggers accordingly.

From 4.5 or earlier to 4.6: assistant prefills return a 400 (replace with structured outputs via output_config.format). Remove beta headers that went GA (effort-2025-11-24 and friends). Stream anything above roughly 16K max_tokens. Handle the refusal and model_context_window_exceeded stop reasons.
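One way to keep "apply the generations in order" honest is to encode the hops as data and build your to-do list from wherever you are starting. The steps below are transcribed from the list above plus the Fable 5 items earlier in this guide; the structure and the `steps_from` name are illustrative, not an official tool.

```python
from typing import List

# (starting generation, breaking changes to clear before the next hop)
HOPS = [
    ("4.5", [
        "replace assistant prefills with output_config.format",
        "remove beta headers that went GA",
        "stream requests above roughly 16K max_tokens",
        "handle refusal and model_context_window_exceeded stop reasons",
    ]),
    ("4.6", [
        "remove temperature, top_p, top_k",
        "replace budget_tokens with adaptive thinking plus effort",
        "re-run count_tokens; raise max_tokens and compaction triggers",
    ]),
    ("4.7", ["swap the model ID and re-tune prompts"]),
    ("4.8", [
        "delete thinking: disabled",
        "raise max_tokens on non-thinking workloads",
        "start effort at high",
    ]),
]


def steps_from(version: str) -> List[str]:
    """Return every step, in order, from `version` through to Fable 5."""
    versions = [v for v, _ in HOPS]
    if version not in versions:
        raise ValueError(f"unknown starting version: {version}")
    start = versions.index(version)
    return [step for _, steps in HOPS[start:] for step in steps]


for step in steps_from("4.6"):
    print("-", step)
```

Use "4.5" as the starting point for anything older, since the first hop covers 4.5 and earlier.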

If you're on the retiring claude-sonnet-4-20250514 or claude-opus-4-20250514, you have until June 15. That's not a Fable 5 decision, it's a "your API calls stop working" decision. Move to claude-sonnet-4-6 or claude-opus-4-8 first, then evaluate Fable 5 from stable ground.

The checklist

- Swap model ID to claude-fable-5
- Delete any thinking: {type: "disabled"}
- Raise max_tokens on previously non-thinking workloads
- Set effort to high, benchmark before going higher
- Handle stop_reason: "refusal" and stop_details.category
- Grep prompts for "show your reasoning" language and remove it
- Run a cost comparison on real traffic before committing
- Cut prescriptive procedures from prompts and skills
- Raise client timeouts and stream long requests

One more date for your calendar: Fable 5 is included free on Pro, Max, Team, and seat-based Enterprise plans only through June 22. From June 23 it requires usage credits. If you're evaluating on a subscription, this is the week to do it.

FAQ

Is migrating from Opus 4.8 to Fable 5 really a one-line change?

For most codebases, swapping the model string is enough to get a working response. Whether it's a good response depends on the seven items above: thinking: disabled becomes a 400 error, max_tokens may need raising, effort should start at high rather than xhigh, and the new refusal stop reason needs handling before you can trust the integration in production.

What happens if I don't migrate off `claude-sonnet-4-20250514` or `claude-opus-4-20250514`?

Those model IDs retire on June 15, 2026. That's a hard stop on API calls, not a Fable 5 decision. Move to claude-sonnet-4-6 or claude-opus-4-8 first, confirm stability, then evaluate Fable 5 separately.

Why does Fable 5 refuse requests that Opus 4.8 answered fine?

Fable 5 runs safety classifiers targeting offensive cyber, biology/chemistry, and reasoning-extraction categories. Anthropic says under 5% of sessions trigger a fallback, but the classifiers are tuned conservatively, so benign work (like base64 handling or genome-alignment code) can trip them. See Fable 5's safeguards and refusal architecture for how the categories work.

Will Fable 5 cost more than Opus 4.8 for the same workload?

Unit pricing is exactly 2x Opus 4.8, but token counts on agentic tasks are often lower because Fable 5 needs fewer tokens to finish the same work. The real delta is workload-dependent; run a week of representative traffic before drawing conclusions, and see production cost modeling for Fable 5 for the full math.

Sources: Anthropic's migration guide, models overview, Prompting Claude Fable 5, and model deprecations.


