The Fable 5 Moment
TL;DR
A practical playbook for running Claude Fable 5 as the orchestrator over Sonnet and Haiku workers, with verified cost math on when the premium pays off.
Last updated: June 11, 2026
Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens - double Opus 4.8 and ten times Haiku 4.5, per Anthropic's pricing page. Pointing it at every task in a multi-agent fleet is the fastest way to turn a useful model into a budget problem. But one seat consistently earns the premium: the orchestrator.
Anthropic's own Fable 5 prompting guide calls out delegation as a headline improvement: Fable 5 is "significantly more dependable at dispatching and sustaining parallel subagents, and reliably manages ongoing communication with long-running subagents and peer agents." That is the orchestrator job description. This playbook covers putting Fable 5 in that seat, routing the actual work to Sonnet and Haiku, and the math on when that beats both an all-frontier fleet and an all-cheap one.
In any fan-out architecture, the orchestrator makes the decisions that compound: how to decompose the task, which worker gets which slice, and what to do when results conflict. A planning mistake at the top multiplies across every worker downstream. A worker mistake stays local and is cheap to retry.
That asymmetry is the whole argument. You pay the frontier rate where errors compound and the commodity rate where they do not. Anthropic's own cost optimization guidance says it plainly: "Choose Haiku for simple tasks, Sonnet for most production workloads, and Opus for the most complex reasoning." Fable 5 now sits above Opus in that last bucket, and the launch announcement claims "the longer and more complex the task, the larger Fable 5's lead over our other models." Orchestration runs are exactly that profile. If you have not picked a top-tier model yet, our Fable 5 vs Opus 4.8 decision guide covers the head-to-head.
Verified pricing and specs from the models overview and pricing docs, accessed June 11, 2026:
| Tier | Model | Input / Output per MTok | Context | Best for |
|---|---|---|---|---|
| Orchestrator | Fable 5 | $10 / $50 | 1M | Decomposition, dispatch, conflict resolution, final synthesis |
| Escalation | Opus 4.8 | $5 / $25 | 1M | Hard worker tasks, plus anything in Fable's safeguarded domains |
| Workhorse | Sonnet 4.6 | $3 / $15 | 1M | Implementation legs: edits, tests, multi-file changes |
| Scout | Haiku 4.5 | $1 / $5 | 200K | Search, classification, summarization, read-only exploration |
Three constraints worth internalizing before you wire this up:
Assume a quality audit fanned out across 12 workers. The orchestrator reads the task, a repo map, and all worker summaries (150K input) and produces dispatch briefs plus a final report (20K output). Each worker reads a 60K slice and returns an 8K summary. The volumes are illustrative assumptions; the per-token prices are verified.
Orchestrator on Fable 5: 150K x $10/M + 20K x $50/M = $1.50 + $1.00 = $2.50
Worker fleet, per configuration:
| Configuration | Per worker | 12 workers | Run total |
|---|---|---|---|
| All Fable 5 (workers too) | $1.00 | $12.00 | $14.50 |
| Fable 5 + Opus 4.8 workers | $0.50 | $6.00 | $8.50 |
| Fable 5 + Sonnet 4.6 workers | $0.30 | $3.60 | $6.10 |
| Fable 5 + Haiku 4.5 workers | $0.10 | $1.20 | $3.70 |
| All Sonnet 4.6 (orchestrator too) | $0.30 | $3.60 | $4.35 |
Two readings of that table matter:
The leverage is in the worker tier. Downgrading the orchestrator from Fable 5 to Sonnet saves $1.75. Downgrading the workers saves $8.40. The mixed fleet runs 58% cheaper than all-Fable, and Fable-plus-Haiku runs 74% cheaper. The expensive seat worth keeping is the one that decides what everyone else does.
The orchestrator premium is small in absolute terms. Going from all-Sonnet to Fable-at-the-top costs $1.75 extra on this run. If a smarter decomposition saves one botched worker pass and the human time to notice it, it has paid for itself. If your fan-outs are trivially parallel with no judgment calls (lint 500 files, summarize 200 tickets), the premium buys nothing - keep Sonnet or Haiku in charge and batch it.
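The run totals in the table can be reproduced with a few lines of Python. This is a minimal sketch under the same illustrative volumes (150K/20K for the orchestrator, 60K/8K per worker, 12 workers); the dictionary keys are shorthand labels chosen for this example, not API model IDs, and the function names are hypothetical.

```python
# Prices per million tokens as (input, output), copied from the tier table.
# The keys are shorthand labels for this sketch, not real API model IDs.
PRICES = {
    "fable-5": (10.0, 50.0),
    "opus-4.8": (5.0, 25.0),
    "sonnet-4.6": (3.0, 15.0),
    "haiku-4.5": (1.0, 5.0),
}


def call_cost(model, input_tokens, output_tokens):
    """Dollar cost of one call at list price, with no caching or batching."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def run_cost(orchestrator, worker, workers=12):
    """Total for one fan-out run using the volumes assumed in the worked example."""
    orchestrator_cost = call_cost(orchestrator, 150_000, 20_000)
    fleet_cost = workers * call_cost(worker, 60_000, 8_000)
    return round(orchestrator_cost + fleet_cost, 2)


if __name__ == "__main__":
    configs = [
        ("fable-5", "fable-5"),
        ("fable-5", "opus-4.8"),
        ("fable-5", "sonnet-4.6"),
        ("fable-5", "haiku-4.5"),
        ("sonnet-4.6", "sonnet-4.6"),
    ]
    for orchestrator, worker in configs:
        print(f"{orchestrator} over {worker}: ${run_cost(orchestrator, worker):.2f}")
```

Swapping in your own token volumes is the useful part: the ranking of configurations holds only as long as worker tokens dominate orchestrator tokens, which is worth checking against real traces.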
Prompt caching tightens this further. Cache reads bill at 0.1x base input, with 5-minute writes at 1.25x. If 40K of each worker's 60K input is a shared prefix (conventions doc, repo map), the Sonnet worker fleet's input cost drops from $2.16 to about $1.00, taking the mixed-fleet run from $6.10 to roughly $4.94. The prompting guide also notes long-lived subagents "save time and cost through cache reads" - reuse workers across subtasks instead of cold-starting them. For deeper modeling, see Fable 5 production cost modeling and our cost-per-task analysis.
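The caching figure above follows from the quoted multipliers. The sketch below assumes the best case: exactly one worker pays the 1.25x cache write and the other eleven read the prefix at 0.1x within the 5-minute window. If workers launch truly in parallel against a cold cache, more than one may pay the write rate, so treat the result as a lower bound. The function name and parameters are hypothetical.

```python
def cached_fleet_input_cost(
    workers,
    prefix_tokens,
    unique_tokens,
    input_price_per_mtok,
    write_mult=1.25,   # 5-minute cache write multiplier quoted in the text
    read_mult=0.10,    # cache read multiplier quoted in the text
):
    """Fleet input cost when one worker writes a shared prefix and the rest read it."""
    per_token = input_price_per_mtok / 1_000_000
    first_write = prefix_tokens * per_token * write_mult
    later_reads = (workers - 1) * prefix_tokens * per_token * read_mult
    uncached_tail = workers * unique_tokens * per_token
    return first_write + later_reads + uncached_tail


if __name__ == "__main__":
    uncached = 12 * 60_000 * 3.0 / 1_000_000
    cached = cached_fleet_input_cost(12, 40_000, 20_000, 3.0)
    print(f"Sonnet fleet input, uncached: ${uncached:.2f}")
    print(f"Sonnet fleet input, cached:   ${cached:.3f}")
    print(f"Mixed-fleet run total:        ${6.10 - uncached + cached:.2f}")
```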
The routing primitives are first-class in Claude Code. Per the subagents docs, every subagent definition takes a model field accepting sonnet, opus, haiku, fable, a full model ID, or inherit (the default):
---
name: code-scout
description: Read-only repo exploration and file discovery
tools: Read, Glob, Grep
model: haiku
---
You search and summarize. Return file paths, signatures, and a short
summary. Never edit.
Run the main session on Fable 5 and it becomes the orchestrator by default; every subagent runs on whatever its definition pins. Claude Code's built-in Explore subagent already follows this pattern - it is pinned to Haiku for fast, read-only codebase search.
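Because `inherit` is the default, a subagent with no pinned model runs on the session's model, which under a Fable 5 session means Fable rates. A small audit script can catch that before it shows up on a bill. This is a minimal sketch: it assumes definitions are Markdown files with `---` delimited frontmatter like the example above, it reads only simple `key: value` lines rather than full YAML, and the directory you pass in is your own choice (the function names are hypothetical).

```python
from pathlib import Path


def frontmatter_model(text):
    """Return the `model` value from a leading '---' frontmatter block, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model":
            return value.strip() or None
    return None


def unpinned_agents(agent_dir):
    """List definition files whose model is missing or explicitly set to inherit."""
    flagged = []
    for path in sorted(Path(agent_dir).glob("*.md")):
        model = frontmatter_model(path.read_text(encoding="utf-8"))
        if model in (None, "inherit"):
            flagged.append(path.name)
    return flagged
```

Running `unpinned_agents()` over your definitions folder in CI and failing on a non-empty result is one way to make the routing decision explicit for every worker.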
For bigger fan-outs, dynamic workflows move the orchestration loop into a script with hard caps of 16 concurrent agents and 1,000 agents per run. The docs are explicit about the routing lever: "Every agent in a workflow uses your session's model unless the script routes a stage to a different one." A scout stage on Haiku, an implementation stage on Sonnet, and a synthesis stage on the session's Fable 5 is the natural shape.
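The staged shape described above can be sketched in plain Python to show how the two caps and the per-stage routing interact. This is not the Claude Code workflow API: `run_agent` is a hypothetical stand-in that only echoes its input, the stage names and the `"session"` label are invented for illustration, and only the 16 and 1,000 limits come from the text.

```python
import asyncio

MAX_CONCURRENT = 16   # concurrency cap quoted above
MAX_AGENTS = 1000     # per-run agent cap quoted above

# Hypothetical stage-to-model routing; "session" stands for the session's model.
STAGE_MODELS = {"scout": "haiku", "implement": "sonnet", "synthesize": "session"}


async def run_agent(model, task):
    """Hypothetical stand-in for launching one agent; replace with a real call."""
    await asyncio.sleep(0)
    return f"[{model}] {task}"


async def run_stage(stage, tasks, limiter, budget):
    """Run every task in a stage on that stage's model, respecting both caps."""
    if budget["used"] + len(tasks) > MAX_AGENTS:
        raise RuntimeError(f"stage {stage!r} would exceed {MAX_AGENTS} agents")
    budget["used"] += len(tasks)
    model = STAGE_MODELS[stage]

    async def one(task):
        async with limiter:  # at most MAX_CONCURRENT agents in flight
            return await run_agent(model, task)

    return await asyncio.gather(*(one(task) for task in tasks))


async def main(files):
    limiter = asyncio.Semaphore(MAX_CONCURRENT)
    budget = {"used": 0}
    findings = await run_stage("scout", [f"map {f}" for f in files], limiter, budget)
    patches = await run_stage("implement", findings, limiter, budget)
    report = await run_stage("synthesize", ["; ".join(patches)], limiter, budget)
    return report[0]


if __name__ == "__main__":
    print(asyncio.run(main(["a.py", "b.py"])))
```

The design point is that the expensive model appears once, at synthesis, while the stages that scale with the number of files run on the cheaper tiers.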
Three operating rules from the official guidance worth adopting:
For the broader taxonomy of these structures, see our seven AI agent orchestration patterns breakdown.
Fable 5 ships with safety classifiers covering offensive cybersecurity, biology and life sciences, and reasoning extraction. Per the introduction doc, a declined request returns stop_reason: "refusal" as an HTTP 200, and the launch post reports classifiers trigger, on average, in under 5% of sessions. For an orchestrator this matters twice:
reasoning_extraction category and elevate fallbacks. Orchestrator templates that ask workers to "show your full reasoning" are a common offender.
The billing is forgiving: pre-output refusals are not billed, and the fallback credit refunds the prompt-cache cost of switching models on retry, with the beta fallbacks parameter handling retries server-side. Full mechanics in Fable 5's safeguards and refusal architecture.
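Because a refusal arrives as a normal HTTP 200 response, retry logic that only reacts to exceptions will not catch it; the code has to inspect `stop_reason`. Below is a minimal client-side sketch of that check. It is an alternative to the server-side fallbacks parameter, not a description of it: `Reply`, `send`, and `complete_with_fallback` are hypothetical names, and the model strings are labels rather than real model IDs.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Reply:
    """Hypothetical minimal view of a model response."""
    model: str
    stop_reason: str
    text: str


def complete_with_fallback(
    send: Callable[[str, str], Reply],
    prompt: str,
    models: Sequence[str],
) -> Reply:
    """Try each model in order, moving on only when the reply is a refusal."""
    if not models:
        raise ValueError("models must not be empty")
    reply = None
    for model in models:
        reply = send(model, prompt)
        if reply.stop_reason != "refusal":
            return reply
    return reply  # every model refused; surface the last refusal to the caller


if __name__ == "__main__":
    def fake_send(model: str, prompt: str) -> Reply:
        # Stand-in transport: the first model always refuses.
        if model == "fable":
            return Reply(model, "refusal", "")
        return Reply(model, "end_turn", f"handled: {prompt}")

    print(complete_with_fallback(fake_send, "audit the auth module", ["fable", "opus"]))
```

Injecting `send` keeps the fallback order testable without network calls; in production it would wrap whatever client you already use.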
One more operational note: Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans only through June 22, 2026, then moves behind usage credits, per the launch post. If your orchestrator runs on a subscription seat today, the math above becomes your real bill in under two weeks - see our June 22 deadline explainer.
Honest tradeoffs, because the orchestrator pattern is not a free lunch:
/workflows lets you stop a run before it gets expensive.
A tiered multi-agent setup where Claude Fable 5 handles planning, decomposition, dispatch, and synthesis, while cheaper models (Sonnet 4.6 for implementation, Haiku 4.5 for search and classification) execute the work items. You pay the $10/$50 frontier rate only on the decisions that compound.
In the worked example above, the mixed fleet costs $6.10 versus $14.50 for all-Fable - about 58% less. With Haiku workers it drops to $3.70, about 74% less. Prompt caching on shared worker prefixes cuts the mixed-fleet total further, to roughly $4.94.
Set the model field in each subagent's YAML frontmatter to haiku, sonnet, opus, fable, a full model ID, or inherit. The default is inherit, so an unconfigured fleet under a Fable 5 session silently bills everything at Fable rates.
Opus 4.8 at $5/$25 is the right call when zero data retention is required (Fable 5 carries mandatory 30-day retention), when your workload would hit Fable's classifiers, or when run plans are simple enough that Fable's documented delegation improvements do not change outcomes. The premium is small in absolute dollars, but it should still buy something.
Not if the refusal happens before output generation - per Anthropic's docs you are not billed for pre-output refusals, and the fallback credit refunds the prompt-cache cost of retrying on another model. Configure the server-side fallbacks parameter (beta) or SDK middleware so refusals retry on Opus 4.8 automatically.