Claude Sonnet 5 Developer Guide: Migration, API, and Effort Levels

TL;DR
Everything developers need to migrate from Sonnet 4.6 to Sonnet 5 - three breaking API changes, the new effort parameter, tokenizer impact, and when to use each effort level. Verified against Anthropic's official docs on July 4, 2026.
Official Sources
| Source | Description |
|---|---|
| Introducing Claude Sonnet 5 | Anthropic official announcement (June 30, 2026) |
| Sonnet 5 Migration Guide | Official migration documentation |
| What's New in Sonnet 5 | Feature changelog |
| Effort Parameter Docs | Reasoning effort configuration |
| Prompting Claude Sonnet 5 | Prompting best practices |
| Claude Pricing | Current pricing for all Claude plans |
Claude Sonnet 5 shipped on June 30, 2026 as Anthropic's most agentic Sonnet model yet. It's a drop-in replacement for Sonnet 4.6 - but "drop-in" doesn't mean zero changes. There are three breaking API changes: two return a 400 error if you don't handle them, and the third silently changes default behavior. On top of that, a new tokenizer quietly increases your token counts by up to 35%.
This guide covers what breaks, what to change, and how to use the new effort parameter to control reasoning depth.
Last updated: July 4, 2026
Quick Migration Checklist
Before updating your model ID, verify these four items:
- Update model ID: claude-sonnet-4-6 to claude-sonnet-5
- Remove sampling parameters: Any temperature, top_p, or top_k set to non-default values returns a 400 error
- Remove manual extended thinking: thinking: {type: "enabled", budget_tokens: N} returns a 400 error - use the new effort parameter instead
- Recount tokens: The new tokenizer maps the same text to roughly 1.0-1.35x as many tokens
If your current code is simple (no sampling params, no extended thinking), the migration is just changing the model ID. Otherwise, read on.
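The checklist above can be sketched as a small pre-flight check. This is a hypothetical helper, not part of any SDK: the function name is an assumption, the request shape follows the examples in this guide, and it conservatively flags any sampling parameter that is set at all, since the default values are not listed here.

```javascript
// Hypothetical pre-flight check for a Messages API request object.
// Field names follow the request shapes shown in this guide.
const SAMPLING_PARAMS = ["temperature", "top_p", "top_k"];

function findBreakingParams(request) {
  const issues = [];
  if (request.model !== "claude-sonnet-5") {
    issues.push(`model is "${request.model}", expected "claude-sonnet-5"`);
  }
  // Conservative: flag every sampling param that is present.
  for (const name of SAMPLING_PARAMS) {
    if (request[name] !== undefined) {
      issues.push(`${name} is set; remove it unless it is the default value`);
    }
  }
  const thinking = request.thinking;
  if (thinking && thinking.type === "enabled" && thinking.budget_tokens !== undefined) {
    issues.push("thinking.budget_tokens is set; use an effort level instead");
  }
  return issues;
}

const legacyRequest = {
  model: "claude-sonnet-4-6",
  temperature: 0.7,
  thinking: { type: "enabled", budget_tokens: 10000 },
  max_tokens: 16000,
  messages: [{ role: "user", content: "..." }],
};
console.log(findBreakingParams(legacyRequest));
```

Running a check like this in CI against stored request fixtures is one way to catch 400-prone requests before switching the model ID. It does not cover the tokenizer change, which needs a real token recount.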
Breaking Change 1: Sampling Parameters Removed
What changed: Requests that set temperature, top_p, or top_k to non-default values return a 400 error.
Why: Anthropic's position is that sampling parameters introduce unpredictable output quality and are incompatible with adaptive thinking. Sonnet 5's reasoning process adjusts dynamically based on effort level, so manual sampling control isn't supported.
Migration:
// Before (Sonnet 4.6)
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  temperature: 0.7,
  top_p: 0.9,
  messages: [{ role: "user", content: "..." }]
});
// After (Sonnet 5) - remove sampling params
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  messages: [{ role: "user", content: "..." }]
});
If you were using a low temperature to get more deterministic outputs, the replacement is a low effort level, which produces more consistent results with less exploration.
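If requests are built in many places, one option is to strip the sampling fields at a single choke point before the call. A minimal sketch, assuming requests are plain objects shaped like the examples above; stripSamplingParams is a hypothetical name, not an SDK function.

```javascript
// Hypothetical helper: returns a copy of the request without sampling params.
// The original object is left untouched.
function stripSamplingParams(request) {
  const { temperature, top_p, top_k, ...rest } = request;
  return rest;
}

const before = {
  model: "claude-sonnet-5",
  temperature: 0.7,
  top_p: 0.9,
  messages: [{ role: "user", content: "..." }],
};
const after = stripSamplingParams(before);
console.log(Object.keys(after)); // sampling keys are gone, everything else is kept
```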
Breaking Change 2: Manual Extended Thinking Removed
What changed: Setting thinking: {type: "enabled", budget_tokens: N} returns a 400 error.
Why: Sonnet 5 uses adaptive thinking that automatically adjusts based on task complexity. Instead of specifying a fixed token budget for reasoning, you set an effort level and the model allocates thinking tokens as needed.
Migration:
// Before (Sonnet 4.6)
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  thinking: { type: "enabled", budget_tokens: 10000 },
  messages: [{ role: "user", content: "..." }]
});
// After (Sonnet 5) - use effort parameter
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  thinking: { type: "enabled", effort: "high" },
  messages: [{ role: "user", content: "..." }]
});
The effort values are: low, medium, high (default), max, and xhigh.
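For codebases with many stored thinking budgets, the conversion can be automated. The sketch below is an assumption-heavy illustration: the thresholds that map a budget_tokens value to an effort level are made up for the example and are not an official mapping, and both function names are hypothetical. The output shape follows the "After" example above.

```javascript
// Illustrative thresholds only: NOT an official budget_tokens-to-effort mapping.
function effortFromBudget(budgetTokens) {
  if (budgetTokens < 2000) return "low";
  if (budgetTokens < 8000) return "medium";
  return "high";
}

// Replaces a manual thinking budget with an effort level; returns a new object.
function migrateThinking(request) {
  const thinking = request.thinking;
  if (!thinking || thinking.type !== "enabled" || thinking.budget_tokens === undefined) {
    return request; // nothing to migrate
  }
  return {
    ...request,
    thinking: { type: "enabled", effort: effortFromBudget(thinking.budget_tokens) },
  };
}

const migrated = migrateThinking({
  model: "claude-sonnet-5",
  thinking: { type: "enabled", budget_tokens: 10000 },
  max_tokens: 16000,
  messages: [{ role: "user", content: "..." }],
});
console.log(migrated.thinking); // effort is "high" under these example thresholds
```

Treat the thresholds as a starting point and tune them against your own quality and cost measurements.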
Breaking Change 3: Adaptive Thinking On By Default
What changed: On Sonnet 4.6, requests without a thinking field ran without thinking. On Sonnet 5, the same requests run with adaptive thinking at high effort by default.
Impact: Your existing prompts will use more tokens and potentially produce different outputs. This is usually an improvement, but it changes behavior.
To disable thinking entirely:
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  thinking: { type: "disabled" },
  messages: [{ role: "user", content: "..." }]
});
To match Sonnet 4.6 behavior more closely: Use medium effort, which Anthropic says is comparable to Sonnet 4.6 at high effort.
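Because the default changed silently, it may be safer to make the thinking mode explicit on every request rather than rely on the default. A minimal sketch, assuming the request shapes used in this guide; withExplicitThinking and its mode argument are hypothetical names.

```javascript
// Hypothetical helper: pins the thinking mode when a request does not set one.
// mode is "disabled" or one of the effort levels listed in this guide.
function withExplicitThinking(request, mode = "medium") {
  if (request.thinking !== undefined) return request; // respect explicit settings
  const thinking =
    mode === "disabled" ? { type: "disabled" } : { type: "enabled", effort: mode };
  return { ...request, thinking };
}

const base = { model: "claude-sonnet-5", messages: [{ role: "user", content: "..." }] };
console.log(withExplicitThinking(base).thinking);
console.log(withExplicitThinking(base, "disabled").thinking);
```

The default of "medium" here reflects the guidance above for staying close to Sonnet 4.6 behavior; pick whatever default suits your workload.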
The New Effort Parameter
Sonnet 5's key feature is selectable reasoning effort. Instead of controlling thinking with a token budget, you set a semantic effort level.
| Effort | Use Case | Cost | Default |
|---|---|---|---|
| low | Simple classification, quick lookups, high-volume tasks | Lowest | No |
| medium | Cost-saving step-down, comparable to Sonnet 4.6 at high | Low-Medium | No |
| high | Complex reasoning, coding, agentic tasks | Medium | Yes |
| max | Maximum quality for difficult problems | High | No |
| xhigh | Advanced coding, complex agentic work requiring extended exploration | Highest | No |
Using effort in the API:
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  thinking: { type: "enabled", effort: "xhigh" },
  max_tokens: 16000, // Leave headroom for thinking
  messages: [{ role: "user", content: "Debug this failing test..." }]
});
Important: At high, xhigh, or max effort, leave headroom in max_tokens so the model has room for thinking and tool calls.
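One way to apply the headroom advice consistently is to derive max_tokens from the expected visible output and the effort level. The multipliers below are placeholders chosen for illustration, not documented values, and the 128,000 cap assumes the 128K output limit quoted in the FAQ; measure real usage and adjust.

```javascript
// Placeholder multipliers: assumptions for illustration, not documented values.
const HEADROOM_MULTIPLIER = { low: 1.25, medium: 1.5, high: 2, max: 3, xhigh: 4 };
const MAX_OUTPUT_TOKENS = 128000; // assumes the 128K output limit quoted in this guide

function maxTokensWithHeadroom(expectedOutputTokens, effort = "high") {
  const multiplier = HEADROOM_MULTIPLIER[effort];
  if (multiplier === undefined) {
    throw new Error(`Unknown effort level: ${effort}`);
  }
  return Math.min(Math.ceil(expectedOutputTokens * multiplier), MAX_OUTPUT_TOKENS);
}

console.log(maxTokensWithHeadroom(4000, "high"));
console.log(maxTokensWithHeadroom(4000, "xhigh"));
```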
Effort Level Decision Guide
Use low when:
- Processing high-volume batch tasks
- Running simple classification
- Speed matters more than depth
- Tasks are well-scoped with clear outputs
Use medium when:
- Migrating from Sonnet 4.6 and want similar cost/quality
- Tasks are moderately complex but routine
- Balancing cost and capability
Use high (default) when:
- Running agents with tool use
- Coding tasks with multiple files
- Problems requiring chain-of-thought reasoning
- Quality matters more than speed
Use xhigh when:
- Debugging complex multi-file issues
- Agent sessions with many tool calls
- Problems that would benefit from extensive exploration
- You need maximum capability at the Sonnet tier
Use Opus 4.8 instead when:
- Running xhigh and costs are approaching Opus anyway
- Tasks require the absolute highest capability
- Agentic search or computer use (Opus is cheaper per success on these benchmarks)
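The decision guide above can be encoded as a routing function so that effort is chosen per task rather than hard-coded. This is a sketch under stated assumptions: the trait names and the precedence order are hypothetical, and it does not cover the max level or the switch to Opus 4.8.

```javascript
// Hypothetical task traits; names and precedence order are assumptions.
function chooseEffort(task) {
  if (task.needsExtensiveExploration) return "xhigh";
  if (task.usesTools || task.multiFile) return "high";
  if (task.highVolume || task.simpleClassification) return "low";
  if (task.routine) return "medium";
  return "high"; // the API default per this guide
}

console.log(chooseEffort({ simpleClassification: true }));
console.log(chooseEffort({ usesTools: true, highVolume: true }));
```

Depth-related traits are checked first here so that an agentic task is not downgraded just because it also runs at high volume; reverse the order if cost is the dominant constraint.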
Tokenizer Impact
Sonnet 5 uses an updated tokenizer. The same input text produces approximately 1.0-1.35x as many tokens as it did on Sonnet 4.6, depending on content type.
Practical impact:
- Prompts that fit in Sonnet 4.6's context may exceed limits in Sonnet 5
- Your per-request costs may increase even at the same per-token price
- The introductory pricing ($2/$10) partially offsets this - Anthropic calls it "cost-neutral"
Migration steps:
- Re-run token counts on your prompts using Anthropic's token counting API
- Check that long prompts still fit in the 1M context window
- Revisit any max_tokens limits sized close to expected output length
- Budget approximately 30% more tokens for the same workload
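Before re-running real counts, a worst-case estimate can flag which prompts are at risk. The sketch below assumes you already have Sonnet 4.6 token counts, applies the 1.35x upper bound from this section, and treats the context window as exactly 1,000,000 tokens shared by input and output. The function names are hypothetical, and the estimate is no substitute for the token counting API.

```javascript
// Worst-case tokenizer growth from this guide, kept as an integer percent
// so the arithmetic stays exact.
const WORST_CASE_PERCENT = 135;
const CONTEXT_WINDOW = 1000000; // assumes "1M" means exactly 1,000,000 tokens

function estimateSonnet5Tokens(sonnet46Tokens, percent = WORST_CASE_PERCENT) {
  return Math.ceil((sonnet46Tokens * percent) / 100);
}

// True when the estimated prompt plus the reserved output still fits.
function fitsContext(sonnet46PromptTokens, maxTokens) {
  return estimateSonnet5Tokens(sonnet46PromptTokens) + maxTokens <= CONTEXT_WINDOW;
}

console.log(estimateSonnet5Tokens(700000), fitsContext(700000, 16000));
console.log(estimateSonnet5Tokens(800000), fitsContext(800000, 16000));
```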
Benchmarks at a Glance
| Benchmark | Sonnet 5 | Sonnet 4.6 | Opus 4.8 |
|---|---|---|---|
| SWE-Bench Verified | 85.2% | 72.1% | 91.6% |
| SWE-Bench Pro | 63.2% | 58.1% | 73.5% |
| Terminal-Bench 2.1 | 80.4% | 67.0% | 74.6% |
| OSWorld-Verified | 81.2% | 78.5% | 87.3% |
Sonnet 5 at 80.4% on Terminal-Bench 2.1 beats Opus 4.8's 74.6% - the first time a Sonnet model has outperformed its Opus sibling on a major coding benchmark.
Pricing Summary
| Period | Input ($/MTok) | Output ($/MTok) |
|---|---|---|
| Now through Aug 31, 2026 | $2 | $10 |
| After Aug 31, 2026 | $3 | $15 |
The introductory pricing combined with the tokenizer change means:
- At $2/$10, Sonnet 5 is genuinely cheaper than Sonnet 4.6 for most workloads
- After August 31, costs will be roughly similar due to the ~30% token increase
- For high-effort reasoning tasks, costs can approach Opus 4.8 levels
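The interaction between price and token growth is easier to see as arithmetic. This sketch uses the two price tiers from the table and the ~30% budgeting figure from the tokenizer section; the workload numbers are invented for the example, and extra thinking tokens at higher effort levels are not modeled.

```javascript
// Prices in USD per million tokens, from the pricing table in this guide.
const PRICES = {
  intro: { input: 2, output: 10 },
  standard: { input: 3, output: 15 },
};
const TOKEN_GROWTH_PERCENT = 130; // the ~30% budgeting figure, as an integer percent

function scaleTokens(sonnet46Tokens, percent = TOKEN_GROWTH_PERCENT) {
  return Math.ceil((sonnet46Tokens * percent) / 100);
}

function costUsd(inputTokens, outputTokens, price) {
  return (inputTokens * price.input + outputTokens * price.output) / 1000000;
}

// Example workload measured in Sonnet 4.6 tokens (made-up numbers).
const inputTokens = scaleTokens(1000000);
const outputTokens = scaleTokens(200000);

console.log(costUsd(inputTokens, outputTokens, PRICES.intro).toFixed(2)); // 5.20
console.log(costUsd(inputTokens, outputTokens, PRICES.standard).toFixed(2)); // 7.80
```

Plug in your own token volumes, and your current Sonnet 4.6 bill for the same workload, to see where you land on each side of August 31.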
Complete Migration Example
Here's a full before/after showing all three breaking changes:
// Before: Sonnet 4.6 with all deprecated features
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  temperature: 0.3,
  thinking: { type: "enabled", budget_tokens: 8000 },
  max_tokens: 4000,
  messages: [{
    role: "user",
    content: "Review this PR and suggest improvements..."
  }]
});
// After: Sonnet 5 with equivalent intent
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  thinking: { type: "enabled", effort: "high" },
  max_tokens: 8000, // Increased for thinking headroom
  messages: [{
    role: "user",
    content: "Review this PR and suggest improvements..."
  }]
});
When to Stay on Sonnet 4.6
Sonnet 4.6 remains available. Consider staying on it if:
- You depend on sampling parameters (temperature, top_p, top_k) for your use case
- You need precise control over thinking token budgets
- You have a production system that's working and the migration isn't worth the risk
- You're running high-volume workloads and the tokenizer increase matters to your margins
Anthropic hasn't announced an EOL date for Sonnet 4.6 yet.
FAQ
What is the model ID for Claude Sonnet 5?
The model ID is claude-sonnet-5. Use this in API calls to specify the model. The previous model ID claude-sonnet-4-6 continues to work for Sonnet 4.6.
Does Claude Sonnet 5 support extended thinking?
Yes, but not manually. Sonnet 5 uses adaptive thinking controlled by the effort parameter (low, medium, high, max, xhigh). Setting a manual budget_tokens returns a 400 error. The model automatically allocates thinking tokens based on the effort level and task complexity.
What is the context window for Claude Sonnet 5?
Sonnet 5 has a 1M-token context window and 128K max output tokens. There is no long-context pricing premium - the same per-token rates apply regardless of context length.
How much more do prompts cost with the new tokenizer?
The same text maps to approximately 1.0-1.35x as many tokens with Sonnet 5's tokenizer as with Sonnet 4.6's. Anthropic set introductory pricing to be "cost-neutral" overall, but your actual cost change depends on your content type and effort level.
Is Claude Sonnet 5 available in Claude Code?
Yes. Sonnet 5 is now the default model in Claude Code with a native 1M-token context window. Interactive Claude Code in the terminal uses your subscription limits; programmatic usage (Agent SDK, claude -p) draws from the API credit pool.
When does the introductory pricing end?
August 31, 2026. After that date, pricing moves from $2/$10 per MTok to $3/$15 per MTok.
Should I use Sonnet 5 or Opus 4.8?
Use Sonnet 5 at low/medium effort for high-volume, well-scoped tasks where cost matters. Use Opus 4.8 for complex, open-ended tasks or when you need maximum capability. At xhigh effort, Sonnet 5 costs approach Opus 4.8 while performing slightly worse on several benchmarks - at that point, Opus is often the better choice.
Can I disable thinking in Sonnet 5?
Yes. Pass thinking: { type: "disabled" } to turn off adaptive thinking entirely. This produces simpler, faster responses but loses the reasoning capability.
Sources
- Introducing Claude Sonnet 5 - verified July 4, 2026
- Sonnet 5 Migration Guide - verified July 4, 2026
- What's New in Sonnet 5 - verified July 4, 2026
- Effort Parameter Documentation - verified July 4, 2026
- Prompting Claude Sonnet 5 - verified July 4, 2026
- Claude Pricing - verified July 4, 2026
- Claude Sonnet 5 Benchmarks - June 30, 2026