Claude Sonnet 5 Developer Guide: Migration, API, and Effort Levels

Official Sources

Source	Description
Introducing Claude Sonnet 5	Anthropic official announcement (June 30, 2026)
Sonnet 5 Migration Guide	Official migration documentation
What's New in Sonnet 5	Feature changelog
Effort Parameter Docs	Reasoning effort configuration
Prompting Claude Sonnet 5	Prompting best practices
Claude Pricing	Current pricing for all Claude plans

Claude Sonnet 5 shipped on June 30, 2026 as Anthropic's most agentic Sonnet model yet. It's positioned as a drop-in replacement for Sonnet 4.6 - but "drop-in" doesn't mean zero changes. There are three breaking API changes: two return a 400 error if you don't handle them, and the third silently changes default behavior. On top of that, a new tokenizer quietly increases your token counts by up to 35%.

This guide covers what breaks, what to change, and how to use the new effort parameter to control reasoning depth.

Last updated: July 4, 2026

Quick Migration Checklist

Work through these four items when you migrate:

Update model ID: claude-sonnet-4-6 to claude-sonnet-5
Remove sampling parameters: Any temperature, top_p, or top_k set to non-default values returns a 400 error
Remove manual extended thinking: thinking: {type: "enabled", budget_tokens: N} returns a 400 error - use the new effort parameter instead
Recount tokens: The new tokenizer maps the same text to ~1.0-1.35x more tokens

If your current code is simple (no sampling params, no extended thinking), the only required code change is the model ID, though adaptive thinking now runs by default (see Breaking Change 3). Otherwise, read on.

Breaking Change 1: Sampling Parameters Removed

What changed: Requests that set temperature, top_p, or top_k to non-default values return a 400 error.

Why: Anthropic's position is that sampling parameters introduce unpredictable output quality and are incompatible with adaptive thinking. Sonnet 5's reasoning process adjusts dynamically based on effort level, so manual sampling control isn't supported.

Migration:

// Before (Sonnet 4.6)
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024, // required by the Messages API
  temperature: 0.7,
  top_p: 0.9,
  messages: [{ role: "user", content: "..." }]
});

// After (Sonnet 5) - remove sampling params
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "..." }]
});

If you were using a low temperature to get more deterministic outputs, the closest replacement is the low effort level, which produces more consistent results with less exploration.
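One way to apply this change across a codebase is to strip the sampling fields from request objects before they reach the SDK. The sketch below is a hypothetical helper (`stripSamplingParams` is not part of the Anthropic SDK); it works on a plain request object and makes no network calls.

```javascript
// Hypothetical helper, not part of the Anthropic SDK.
// Returns a copy of the request without the sampling fields that this
// guide says Sonnet 5 rejects when set to non-default values.
const SAMPLING_KEYS = ["temperature", "top_p", "top_k"];

function stripSamplingParams(params) {
  const cleaned = {};
  const removed = [];
  for (const [key, value] of Object.entries(params)) {
    if (SAMPLING_KEYS.includes(key)) {
      removed.push(key); // remember what was dropped, for logging or review
    } else {
      cleaned[key] = value;
    }
  }
  return { cleaned, removed };
}

const legacyRequest = {
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  temperature: 0.7,
  top_p: 0.9,
  messages: [{ role: "user", content: "..." }],
};

const { cleaned, removed } = stripSamplingParams(legacyRequest);
const sonnet5Request = { ...cleaned, model: "claude-sonnet-5" };
console.log(removed); // [ 'temperature', 'top_p' ]
```

Returning the list of removed keys makes it easy to log which call sites relied on sampling control, since those are the ones whose output may shift after migration.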

Breaking Change 2: Manual Extended Thinking Removed

What changed: Setting thinking: {type: "enabled", budget_tokens: N} returns a 400 error.

Why: Sonnet 5 uses adaptive thinking that automatically adjusts based on task complexity. Instead of specifying a fixed token budget for reasoning, you set an effort level and the model allocates thinking tokens as needed.

Migration:

// Before (Sonnet 4.6)
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 16000, // required; must be larger than budget_tokens
  thinking: { type: "enabled", budget_tokens: 10000 },
  messages: [{ role: "user", content: "..." }]
});

// After (Sonnet 5) - use effort parameter
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  max_tokens: 16000, // leave headroom for thinking
  thinking: { type: "enabled", effort: "high" },
  messages: [{ role: "user", content: "..." }]
});

The effort values are: low, medium, high (default), max, and xhigh.
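For codebases with many different `budget_tokens` values, a small lookup can translate each budget into an effort level during migration. The thresholds below are illustrative assumptions (this guide gives no mapping from token budgets to effort levels), and `budgetToEffort` and `migrateThinking` are hypothetical names. The returned `thinking` object follows the request shape used in this guide.

```javascript
// Illustrative thresholds only: tune these against your own evaluations.
function budgetToEffort(budgetTokens) {
  if (budgetTokens <= 2000) return "low";
  if (budgetTokens <= 8000) return "medium";
  return "high";
}

// Rewrites a legacy thinking config into the effort-based shape.
// Requests without a manual budget are returned unchanged.
function migrateThinking(params) {
  const { thinking, ...rest } = params;
  if (!thinking || thinking.type !== "enabled" || thinking.budget_tokens === undefined) {
    return params; // nothing to migrate
  }
  return {
    ...rest,
    thinking: { type: "enabled", effort: budgetToEffort(thinking.budget_tokens) },
  };
}

const migrated = migrateThinking({
  model: "claude-sonnet-5",
  max_tokens: 16000,
  thinking: { type: "enabled", budget_tokens: 10000 },
  messages: [{ role: "user", content: "..." }],
});
console.log(migrated.thinking); // { type: 'enabled', effort: 'high' }
```

A fixed mapping like this is only a starting point: effort is a semantic setting rather than a token cap, so actual thinking-token usage can differ from the old budget.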

Breaking Change 3: Adaptive Thinking On By Default

What changed: On Sonnet 4.6, requests without a thinking field ran without thinking. On Sonnet 5, the same requests run with adaptive thinking at high effort by default.

Impact: Your existing prompts will use more tokens and potentially produce different outputs. This is usually an improvement, but it changes behavior.

To disable thinking entirely:

const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  max_tokens: 1024, // required by the Messages API
  thinking: { type: "disabled" },
  messages: [{ role: "user", content: "..." }]
});

To match Sonnet 4.6 behavior more closely: Use medium effort, which Anthropic says is comparable to Sonnet 4.6 at high effort.
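Because this change does not raise an error, one defensive option is to set the `thinking` field explicitly on every request so behavior never depends on the model's default. A minimal sketch, where `withExplicitThinking` is a hypothetical helper and the `thinking` shapes are the ones used in this guide:

```javascript
// Hypothetical helper: fills in `thinking` only when the caller left it out,
// so requests written for Sonnet 4.6 keep a predictable behavior.
function withExplicitThinking(params, fallback = { type: "disabled" }) {
  if (params.thinking !== undefined) return params; // caller already decided
  return { ...params, thinking: fallback };
}

const base = {
  model: "claude-sonnet-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "..." }],
};

// Closest to the old default of running without thinking.
const noThinking = withExplicitThinking(base);

// The step-down option this guide describes as comparable to Sonnet 4.6 at high effort.
const mediumEffort = withExplicitThinking(base, { type: "enabled", effort: "medium" });
```

Which fallback to choose depends on what the old requests relied on: disabling thinking preserves latency and token usage, while medium effort trades some of both for reasoning.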

The New Effort Parameter

Sonnet 5's key feature is selectable reasoning effort. Instead of controlling thinking with a token budget, you set a semantic effort level.

Effort	Use Case	Cost	Default
`low`	Simple classification, quick lookups, high-volume tasks	Lowest	No
`medium`	Cost-saving step-down, comparable to Sonnet 4.6 at high	Low-Medium	No
`high`	Complex reasoning, coding, agentic tasks	Medium	Yes
`max`	Maximum quality for difficult problems	High	No
`xhigh`	Advanced coding, complex agentic work requiring extended exploration	Highest	No

Using effort in the API:

const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  thinking: { type: "enabled", effort: "xhigh" },
  max_tokens: 16000, // Leave headroom for thinking
  messages: [{ role: "user", content: "Debug this failing test..." }]
});

Important: At high, xhigh, or max effort, leave headroom in max_tokens so the model has room for thinking and tool calls.

Effort Level Decision Guide

Use low when:

Processing high-volume batch tasks
Running simple classification
Speed matters more than depth
Tasks are well-scoped with clear outputs

Use medium when:

Migrating from Sonnet 4.6 and want similar cost/quality
Tasks are moderately complex but routine
Balancing cost and capability

Use high (default) when:

Running agents with tool use
Coding tasks with multiple files
Problems requiring chain-of-thought reasoning
Quality matters more than speed

Use xhigh when:

Debugging complex multi-file issues
Agent sessions with many tool calls
Problems that would benefit from extensive exploration
You need maximum capability at the Sonnet tier

Use Opus 4.8 instead when:

Running xhigh and costs are approaching Opus anyway
Tasks require the absolute highest capability
Agentic search or computer use (Opus is cheaper per success on these benchmarks)
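The decision guide above can be sketched as a small routing function. The task descriptor and its field names (`highVolume`, `usesTools`, `multiFile`, `needsExploration`) are assumptions made for this example, and the function only chooses among Sonnet 5 effort levels; it does not model the switch to Opus 4.8.

```javascript
// Hypothetical task descriptor; the field names are assumptions for this sketch.
function pickEffort({
  highVolume = false,
  usesTools = false,
  multiFile = false,
  needsExploration = false,
} = {}) {
  if (needsExploration) return "xhigh";      // complex debugging, long agent sessions
  if (usesTools || multiFile) return "high"; // agents and multi-file coding (the default)
  if (highVolume) return "low";              // batch classification, quick lookups
  return "medium";                           // routine, moderately complex work
}

console.log(pickEffort({ highVolume: true }));                        // low
console.log(pickEffort({ usesTools: true }));                         // high
console.log(pickEffort({ needsExploration: true, usesTools: true })); // xhigh
console.log(pickEffort());                                            // medium
```

The checks run from most to least demanding, so a task that is both high-volume and tool-using gets high effort; reorder them if cost should win over depth in your workload.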

Tokenizer Impact

Sonnet 5 uses an updated tokenizer. The same input text produces approximately 1.0-1.35x more tokens than Sonnet 4.6, depending on content type.

Practical impact:

Prompts that fit in Sonnet 4.6's context may exceed limits in Sonnet 5
Your per-request costs may increase even at the same per-token price
The introductory pricing ($2/$10) partially offsets this - Anthropic calls it "cost-neutral"

Migration steps:

Re-run token counts on your prompts using Anthropic's token counting API
Check that long prompts still fit in the 1M context window
Revisit any max_tokens limits sized close to expected output length
Budget approximately 30% more tokens for the same workload
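As a rough pre-check before re-running exact counts, the worst case can be estimated from the figures in this section. The sketch below assumes the 1.35x upper bound and the 1M-token window stated in this guide; `worstCaseTokens` and `fitsInContext` are hypothetical helper names, and real counts should still come from the token counting API.

```javascript
const CONTEXT_WINDOW = 1_000_000;  // 1M-token window stated in this guide
const MAX_INFLATION_PERCENT = 135; // upper end of the 1.0-1.35x range

// Worst-case Sonnet 5 token count for a prompt measured with the Sonnet 4.6
// tokenizer. Integer math avoids floating-point rounding surprises.
function worstCaseTokens(sonnet46Tokens) {
  return Math.ceil((sonnet46Tokens * MAX_INFLATION_PERCENT) / 100);
}

// True if the prompt still fits even at the worst-case inflation,
// optionally reserving room for the response.
function fitsInContext(sonnet46Tokens, reservedForOutput = 0) {
  return worstCaseTokens(sonnet46Tokens) + reservedForOutput <= CONTEXT_WINDOW;
}

console.log(worstCaseTokens(200_000));       // 270000
console.log(fitsInContext(800_000));         // false
console.log(fitsInContext(700_000, 50_000)); // true
```

A prompt that fails this check is not guaranteed to overflow, since many content types inflate by less than 35%, but it is worth recounting first.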

Benchmarks at a Glance

Benchmark	Sonnet 5	Sonnet 4.6	Opus 4.8
SWE-Bench Verified	85.2%	72.1%	91.6%
SWE-Bench Pro	63.2%	58.1%	73.5%
Terminal-Bench 2.1	80.4%	67.0%	74.6%
OSWorld-Verified	81.2%	78.5%	87.3%

Sonnet 5 at 80.4% on Terminal-Bench 2.1 beats Opus 4.8's 74.6% - the first time a Sonnet model has outperformed its Opus sibling on a major coding benchmark.

Pricing Summary

Period	Input ($/MTok)	Output ($/MTok)
Now through Aug 31, 2026	$2	$10
After Aug 31, 2026	$3	$15

The introductory pricing combined with the tokenizer change means:

At $2/$10, Sonnet 5 is genuinely cheaper than Sonnet 4.6 for most workloads
After August 31, the per-token rate rises by 50% (from $2/$10 to $3/$15) while the ~30% token increase remains, so the tokenizer overhead is no longer offset by the discount
For high-effort reasoning tasks, costs can approach Opus 4.8 levels
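To see how the two pricing periods and the tokenizer change combine, here is a worked estimate using only the figures in this guide. `estimateCost` is a hypothetical helper, the 1.3 default comes from the ~30% figure above, and the calculation ignores any extra thinking tokens that higher effort levels add.

```javascript
const PRICING = {
  intro: { input: 2, output: 10 },    // $/MTok through Aug 31, 2026
  standard: { input: 3, output: 15 }, // $/MTok after Aug 31, 2026
};

// Token counts are as measured with the Sonnet 4.6 tokenizer; `inflation`
// scales them to Sonnet 5 (the guide gives a 1.0-1.35x range).
function estimateCost(inputTokens, outputTokens, period, inflation = 1.3) {
  const { input, output } = PRICING[period];
  const dollars =
    (inputTokens * inflation * input + outputTokens * inflation * output) / 1_000_000;
  return Math.round(dollars * 100) / 100; // round to cents
}

// Example workload: 1M input tokens and 200K output tokens.
console.log(estimateCost(1_000_000, 200_000, "intro"));    // 5.2
console.log(estimateCost(1_000_000, 200_000, "standard")); // 7.8
```

For this workload the same requests cost $5.20 during the introductory period and $7.80 afterwards, a 50% increase that comes entirely from the rate change.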

Complete Migration Example

Here's a full before/after showing all three breaking changes:

// Before: Sonnet 4.6 with all removed features
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  temperature: 0.3,
  thinking: { type: "enabled", budget_tokens: 2000 }, // budget must stay below max_tokens
  max_tokens: 4000,
  messages: [{
    role: "user",
    content: "Review this PR and suggest improvements..."
  }]
});

// After: Sonnet 5 with equivalent intent
const response = await anthropic.messages.create({
  model: "claude-sonnet-5",
  thinking: { type: "enabled", effort: "high" },
  max_tokens: 8000, // Increased for thinking headroom
  messages: [{
    role: "user",
    content: "Review this PR and suggest improvements..."
  }]
});

When to Stay on Sonnet 4.6

Sonnet 4.6 remains available. Consider staying on it if:

You depend on sampling parameters (temperature, top_p, top_k) for your use case
You need precise control over thinking token budgets
You have a production system that's working and the migration isn't worth the risk
You're running high-volume workloads and the tokenizer increase matters to your margins

Anthropic hasn't announced an EOL date for Sonnet 4.6 yet.

FAQ

What is the model ID for Claude Sonnet 5?

The model ID is claude-sonnet-5. Use this in API calls to specify the model. The previous model ID claude-sonnet-4-6 continues to work for Sonnet 4.6.

Does Claude Sonnet 5 support extended thinking?

Yes, but not manually. Sonnet 5 uses adaptive thinking controlled by the effort parameter (low, medium, high, max, xhigh). Setting a manual budget_tokens returns a 400 error. The model automatically allocates thinking tokens based on the effort level and task complexity.

What is the context window for Claude Sonnet 5?

Sonnet 5 has a 1M-token context window and 128K max output tokens. There is no long-context pricing premium - the same per-token rates apply regardless of context length.

How much more do prompts cost with the new tokenizer?

The same text maps to approximately 1.0-1.35x more tokens with Sonnet 5's tokenizer compared to Sonnet 4.6. Anthropic set introductory pricing to be "cost-neutral" overall, but your actual cost change depends on your content type and effort level.

Is Claude Sonnet 5 available in Claude Code?

Yes. Sonnet 5 is now the default model in Claude Code with a native 1M-token context window. Interactive Claude Code in the terminal uses your subscription limits; programmatic usage (Agent SDK, claude -p) draws from the API credit pool.

When does the introductory pricing end?

August 31, 2026. After that date, pricing moves from $2/$10 per MTok to $3/$15 per MTok.

Should I use Sonnet 5 or Opus 4.8?

Use Sonnet 5 at low/medium effort for high-volume, well-scoped tasks where cost matters. Use Opus 4.8 for complex, open-ended tasks or when you need maximum capability. At xhigh effort, Sonnet 5 costs approach Opus 4.8 while performing slightly worse on several benchmarks - at that point, Opus is often the better choice.

Can I disable thinking in Sonnet 5?

Yes. Pass thinking: { type: "disabled" } to turn off adaptive thinking entirely. This produces simpler, faster responses but loses the reasoning capability.

Sources

Introducing Claude Sonnet 5 - verified July 4, 2026
Sonnet 5 Migration Guide - verified July 4, 2026
What's New in Sonnet 5 - verified July 4, 2026
Effort Parameter Documentation - verified July 4, 2026
Prompting Claude Sonnet 5 - verified July 4, 2026
Claude Pricing - verified July 4, 2026
Claude Sonnet 5 Benchmarks - June 30, 2026
