TL;DR
GPT-5.5 vs Claude Opus 4.8: both cost $5 per million input tokens, so the workhorse-tier decision comes down to output pricing, benchmarks, and tooling.
Best for
Developers comparing real tool tradeoffs before choosing a stack.
Covers
Verdict, tradeoffs, pricing signals, workflow fit, and related alternatives.
Last updated: June 11, 2026
When Anthropic shipped Claude Fable 5 at $10 input / $50 output per million tokens, the flagship pricing conversation moved up a tier - and quietly left a more interesting fight behind. GPT-5.5 and Claude Opus 4.8 now sit at exactly the same input price: $5.00 per million tokens (verified June 11, 2026 on OpenAI's pricing page and Anthropic's pricing page). These are the models most teams will actually run all day. We covered the flagship matchup in our Fable 5 vs GPT-5.5 benchmark comparison - this post is about the workhorse tier, where the prices finally meet and the decision gets genuinely close. Every number below was checked against the live source pages on June 11, 2026.
A quick framing note, because the naming is confusing. GPT-5.5 is OpenAI's mainline frontier model - the model page calls it "a new class of intelligence for coding and professional work," with GPT-5.5-pro above it at $30/$180. Claude Opus 4.8 is Anthropic's mainline Opus model, with Fable 5 above it at $10/$50.
Both vendors now run a two-tier structure: an expensive ceiling model and a $5-input workhorse. Anthropic's own models overview tells developers who are unsure which model to pick to "consider starting with Claude Opus 4.8," reserving Fable 5 for workloads that need the highest available capability. OpenAI's equivalent default for complex work is GPT-5.5 itself. That makes this the fair fight: the model each vendor expects you to use for most serious work, at the same input price.
All pricing verified June 11, 2026 against developers.openai.com/api/docs/pricing and platform.claude.com/docs/en/about-claude/pricing. Benchmark figures are from Vellum's breakdown of Anthropic's launch materials.
| | GPT-5.5 | Claude Opus 4.8 |
|---|---|---|
| Input (per MTok) | $5.00 | $5.00 |
| Output (per MTok) | $30.00 | $25.00 |
| Cached input read | $0.50 | $0.50 |
| Cache writes | No separate line item on the pricing page | $6.25 (5-minute) / $10 (1-hour) |
| Batch input / output | $2.50 / $15.00 | $2.50 / $12.50 |
| Context window | 1,050,000 tokens | 1M tokens (200K on Microsoft Foundry) |
| Max output | 128K tokens | 128K tokens |
| Knowledge cutoff | December 1, 2025 | January 2026 (reliable cutoff) |
| SWE-Bench Pro | 58.6% | 69.2% |
| FrontierCode Diamond | 5.7% | 13.4% |
| GDP.pdf (vision) | 24.9% | 22.5% |
| Speed premium option | Priority tier at $12.50/$75.00 (2.5x base rates) | Fast mode at $10/$50, about 2.5x faster (research preview) |
The spec convergence is striking. Both models offer a 1M-class context window and an identical 128K max output. Knowledge cutoffs are a month apart. On raw capacity, these models are interchangeable. The differences live in three places: output price, benchmark profile, and ecosystem.
The headline is simple: input is a tie, and Claude is cheaper on output - $25 versus $30 per million tokens, verified June 11, 2026. That is a 17% discount on the token type that dominates agentic workloads. A team generating 50 million output tokens a month pays $1,250 on Opus 4.8 versus $1,500 on GPT-5.5. The same ratio holds in batch mode, where both vendors apply a 50% discount: $12.50 versus $15.00 per million output tokens.
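The arithmetic above is easy to rerun for your own volumes. Here is a minimal sketch in Python using the list prices from the table. `Rates` and `monthly_cost` are illustrative names, not part of either vendor's SDK, and the batch discount is modeled as a flat 50% on both token types, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """List prices in USD per million tokens (MTok)."""
    input_per_mtok: float
    output_per_mtok: float


# Figures copied from the comparison table in this post.
GPT_5_5 = Rates(input_per_mtok=5.00, output_per_mtok=30.00)
OPUS_4_8 = Rates(input_per_mtok=5.00, output_per_mtok=25.00)


def monthly_cost(rates: Rates, input_mtok: float, output_mtok: float,
                 batch: bool = False) -> float:
    """Monthly bill in USD for the given token volumes (in millions)."""
    cost = input_mtok * rates.input_per_mtok + output_mtok * rates.output_per_mtok
    # Assumption: batch mode halves both input and output rates.
    return cost * 0.5 if batch else cost


if __name__ == "__main__":
    for name, rates in (("GPT-5.5", GPT_5_5), ("Opus 4.8", OPUS_4_8)):
        standard = monthly_cost(rates, input_mtok=0, output_mtok=50)
        batched = monthly_cost(rates, input_mtok=0, output_mtok=50, batch=True)
        print(f"{name}: ${standard:,.2f} standard, ${batched:,.2f} batch")
```

Plug in your own input volume as well: because input is priced identically, the absolute dollar gap depends only on output volume, while the percentage gap shrinks as input makes up more of the bill.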
Caching is closer to a wash than it looks. Both vendors charge $0.50 per million tokens for cached input reads - a 90% discount off base input. The difference is on the write side: Anthropic bills cache writes at $6.25 per MTok (5-minute) or $10 (1-hour), while OpenAI's pricing page lists a single cached-input rate with no separate write charge. High cache churn adds real cost on the Claude side; stable system prompts read many times make it disappear into the noise.
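To see how much cache churn matters, the sketch below compares the cost of populating a shared prefix once and then reading it several times. It rests on two assumptions that are not stated on either pricing page as quoted here: that a Claude cache write is billed at the write rate in place of the base input rate, and that OpenAI bills the first, uncached pass over the prefix at the base $5.00 input rate. The function names are hypothetical.

```python
# Per-MTok rates from the comparison table (USD).
CACHE_READ = 0.50
CLAUDE_WRITE_5MIN = 6.25
CLAUDE_WRITE_1HOUR = 10.00
# Assumption: with no separate write line item, OpenAI's first (uncached)
# pass over the prefix is billed at the base input rate.
OPENAI_FIRST_PASS = 5.00


def prefix_cost(prefix_mtok: float, reads_per_write: int, write_rate: float) -> float:
    """USD to populate a cached prefix once, then read it `reads_per_write` times."""
    return prefix_mtok * (write_rate + reads_per_write * CACHE_READ)


def claude_premium(reads_per_write: int, write_rate: float = CLAUDE_WRITE_5MIN) -> float:
    """Extra prefix cost on the Claude side, as a fraction of the OpenAI-side cost."""
    claude = prefix_cost(1.0, reads_per_write, write_rate)
    openai = prefix_cost(1.0, reads_per_write, OPENAI_FIRST_PASS)
    return claude / openai - 1.0


if __name__ == "__main__":
    for reads in (1, 10, 100):
        print(f"{reads:>3} reads per write: "
              f"Claude prefix cost is {claude_premium(reads):.1%} higher")
```

Under these assumptions the write premium is roughly 23% of prefix cost at one read per write and about 2% at 100 reads per write, which is the "disappears into the noise" case for stable system prompts.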
Two honest caveats before declaring Claude the value winner:
One more structural note: Anthropic dropped its long-context premium, so a 900K-token request on Opus 4.8 bills at the same per-token rate as a 9K one. OpenAI's pricing page lists flat rates for GPT-5.5 as well, with a 10% uplift only for regional data-residency processing.
The most complete public side-by-side comes from Vellum's analysis of the Fable 5 launch materials, which includes workhorse-tier numbers. On SWE-Bench Pro, the agentic coding benchmark built on real repository tasks, Opus 4.8 scores 69.2% against GPT-5.5's 58.6% - a 10.6-point gap. DataCamp's independent coverage cites the same 58.6% figure for GPT-5.5, which is a useful cross-check. On FrontierCode Diamond, a harder production-codebase split, Opus 4.8 leads 13.4% to 5.7%.
GPT-5.5 takes the vision-heavy document benchmark: 24.9% on GDP.pdf versus Opus 4.8's 22.5%. That is a small but real edge for knowledge work that runs through PDFs, charts, and scanned documents. On ExploitBench, a cybersecurity evaluation, Opus 4.8 scores 40% to GPT-5.5's 34% (the restricted Claude Mythos 5 scores far higher, but it is not generally available).
The usual caveats apply with full force. These numbers were compiled from Anthropic's launch materials, so treat the exact margins with skepticism even where third parties republish them. Benchmarks also compress latency, instruction-following style, and refusal behavior into single numbers. Our Claude vs GPT coding comparison goes deeper on how the two families behave in practice, and our look at Opus 4.8's agent honesty covers a dimension no leaderboard captures: how reliably the model reports what it actually did.
The fair summary: Opus 4.8 has the stronger published profile for agentic coding, GPT-5.5 edges ahead on vision-document work, and both are far enough above last year's models that either feels strong day to day.
Model choice increasingly follows tooling choice. In the last week of May 2026, Opus 4.8 became the default model in Claude Code for Max, Team Premium, Enterprise pay-as-you-go, and Anthropic API accounts, defaulting to high effort, per the Week 22 digest. Even after Fable 5 shipped on June 9, Anthropic's models overview still points developers at Opus 4.8 as the starting point for complex tasks. If your team lives in Claude Code, Opus 4.8 is the path of least resistance.
OpenAI's pitch is surface area. The GPT-5.5 model page lists support across Chat Completions, Responses, Realtime, Batch, fine-tuning, image generation, and speech endpoints - a breadth the Opus 4.8 docs pages do not match. If your product needs one model id wired into voice, vision, and fine-tuning pipelines, GPT-5.5's ecosystem argument is real. Our GPT-5.5 developer guide walks through that surface in detail.
The agent builder. Opus 4.8. The published agentic coding gap (69.2% vs 58.6% on SWE-Bench Pro) plus 17% cheaper output - the token type agents burn most - stacks two advantages on the same side.
The high-volume output shop. Opus 4.8 on the Batch API: $12.50 per million output tokens versus $15.00. At volume, that roughly 17% difference adds up. Re-baseline token counts first, though, since the Opus 4.7+ tokenizer can inflate counts for identical text.
The document and vision knowledge team. GPT-5.5. It posts the better GDP.pdf score and pairs it with native image and speech endpoints if your pipeline goes multimodal.
The platform consolidator. GPT-5.5. One model id across Realtime, fine-tuning, and generation endpoints is an operational simplification Anthropic does not offer at this tier.
The undecided. Run both at $5 input on a two-week pilot against your real tasks. The pricing symmetry makes this the cheapest A/B test this tier has ever offered. Our OpenAI vs Anthropic 2026 comparison covers the platform-level factors beyond this matchup.
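For a pilot like the one suggested above, the number worth tracking is cost per completed task rather than the per-token rate, since tokenizers and success rates differ between models. Here is a rough sketch of that calculation. The `TaskRun` record and both function names are hypothetical, and it assumes you log token counts and a pass/fail outcome for each task yourself.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TaskRun:
    """One pilot task, as logged by your own harness (hypothetical schema)."""
    input_tokens: int
    output_tokens: int
    succeeded: bool


def cost_per_completed_task(runs: Iterable[TaskRun],
                            input_rate: float, output_rate: float) -> float:
    """Total spend in USD divided by the number of successful tasks.

    Rates are USD per million tokens. Failed runs still cost money,
    so they count toward spend but not toward completions.
    """
    runs = list(runs)
    spend = sum(r.input_tokens * input_rate + r.output_tokens * output_rate
                for r in runs) / 1_000_000
    completed = sum(1 for r in runs if r.succeeded)
    return spend / completed if completed else float("inf")


def breakeven_output_inflation(cheaper_rate: float, pricier_rate: float) -> float:
    """How many times more output tokens the cheaper-per-token model can emit
    for the same work before its output bill matches the pricier model's."""
    return pricier_rate / cheaper_rate


if __name__ == "__main__":
    pilot = [TaskRun(1_000_000, 100_000, True), TaskRun(1_000_000, 100_000, False)]
    print("Opus 4.8:", cost_per_completed_task(pilot, 5.00, 25.00))
    print("GPT-5.5: ", cost_per_completed_task(pilot, 5.00, 30.00))
    print("Break-even output inflation:", breakeven_output_inflation(25.00, 30.00))
```

At list prices the break-even factor is 1.2: if Opus 4.8 emits more than about 20% more output tokens than GPT-5.5 for the same tasks, its output price advantage is gone. In a real pilot, feed each model its own logged runs rather than the shared sample used here.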
Neither model is the right call for everything.
Is Claude Opus 4.8 cheaper than GPT-5.5?
On input, no - both cost $5.00 per million tokens (verified June 11, 2026). On output, yes: Opus 4.8 charges $25 versus GPT-5.5's $30, about 17% less, and the gap persists in batch mode ($12.50 vs $15.00). The caveat is tokenizer differences: compare cost per completed task, not per-token rates.
Which model is better for coding?
On published benchmarks, Opus 4.8 leads agentic coding: 69.2% versus 58.6% on SWE-Bench Pro and 13.4% versus 5.7% on FrontierCode Diamond, per Vellum's compilation of Anthropic's launch data. Those figures originate from one vendor's materials, so run your own evaluation - but the direction matches Opus 4.8's role as the default model in Claude Code.
Do GPT-5.5 and Claude Opus 4.8 have the same context window?
Effectively yes. GPT-5.5 lists 1,050,000 tokens and Opus 4.8 lists 1M, both with 128K max output. One exception: Opus 4.8 is limited to 200K context on Microsoft Foundry. Neither vendor charges a long-context premium at this tier.
When should you choose GPT-5.5 over Claude Opus 4.8?
When your workload is vision-document heavy (GPT-5.5 scores 24.9% vs 22.5% on GDP.pdf), when you need one model across Realtime, fine-tuning, speech, and image endpoints, or when your stack is already built on OpenAI's APIs and the 17% output-price difference does not move your bill enough to justify a migration.
Should you pay for Claude Fable 5 instead of Opus 4.8?
Only for workloads that need the highest available capability. Fable 5 costs exactly double ($10/$50), and Anthropic's own models overview still directs most complex work to Opus 4.8 first. Start at the workhorse tier and escalate only when you have evidence the cheaper model is failing.
Note: OpenAI's GPT-5.5 launch announcement page could not be fetched directly (it returns an error to automated requests), so launch-post-only claims are not cited in this comparison. Priority-tier rates come from the API pricing page, which loads normally.