Claude Opus 4.5: Anthropic's Most Intelligent Model

Anthropic has released Claude Opus 4.5, positioning it as their most capable model yet for coding agents and computer use. The release brings significant price cuts, efficiency gains, and enough autonomous capability to outscore human candidates on the company's notoriously difficult technical assessment.
Pricing That Changes the Economics
Opus 4.5 drops to $5 per million input tokens and $25 per million output tokens, one-third the price of its predecessor. The model is available across Anthropic's web app, Claude Code, and all major cloud providers. This price cut makes high-performance agentic workflows economically viable at scale.
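To make the new economics concrete, here is a back-of-the-envelope cost calculation at the published rates. The workload token counts below are hypothetical, chosen only for illustration:

```python
# Cost comparison at Opus 4.5's published rates:
# $5 per million input tokens, $25 per million output tokens.
# The workload numbers below are hypothetical, for illustration only.

INPUT_RATE = 5.00 / 1_000_000    # dollars per input token
OUTPUT_RATE = 25.00 / 1_000_000  # dollars per output token

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single agentic run."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A hypothetical agent session: 2M tokens of context reads, 400K generated.
cost = run_cost(2_000_000, 400_000)
print(f"${cost:.2f}")        # $20.00

# The same workload at the predecessor's 3x rates would cost three times as much.
print(f"${cost * 3:.2f}")    # $60.00
```

At these rates, a multimillion-token agent session lands in the tens of dollars rather than the hundreds, which is what moves always-on agent loops from demo territory into production budgets.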
Benchmarks and Efficiency
On software engineering benchmarks, Opus 4.5 leads across the board: it tops SWE-bench Verified and TerminalBench, and posts 89.4% on the Polyglot multilingual coding benchmark. Browser automation hits 72.9% on BrowserComp, and the model earned $4,967 on VendingBench, though it still trails Gemini 3 Pro on that specific metric.

The headline metric, however, is token efficiency. Opus 4.5 matched Sonnet 4.5's best SWE-bench Verified score while using 76% fewer output tokens, and at maximum effort it exceeded Sonnet 4.5 by 4.3 percentage points while consuming 48% fewer tokens. Raw performance is easy when you burn unlimited compute; efficiency at the frontier is what matters for production deployments.
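The efficiency figure compounds with pricing. A quick sketch of what "76% fewer output tokens" means in dollars, using Sonnet 4.5's published $15-per-million output rate and a hypothetical baseline token budget:

```python
# What "76% fewer output tokens" means in output spend.
# Baseline token count is hypothetical; rates are the published
# per-million-output-token prices ($15 Sonnet 4.5, $25 Opus 4.5).

SONNET_OUT = 15.00 / 1_000_000
OPUS_OUT = 25.00 / 1_000_000

baseline_tokens = 1_000_000                  # hypothetical Sonnet 4.5 run
opus_tokens = baseline_tokens * (1 - 0.76)   # 76% fewer for the same score

sonnet_cost = baseline_tokens * SONNET_OUT   # $15.00
opus_cost = opus_tokens * OPUS_OUT           # $6.00
print(f"Opus output spend: {opus_cost / sonnet_cost:.0%} of Sonnet's")
```

Despite the higher per-token price, matching the score on roughly a quarter of the tokens leaves Opus at about 40% of Sonnet's output spend for this workload.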
Agent Architecture and Control
The model introduces an effort parameter in the API, letting developers control how much compute to allocate per task. This pairs with new features including tool search, programmatic tool calling, tool use examples, and context compaction.
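A request with a per-task effort setting might look like the sketch below. Note the `effort` field name, its accepted values, and the model id are assumptions here, not confirmed API details; check Anthropic's API reference for the exact parameter shape:

```python
# Sketch of a Messages API payload with a per-task effort setting.
# ASSUMPTIONS: the "effort" field name, its "low"/"medium"/"high" values,
# and the "claude-opus-4-5" model id are illustrative, not confirmed.
# Nothing is sent over the network; we only assemble the payload.
import json

def build_request(prompt: str, effort: str = "medium") -> dict:
    """Assemble a request payload with a compute-effort dial."""
    allowed = {"low", "medium", "high"}
    if effort not in allowed:
        raise ValueError(f"effort must be one of {sorted(allowed)}")
    return {
        "model": "claude-opus-4-5",
        "max_tokens": 1024,
        "effort": effort,  # dial compute allocation per task
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Summarize this diff.", effort="low")
print(json.dumps(payload, indent=2))
```

The design intent, per the announcement, is that cheap tasks can run at low effort while hard agentic work gets the full compute budget, all within one model.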

Anthropic emphasizes Opus 4.5's ability to manage teams of sub-agents and build complex multi-agent systems without constant intervention. The model handles ambiguous tasks, reasons through trade-offs, and operates autonomously without the handholding earlier models required. Early testers consistently report that Opus 4.5 "just gets it" when handed open-ended technical tasks.
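The delegation pattern described above can be sketched as a toy orchestrator. This is not Anthropic's implementation; the worker names and skill-based routing are invented for illustration, and a real system would back each sub-agent with its own model call:

```python
# Toy illustration of the lead-agent / sub-agent delegation pattern.
# Worker names and routing logic are invented; in practice each
# sub-agent would wrap a model call with its own context and tools.
from dataclasses import dataclass, field

@dataclass
class SubAgent:
    name: str
    skills: set
    log: list = field(default_factory=list)

    def run(self, task: str) -> str:
        self.log.append(task)          # record work for later review
        return f"{self.name} completed: {task}"

class LeadAgent:
    """Routes each task to the first sub-agent whose skills match."""

    def __init__(self, workers):
        self.workers = workers

    def delegate(self, task: str, needs: str) -> str:
        for w in self.workers:
            if needs in w.skills:
                return w.run(task)
        raise LookupError(f"no sub-agent can handle '{needs}'")

team = LeadAgent([
    SubAgent("coder", {"python", "refactor"}),
    SubAgent("tester", {"pytest"}),
])
print(team.delegate("add retry logic", needs="python"))
print(team.delegate("write regression tests", needs="pytest"))
```

The point of the pattern is the review surface: the lead agent decomposes and routes, each sub-agent keeps its own log, and the human checks outcomes rather than steering every step.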
Ecosystem Expansion
Claude Code now ships as a desktop application alongside the existing CLI and web interfaces. The release adds Microsoft Office integrations for PowerPoint, Excel, and Word, plus expanded Chrome extension support. Conversation limits have increased, and the system supports longer-running agentic workflows.

The Human Benchmark
Perhaps the most striking claim: Opus 4.5 is the first model to outperform human candidates on Anthropic's technical take-home exam. The assessment tests technical ability and judgment under time pressure—areas where the model now exceeds the strongest human applicants.
This result raises concrete questions about how AI reshapes engineering as a profession. Anthropic acknowledges their exam doesn't measure collaboration, communication, or the instincts developed over years of experience. But on core technical skills, the machine has crossed the threshold.
First Impressions in Practice
In a demo building a glassmorphism-themed SaaS landing page with Next.js, Opus 4.5 completed the task in approximately five minutes with minimal instruction. The model handled design decisions, component structure, and styling autonomously. Image understanding capabilities suggest it can interpret Figma screenshots and other visual references to match specific design requirements.

The shift is clear: less time prompting, more time reviewing. Opus 4.5 operates as a system you delegate to rather than direct step-by-step.


