
TL;DR
Cursor just shipped Composer 2 - a major upgrade to their AI coding assistant. Here is what changed and why it matters.
Cursor dropped Composer 2 today. It is their second-generation in-house coding model, and the jump from Composer 1 is significant. CursorBench scores went from 38.0 to 61.3. Terminal-Bench 2.0 went from 40.0 to 61.7. SWE-bench Multilingual climbed from 56.9 to 73.7. These are not incremental improvements. This is a fundamentally better model.
Cursor announced on X that Composer 2 achieves these benchmark results while staying cheaper than competing frontier models. They shared detailed benchmark comparisons showing the jump from Composer 1 to Composer 2 across every category. The team also highlighted the continued pretraining approach that made these gains possible, along with pricing details that undercut most of the market. The full writeup is on the Cursor blog.
The pricing is aggressive too. Standard tier runs $0.50/M input and $2.50/M output tokens. There is also a faster variant at $1.50/M input and $7.50/M output that ships as the default. Even the fast option undercuts most competing models at comparable intelligence levels.
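To make those rates concrete, here is a quick sketch of what a session costs. The tier numbers come from the pricing above; the helper function and the sample token counts are illustrative, not anything Cursor ships:

```typescript
// Published Composer 2 rates, in dollars per million tokens.
type Tier = { input: number; output: number };

const standard: Tier = { input: 0.5, output: 2.5 };
const fast: Tier = { input: 1.5, output: 7.5 }; // ships as the default

// Illustrative helper: dollar cost of a session given raw token counts.
function sessionCost(tier: Tier, inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1e6) * tier.input + (outputTokens / 1e6) * tier.output;
}

// A heavy iterative session: 2M input tokens, 400k output tokens.
console.log(sessionCost(standard, 2_000_000, 400_000)); // 2 (dollars)
console.log(sessionCost(fast, 2_000_000, 400_000));     // 6 (dollars)
```

Even a token-heavy day of iteration stays in single-digit dollars on either tier, which is the point the pricing is making.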
Composer 2 is the result of Cursor's first continued pretraining run. That is a big deal. Composer 1 was trained primarily through reinforcement learning on top of an existing base model. Composer 2 starts from a much stronger foundation because Cursor actually did continued pretraining on coding-specific data before layering RL on top.
For broader context, pair this with *Cursor vs Claude Code in 2026 - Which Should You Use?* and *Every AI Coding Tool Compared: The 2026 Matrix*; those companion pieces show where this fits in the wider AI developer workflow.
From that stronger base, they scaled their reinforcement learning on long-horizon coding tasks - the kind that require hundreds of sequential actions across files, terminals, and search tools. The model learned to plan more deliberately, use tools in parallel when it makes sense, and avoid premature edits. It reads before it writes. That behavioral shift alone makes it noticeably more reliable on real codebases.
The architecture remains mixture-of-experts, which is why the speed is still there. Most tasks complete in under 30 seconds, even with the quality jump.
Here is how Composer 2 stacks up against its predecessors:
| Model | CursorBench | Terminal-Bench 2.0 | SWE-bench Multilingual |
|---|---|---|---|
| Composer 2 | 61.3 | 61.7 | 73.7 |
| Composer 1.5 | 44.2 | 47.9 | 65.9 |
| Composer 1 | 38.0 | 40.0 | 56.9 |
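The deltas are easy to verify from the table. A quick sketch, with the scores copied from the rows above:

```typescript
// Benchmark scores from the comparison table: [Composer 1, Composer 2].
const scores: Record<string, [number, number]> = {
  "CursorBench": [38.0, 61.3],
  "Terminal-Bench 2.0": [40.0, 61.7],
  "SWE-bench Multilingual": [56.9, 73.7],
};

for (const [bench, [v1, v2]] of Object.entries(scores)) {
  const points = v2 - v1;               // absolute gain in points
  const relative = (points / v1) * 100; // gain relative to Composer 1
  console.log(`${bench}: +${points.toFixed(1)} pts (+${relative.toFixed(0)}%)`);
}
```

Run it and every benchmark shows a double-digit point gain generation over generation.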
The Terminal-Bench 2.0 numbers are particularly interesting. That benchmark tests real terminal-based agent work, the same kind of tasks you would use Claude Code or Codex for. Composer 2 scoring 61.7 puts it in the same conversation as the frontier models from Anthropic and OpenAI, but at a fraction of the cost.
SWE-bench Multilingual at 73.7 is strong. For context, that benchmark tests the model's ability to resolve real GitHub issues across multiple programming languages. Going from 56.9 to 73.7 in one generation is a 16.8-point gain - roughly a 30% relative improvement.
We tested Composer 2 against 5 other AI models on 10 web development tasks. Composer 2 achieved 10/10 task completion. See the full results on our Web Dev Arena.
Synthetic benchmarks tell part of the story, but real-world web dev tasks tell the rest. Composer 2 handled everything we threw at it - React component generation, API integration, database queries, auth flows, and multi-file refactors. It completed all 10 tasks without needing manual intervention. That is rare. Most models stumble on at least one or two edge cases in a set like this.
The AI coding landscape has gotten crowded. Here is where Composer 2 fits.
Claude Code still uses the best reasoning models available (Opus 4.6, Sonnet 4.6). For complex architectural decisions, novel problem-solving, and tasks where you need the model to think deeply before acting, Claude Code remains the strongest option. It is terminal-native, which some developers prefer and others avoid. The tradeoff is speed. Claude Code prioritizes accuracy over velocity.
OpenAI Codex runs on GPT-5.3 and has strong performance on structured engineering tasks. It is a solid all-rounder with good IDE integration. But it is more expensive per token than Composer 2, and for iterative coding work, the speed difference matters.
Windsurf takes a more guided approach with its Cascade system. It is good for developers who want more hand-holding and a structured workflow. But it does not have its own frontier model. It relies on third-party models, which means it is always one step behind on model quality.
Composer 2 carves out a specific niche: fast, cheap, and smart enough for most coding tasks. If you are doing iterative development where you send 20-30 prompts in a session, the speed advantage compounds. You stay in flow. You do not context-switch while waiting for responses. That matters more than most benchmarks capture.
The real answer, though, is that most serious developers use multiple tools. Use Composer 2 for fast iteration and routine work. Switch to Claude Code or Codex for the hard stuff. The tools are not mutually exclusive.
Use Composer 2 if you want speed. If your workflow is prompt-heavy and iterative, 30-second completions at $0.50/M input tokens are hard to beat. You will get more iterations per hour than any other option.
Use it for multi-agent parallel work. Cursor's multi-agent interface runs up to eight agents simultaneously with git worktree isolation. Composer 2 is the cheapest frontier-quality model you can run in those parallel slots. Running eight Claude Code agents in parallel gets expensive fast; running eight Composer 2 agents is reasonable.
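The worktree isolation Cursor relies on can be reproduced by hand with plain git. A minimal sketch, assuming Node and git are installed - the directory layout, branch naming, and `createAgentWorktree` helper here are mine, not Cursor's:

```typescript
import { execFileSync } from "node:child_process";

// Give each agent slot its own checkout and branch so parallel edits
// never collide in the same working directory.
function createAgentWorktree(repo: string, slot: number): string {
  const dir = `${repo}-agent-${slot}`;
  const branch = `agent/slot-${slot}`;
  execFileSync("git", ["-C", repo, "worktree", "add", "-b", branch, dir]);
  return dir;
}

// Eight isolated working copies for eight parallel agents
// (path is illustrative):
// for (let slot = 1; slot <= 8; slot++) createAgentWorktree("/path/to/my-repo", slot);
```

Each worktree shares the repository's object store, so eight checkouts cost far less disk than eight clones, and each agent's branch can be reviewed or discarded independently.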
Use it alongside other models. Cursor lets you swap models mid-session. Start with Composer 2 for scaffolding and routine edits, then switch to Sonnet 4.6 or GPT-5 for the parts that need deeper reasoning. This hybrid approach gives you the best of both worlds.
Skip it if accuracy on first attempt matters more than iteration speed. If you are running background agents on long autonomous tasks where you will not be reviewing intermediate steps, you want the smartest model possible. That is still Claude Code with Opus or Sonnet.
Cursor building their own model is the signal that matters here. They are not just wrapping API calls to Anthropic and OpenAI anymore. They are training models specifically for their IDE, their tools, their workflow patterns. That vertical integration is powerful.
The broader trend is clear. The gap between "fast and cheap" models and "smart and expensive" models is closing. Composer 2 at $0.50/M input tokens delivers results that would have required a $15/M token model a year ago. That compression is accelerating.
We are also seeing the rise of model-switching as a first-class workflow. No single model wins every task. The winning setup in 2026 is an IDE that lets you fluidly move between models based on what you are doing right now. Cursor understood this early. Their multi-model, multi-agent architecture is built for exactly this future.
The next frontier is not smarter models. It is smarter coordination of multiple agents running multiple models on different parts of your codebase simultaneously. Cursor is betting heavily on that with Automations, Bugbot, and now Composer 2 as the cost-efficient workhorse model that makes running many agents economically viable.
Composer 2 is available now. Select it from the model dropdown in Cursor or try it in the new Glass interface alpha at cursor.com/glass.
What is Composer 2?
Composer 2 is Cursor's second-generation in-house AI coding model. It was built through continued pretraining on coding-specific data followed by reinforcement learning on long-horizon coding tasks. The result is a significant jump in benchmark performance - CursorBench scores went from 38.0 (Composer 1) to 61.3 (Composer 2), with similar gains across Terminal-Bench 2.0 and SWE-bench Multilingual.

How much does Composer 2 cost?
Composer 2 has two pricing tiers. Standard runs at $0.50/M input and $2.50/M output tokens. The faster variant (the default) costs $1.50/M input and $7.50/M output tokens. Both undercut competing frontier models at similar intelligence levels. For Cursor Pro and Business subscribers, Composer 2 is included in the 500 "fast" requests per month.

How does Composer 2 compare to Claude Code?
Claude Code uses Anthropic's frontier models (Opus 4.6, Sonnet 4.6) and prioritizes accuracy over speed - ideal for complex architectural decisions and novel problem-solving. Composer 2 prioritizes speed and cost - completing most tasks in under 30 seconds at a fraction of the token cost. Many developers use both: Composer 2 for fast iteration and routine work, Claude Code for the hard stuff.

Can I switch between Composer 2 and other models mid-session?
Yes. Cursor lets you swap models mid-session. A common workflow is starting with Composer 2 for scaffolding and routine edits, then switching to Sonnet 4.6 or GPT-5 for parts that need deeper reasoning. This hybrid approach maximizes both speed and quality.

What is Glass?
Glass is Cursor's new interface alpha available at cursor.com/glass. It provides an alternative way to interact with Composer 2 and other models outside the main Cursor IDE. The interface is designed for quick interactions and testing.

How does Composer 2 fit into Cursor's multi-agent workflow?
Cursor's multi-agent interface supports up to eight agents running simultaneously with git worktree isolation. Composer 2 is the most cost-effective frontier-quality model for these parallel slots - running eight Claude Code agents in parallel gets expensive fast, while eight Composer 2 agents remains economical.

What are Composer 2's benchmark scores?
Composer 2 scored 61.3 on CursorBench (up from 38.0 on Composer 1), 61.7 on Terminal-Bench 2.0 (up from 40.0), and 73.7 on SWE-bench Multilingual (up from 56.9). The SWE-bench Multilingual score is particularly notable - that benchmark tests the model's ability to resolve real GitHub issues across multiple programming languages.

When should I use Claude Code or Codex instead?
Use Claude Code or Codex when accuracy on first attempt matters more than iteration speed. If you're running background agents on long autonomous tasks where you won't review intermediate steps, you want the smartest model possible. Composer 2 excels at fast, iterative development where you're actively prompting and reviewing results - not at unsupervised autonomous work.