Cursor Composer 2.5 Developer Guide 2026

Official Sources

Resource	Link
Cursor Composer 2.5 Announcement	cursor.com/blog/composer-2-5
Cursor Pricing	cursor.com/pricing
Cursor Documentation	docs.cursor.com
Kimi K2.5 Base Model	Moonshot AI
SWE-bench Multilingual	swebench.com

Cursor shipped Composer 2.5 on May 18, 2026 - just two months after Composer 2. The headline: it matches Claude Opus 4.7 and GPT-5.5 on coding benchmarks at roughly one tenth the cost per token. But the story underneath is more interesting than the benchmark numbers suggest.

Last updated: July 1, 2026

This guide covers what Composer 2.5 actually is, how to set it up, when to use it versus external models, and the training approach that made the performance jump possible.

What Composer 2.5 Actually Is

Composer 2.5 is Cursor's own agentic coding model, purpose-built to plan, edit files, run terminal commands, and verify its own work inside the Cursor editor. It is not a general-purpose chatbot. The training and evaluation targets are software engineering trajectories, not single-shot Q&A.

Like its predecessor, Composer 2.5 is based on Moonshot's open-weights Kimi K2.5. The architecture is a mixture-of-experts transformer with 1.04 trillion parameters total and 32 billion active parameters per token. It supports up to 200,000 tokens of context with native function calling, reasoning, and context caching.

Inside Cursor, it can:

Read files across your entire project
Edit code in multiple files simultaneously
Search the project semantically
Run terminal commands
Check errors and iterate
Keep working through a task until completion

The key improvement over Composer 2 is sustained effort. Composer 2.5 maintains focus across long tasks, follows complex instructions more reliably, and calibrates how much work a request actually needs instead of over- or under-doing it.

How to Set It Up

Composer 2.5 ships in Cursor 3.4 and later (3.5 is the current release as of May 20, 2026).

Step 1: Open the Composer panel or chat sidebar with Cmd+I on macOS or Ctrl+I on Windows and Linux.

Step 2: Click the model picker in the top-right corner of the Composer panel.

Step 3: Select Composer 2.5 from the dropdown.

For interactive coding sessions, leave the default Fast variant on. For background agents and Cloud Agent runs, switch to the Standard variant in Settings > Models > Composer 2.5.

The Fast variant prioritizes low latency for real-time interactions. The Standard variant prioritizes quality for autonomous tasks where you are not waiting on each response.

Pricing Breakdown

Composer 2.5 ships in two variants:

Variant	Input	Cached	Output
Standard	$0.50/MTok	$0.20/MTok	$2.50/MTok
Fast	$3.00/MTok	$0.50/MTok	$15.00/MTok

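For context, Claude Opus 4.8 is $5/$25 per MTok (input/output) and GPT-5.5 runs $10-$15 input and $30-$45 output per MTok depending on variant. At the Standard tier, Composer 2.5 costs one tenth of Opus 4.8 on both input and output. The Fast variant, at $3/$15, is 60% of the Opus 4.8 price.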

The practical impact: Cursor reports that Composer 2.5 completes CursorBench tasks at an average cost of under $1, while Opus 4.7 and GPT-5.5 run between $3 and $11 per task for comparable results.
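To see how the per-token prices turn into a per-task bill, here is a minimal sketch that applies the two rate cards from the table to a token count. The workload numbers are hypothetical, and the sketch assumes cached tokens are billed at the cached rate instead of the input rate, not in addition to it.

```python
# Per-million-token prices (USD), copied from the pricing table.
PRICES = {
    "standard": {"input": 0.50, "cached": 0.20, "output": 2.50},
    "fast": {"input": 3.00, "cached": 0.50, "output": 15.00},
}


def task_cost(variant, input_tokens, cached_tokens, output_tokens):
    """Return the USD cost of one task for the given token counts.

    Assumption: input_tokens counts only uncached input, and cached
    tokens are billed separately at the cached rate.
    """
    p = PRICES[variant]
    total = (
        input_tokens * p["input"]
        + cached_tokens * p["cached"]
        + output_tokens * p["output"]
    )
    return total / 1_000_000


# Hypothetical agentic task: 400k fresh input, 1.2M cached, 60k output tokens.
workload = dict(input_tokens=400_000, cached_tokens=1_200_000, output_tokens=60_000)

for variant in PRICES:
    print(f"{variant}: ${task_cost(variant, **workload):.2f}")
```

Swap in token counts from your own usage dashboard to compare the two variants on a realistic session.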

For Cursor subscribers, both variants draw from your usage pool. Pro users get it as part of their $20/month. Teams Standard and Teams Premium get it with their split usage pools (first-party models including Composer 2.5 get their own allocation as of July 1, 2026).

Benchmark Performance

Here is where Composer 2.5 sits against the other frontier models as of mid-2026:

Benchmark	Composer 2.5	Claude Opus 4.7	GPT-5.5
SWE-bench Multilingual	79.8%	80.1%	78.4%
CursorBench v3.1	63.2%	64.8%	62.7%
Terminal-Bench 2.0	69.5%	70.2%	82.7%

The numbers tell a clear story:

Where Composer 2.5 competes: On multi-file coding tasks and repository-level refactors, Composer 2.5 matches Opus 4.7 and GPT-5.5 within noise. On SWE-bench Multilingual and CursorBench v3.1 it trails Opus 4.7 by 0.3 and 1.6 points and leads GPT-5.5 by 1.4 and 0.5 points - not enough to change your choice based on raw capability.

Where Composer 2.5 falls behind: Terminal-Bench 2.0 measures shell and terminal workflows - compiling code, setting up servers, system administration. GPT-5.5 leads by roughly 13 points. If your work is heavy in terminal trajectories, GPT-5.5 is the better tool.

Cost efficiency: At roughly one tenth the token cost of Opus on the Standard variant, Composer 2.5 is the default choice for agentic coding inside Cursor unless your task specifically benefits from Opus or GPT-5.5.
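The gaps discussed above are plain subtraction over the benchmark table. A small sketch that recomputes them, using only the scores listed in this article (the function name is illustrative):

```python
# Scores (percent) copied from the benchmark table.
SCORES = {
    "SWE-bench Multilingual": {"Composer 2.5": 79.8, "Claude Opus 4.7": 80.1, "GPT-5.5": 78.4},
    "CursorBench v3.1": {"Composer 2.5": 63.2, "Claude Opus 4.7": 64.8, "GPT-5.5": 62.7},
    "Terminal-Bench 2.0": {"Composer 2.5": 69.5, "Claude Opus 4.7": 70.2, "GPT-5.5": 82.7},
}


def gaps(scores, baseline="Composer 2.5"):
    """Return each model's lead over the baseline, in percentage points.

    Positive means the other model scores higher than the baseline.
    """
    return {
        bench: {
            model: round(score - row[baseline], 1)
            for model, score in row.items()
            if model != baseline
        }
        for bench, row in scores.items()
    }


for bench, row in gaps(SCORES).items():
    print(bench, row)
```

Only the Terminal-Bench 2.0 row shows a double-digit gap. The other two benchmarks stay under two points in either direction.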

How They Trained It

Cursor's training approach is worth understanding because it explains why Composer 2.5 improved so much over Composer 2 with the same base model.

25x more synthetic tasks. Composer 2.5 was trained on 25 times as many synthetic tasks as Composer 2. Cursor developed harder synthetic problems dynamically throughout the training run.

Feature deletion training. One method: the agent is given a working codebase with a full set of tests, asked to delete specific features while keeping the codebase functional, and then tasked with reimplementing those features. The tests serve as a verifiable reward signal - either the tests pass or they do not.
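To make the reward signal concrete, here is a toy sketch of a pass/fail reward over a tiny in-memory "codebase". It is not Cursor's pipeline. The dictionary-of-functions representation, the test callables, and the `slugify` feature are all hypothetical stand-ins for a real repository and test suite.

```python
from typing import Callable, Dict, List

Codebase = Dict[str, Callable[..., object]]
Test = Callable[[Codebase], bool]


def run_tests(codebase: Codebase, tests: List[Test]) -> bool:
    """Return True only if every test passes; a crash counts as a failure."""
    for test in tests:
        try:
            if not test(codebase):
                return False
        except Exception:
            return False
    return True


def binary_reward(codebase: Codebase, tests: List[Test]) -> float:
    """Verifiable reward: 1.0 if the whole suite passes, otherwise 0.0."""
    return 1.0 if run_tests(codebase, tests) else 0.0


# Hypothetical working codebase with a full set of tests.
full = {
    "add": lambda a, b: a + b,
    "slugify": lambda s: s.strip().lower().replace(" ", "-"),
}
tests = [
    lambda cb: cb["add"](2, 3) == 5,
    lambda cb: cb["slugify"](" Hello World ") == "hello-world",
]

# Step 1: the feature is deleted, so the tests that cover it fail.
stripped = {name: fn for name, fn in full.items() if name != "slugify"}

# Step 2: a (hypothetical) agent reimplements it in its own way.
reimplemented = dict(stripped, slugify=lambda s: "-".join(s.lower().split()))

print(binary_reward(full, tests), binary_reward(stripped, tests), binary_reward(reimplemented, tests))
```

The reimplementation does not have to match the original code. It only has to satisfy the same tests, which is what makes the reward checkable without a human grader.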

Targeted textual feedback. Instead of one reward signal at the end of a task, Cursor writes a short hint describing the fix they want, drops that hint into the agent's local context, and uses on-policy distillation to incorporate the behavior back into the model. This provides denser credit assignment than end-of-task rewards.
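One way to picture the denser signal: at each step of a trajectory, compare what the model predicts without the hint against what it predicts with the hint in context, and penalize the difference. The sketch below is an assumption about how such a per-token loss could look, using a reverse KL divergence. The article does not say which divergence Cursor uses, and the token distributions are made up.

```python
import math


def reverse_kl(student, teacher):
    """KL(student || teacher) for two distributions over the same tokens."""
    return sum(
        p * math.log(p / teacher[token])
        for token, p in student.items()
        if p > 0
    )


# Hypothetical next-token distributions at a single step.
# Student: the model acting without the hint.
student_no_hint = {"retry": 0.6, "raise": 0.3, "log": 0.1}
# Teacher: the same model with the hint placed in its context.
teacher_with_hint = {"retry": 0.1, "raise": 0.8, "log": 0.1}

# A per-step loss like this gives feedback at every token,
# rather than a single reward at the end of the task.
per_step_loss = reverse_kl(student_no_hint, teacher_with_hint)
print(f"per-step loss: {per_step_loss:.4f}")
```

The loss is zero when the two distributions agree and grows as the hinted behavior diverges from what the model does on its own.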

Agentic monitoring. The training pipeline includes monitors that detect and prevent reward hacking behaviors before they compound.

The infrastructure side: Cursor uses a sharded Muon optimizer with distributed orthogonalization and dual-mesh HSDP. They report 0.2s optimizer step time on the 1T parameter model - fast enough to iterate quickly on training runs.
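For readers unfamiliar with the orthogonalization step in Muon: the optimizer replaces each update matrix with an approximately orthogonal one, computed by a Newton-Schulz iteration. The sketch below shows the classic cubic iteration in pure Python on a 2x2 matrix. It is an illustration only. Public Muon implementations use a tuned higher-order variant, and nothing here reflects Cursor's sharded or distributed version.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(g, steps=30):
    """Approximate the nearest orthogonal matrix to g.

    Iterates X <- 1.5 * X - 0.5 * X @ X.T @ X after scaling g by its
    Frobenius norm so that all singular values start in (0, 1].
    """
    norm = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [
            [1.5 * a - 0.5 * b for a, b in zip(row_x, row_b)]
            for row_x, row_b in zip(x, xxt_x)
        ]
    return x


# Hypothetical 2x2 update matrix.
update = [[3.0, 1.0], [-1.0, 2.0]]
q = newton_schulz_orthogonalize(update)
print(matmul(q, transpose(q)))  # close to the identity matrix
```

Each iteration pushes every singular value toward 1, so the result keeps the direction of the update while discarding its scale.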

When to Use Each Model

Pick your model based on task type, not brand loyalty:

Use Composer 2.5 when:

You are working inside Cursor (it is the native option)
Cost matters and you are doing high-volume agentic work
The task is multi-file editing, codebase-wide refactors, or CI fixers
You want sustained effort across a long session

Use Claude Opus 4.8 when:

The task requires deep architectural reasoning across very long contexts
You need the strongest reliability for one-shot generation
The work involves nuanced judgment rather than raw throughput
You are working outside Cursor and need an API

Use GPT-5.5 when:

The work is heavy in shell and terminal trajectories
You need fast cloud execution with OpenAI's infrastructure
You are using Codex as your primary agentic tool

Use Fable 5 when:

You need the absolute highest capability for a single complex task
The cost is justified by task completion rate improvements
You have API access (through July 7, Fable 5 is temporarily included in claude.ai subscriptions)
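The guidance above can be written down as a simple routing rule. This is a sketch of one possible encoding. The `Task` fields and the order in which the checks run are assumptions, not something Cursor or the model vendors prescribe.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task traits used to pick a model."""
    inside_cursor: bool = True
    terminal_heavy: bool = False
    deep_architecture: bool = False
    needs_max_capability: bool = False


def pick_model(task: Task) -> str:
    """Map task traits to a model name, following the lists above.

    The precedence (capability, then terminal work, then reasoning)
    is an assumption made for this sketch.
    """
    if task.needs_max_capability:
        return "Fable 5"
    if task.terminal_heavy:
        return "GPT-5.5"
    if task.deep_architecture or not task.inside_cursor:
        return "Claude Opus 4.8"
    return "Composer 2.5"


print(pick_model(Task()))                      # multi-file refactor in Cursor
print(pick_model(Task(terminal_heavy=True)))   # shell-heavy workflow
```

Writing the rule out makes the default explicit: Composer 2.5 is what you get when no special trait applies.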

Practical Workflow Patterns

Long refactors. Composer 2.5 excels at multi-file refactors that require sustained attention. Start with a clear instruction ("refactor all API handlers to use the new error handling pattern") and let it work through the codebase.

Test-driven development. Write failing tests first, then ask Composer 2.5 to implement the features. The verification loop gives it clear success criteria.

CI fixers. Point Composer 2.5 at a failing CI run and let it iterate. The combination of file editing and terminal access means it can run the tests locally, see the failures, and fix them.

Code review assistance. Use Composer 2.5 to review your own changes before committing. It can catch issues you missed and suggest improvements.

Batch operations. If you have 20 similar changes to make across a codebase, describe the pattern once and let Composer 2.5 apply it everywhere.

Limitations to Know

Not a replacement for external models in all cases. Terminal-Bench scores show GPT-5.5 is still better for shell-heavy work. For architecture decisions requiring the deepest reasoning, Opus or Fable 5 may justify the cost premium.

Cursor-native. Composer 2.5 is built for Cursor. If you are using VS Code, Neovim, or another editor, you need to use the external model APIs directly.

200K context window. Large but not unlimited. For massive codebases, you still need to be selective about what context you load.
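Being selective can be as simple as budgeting tokens before you attach files. The sketch below is a rough illustration, not how Cursor manages context. The four-characters-per-token ratio is a common rule of thumb rather than Composer 2.5's actual tokenizer, and the reserve for instructions and output is an arbitrary choice.

```python
CONTEXT_LIMIT = 200_000   # context window stated in this guide
CHARS_PER_TOKEN = 4       # rough heuristic, not the real tokenizer


def estimate_tokens(text):
    """Crude token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def select_files(files, budget=CONTEXT_LIMIT, reserve=20_000):
    """Greedily keep files, most relevant first, until the budget is spent.

    files: list of (path, text) pairs already sorted by relevance.
    reserve: tokens held back for instructions and model output (assumed).
    """
    chosen, used = [], 0
    for path, text in files:
        cost = estimate_tokens(text)
        if used + cost > budget - reserve:
            continue  # skip files that would overflow the budget
        chosen.append(path)
        used += cost
    return chosen, used


# Hypothetical project files of different sizes.
files = [
    ("api/handlers.py", "x" * 400_000),
    ("api/legacy.py", "y" * 400_000),
    ("api/errors.py", "z" * 40_000),
]
print(select_files(files))
```

In this example the second large file is skipped because it would exceed the budget, while the smaller third file still fits.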

Model-specific behaviors. Composer 2.5 is trained for agentic coding patterns. For general chat, creative writing, or non-coding tasks, general-purpose models may perform better.

FAQ

What is Cursor Composer 2.5?

Cursor Composer 2.5 is Cursor's own agentic coding model, released in May 2026. It is based on Moonshot's Kimi K2.5 with a mixture-of-experts architecture (1T parameters, 32B active per token). It is purpose-built for multi-file editing, terminal commands, and sustained agentic coding inside the Cursor editor.

How much does Composer 2.5 cost?

Standard variant: $0.50 input, $0.20 cached, $2.50 output per million tokens. Fast variant: $3.00 input, $0.50 cached, $15.00 output per million tokens. For Cursor subscribers, usage draws from your plan's allocation.

How does Composer 2.5 compare to Claude Opus 4.7?

On SWE-bench Multilingual and CursorBench, Composer 2.5 matches Opus 4.7 within 1-2 percentage points. Composer 2.5 costs roughly one tenth as much per token. Opus 4.7 may have an edge on tasks requiring the deepest architectural reasoning.

How does Composer 2.5 compare to GPT-5.5?

On coding benchmarks, the two are comparable. GPT-5.5 leads significantly on Terminal-Bench 2.0 (82.7% vs 69.5%) - for shell-heavy workflows, GPT-5.5 is the better choice. Composer 2.5 wins on cost.

When should I use Composer 2.5 vs external models?

Use Composer 2.5 as your default for agentic coding inside Cursor when cost matters. Reach for Opus 4.8 for deep reasoning tasks, GPT-5.5 for terminal-heavy work, and Fable 5 when the task justifies the premium price.

What is the context window for Composer 2.5?

Up to 200,000 tokens with native function calling, reasoning, and context caching.

Does Composer 2.5 work outside Cursor?

No. Composer 2.5 is integrated into Cursor and is not available as a standalone API. For external usage, you need Claude, GPT, or another API-accessible model.

What training improvements made Composer 2.5 better than Composer 2?

25x more synthetic training tasks, feature deletion training with test-based rewards, targeted textual feedback for denser credit assignment, and agentic monitoring to prevent reward hacking.

Sources

Cursor Composer 2.5 Announcement - May 18, 2026
Cursor Pricing - verified July 1, 2026
Lushbinary Composer 2.5 Guide - May 2026
DevOps.com Composer 2.5 Coverage - May 2026
Emergent AI Substack Guide - May 2026
Memeburn Benchmark Comparison - May 2026
