
TL;DR
Four agents, same tasks. Honest trade-offs from a developer shipping production apps with all of them.
I use all four of these daily. Not as demos. As the tools that close PRs, fix regressions, and push code to production on live apps. So when people ask which one "wins," the honest answer is: they each have a lane, and pretending otherwise wastes your subscription.
Here is the short version for anyone skimming, then the deeper cuts on install, what each agent is actually good at, where each one fumbles, and how to pick.
| Agent | Runtime | Best Model | Pricing Model | Where It Wins |
|---|---|---|---|---|
| Claude Code | Local CLI + subagents | Claude Opus 4.6 / Sonnet 4.6 | Subscription (Pro / Max) or API | Long coherent sessions, refactors, skill-driven workflows |
| Codex CLI | Local CLI + cloud runners | GPT-5.3-Codex / GPT-5.4 | ChatGPT plan or API | Parallel agent fleets, fast iteration, cloud-native work |
| Cursor Agent | IDE-integrated + CLI | Multi-model (Claude, GPT-5.x, Gemini) | Pro ($20/mo) or Business | Tight edit loops inside an IDE, model switching |
| OpenCode | Local CLI, open source | Bring-your-own (any provider) | Free (your API keys) | Self-hosted, model-agnostic, no vendor lock-in |
Pricing context: current frontier model floors on models.json from dd-subagents put Claude Opus 4.6 at $10/M tokens, GPT-5.3-Codex at $4.81/M, GPT-5.4 at $5.63/M, and GLM-5 at $1.55/M. If you run OpenCode against Kimi K2.5 at $1.20/M, you can cover a lot of tokens for the price of one Max plan. Whether that's smart depends on what you're building.
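The arithmetic is easy to check. A quick sketch using the per-million-token floors quoted above (real bills depend on input/output splits and caching, so treat these as ceilings on what $200 buys):

```shell
# Tokens per $200 (one Max plan) at each quoted per-million-token price.
for entry in "Opus-4.6:10" "GPT-5.3-Codex:4.81" "GLM-5:1.55" "Kimi-K2.5:1.20"; do
  model=${entry%%:*}
  price=${entry##*:}
  awk -v m="$model" -v p="$price" \
    'BEGIN { printf "%-14s %6.1fM tokens per $200\n", m, 200/p }'
done
# Roughly: Opus 20.0M, GPT-5.3-Codex 41.6M, GLM-5 129.0M, Kimi 166.7M.
```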
Now the honest breakdown.
## Claude Code

```bash
npm install -g @anthropic-ai/claude-code
claude
```

Sign in with your Anthropic account or set `ANTHROPIC_API_KEY`. A Pro or Max plan routes through subscription quota instead of per-token API billing, which matters at volume.
Long-horizon sessions. Claude Code is the only agent in this lineup where I can run a multi-hour refactor across 40 files and trust the context to stay coherent. Subagents, hooks, and project-level CLAUDE.md rules let me shape behavior without retraining the model. The skill system (~/.claude/skills/) lets me drop in reusable workflows like /handoff, /qa, or /devdigest:ship-product that fire the right sequence of tools without re-prompting.
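To make the skill system concrete, here is a sketch of a minimal skill file. The `/handoff` name comes from the article; the frontmatter fields follow the published SKILL.md shape, but check Anthropic's skills docs for the full schema before relying on it. The example writes to a temp dir (real skills live under `~/.claude/skills/<name>/`) so it's safe to run:

```shell
# Sketch of a minimal Claude Code skill: one directory, one SKILL.md.
# Writing to a temp dir here instead of ~/.claude/skills/ to keep the demo safe.
skills_dir=$(mktemp -d)/handoff
mkdir -p "$skills_dir"
cat > "$skills_dir/SKILL.md" <<'EOF'
---
name: handoff
description: Summarize session state so the next session can resume cleanly.
---
When invoked, write a HANDOFF.md at the repo root covering: current branch,
files touched this session, failing tests, and the next three concrete steps.
EOF
cat "$skills_dir/SKILL.md"
```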
The tool use discipline is the differentiator. Claude reads before it writes, proposes before it edits, and will stop to ask rather than hallucinate a file path. That's boring in a demo and priceless at 2am debugging a deploy.
Parallelism. Claude Code runs one main loop at a time. Subagents help, but if you want to spin up 10 agents each building a separate feature, you'll feel the single-session ceiling. Also: rate limits on Max plans are real. Shipping heavy on Opus 4.6 will eventually hit a reset window and you'll be stuck.
Model switching is also awkward. You can swap between Opus and Sonnet, but you can't easily swap in GPT-5.4 or Gemini for a second opinion without a wrapper.
Pro is $20/mo, Max is $100 or $200/mo, API is pay-per-token. At production volume the $200 Max plan pays for itself in a week compared to raw API.
Serious builders doing deep work on a single complex codebase. If you're refactoring, architecting, or running a "one human, one codebase, ship daily" workflow, this is the pick.
## Codex CLI

```bash
npm install -g @openai/codex
codex
```

Sign in with your ChatGPT account. Plus, Pro, and Business plans include Codex usage; API keys work too. The Codex desktop app launched on macOS in February 2026 and Windows a month later, but the CLI is still the workhorse.
Parallel fleets. Codex was built from the jump around the idea that the bottleneck isn't model capability, it's human supervision of many concurrent agents. Worktree isolation, cloud runners, and the codex exec headless mode make it the best option when you want to fan out work across branches or machines.
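The worktree pattern is easy to reproduce by hand. A sketch of the fan-out: one branch and one checkout per ticket, so concurrent agents never touch the same working tree (the `codex exec` call is illustrative, not exact flags):

```shell
# One git worktree per agent, so parallel edits never collide.
base=$(mktemp -d)
git init -q "$base/main"
git -C "$base/main" -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "init"
for ticket in ticket-101 ticket-102 ticket-103; do
  git -C "$base/main" worktree add -q -b "$ticket" "$base/$ticket"
  # Then run one agent per checkout, e.g. (illustrative; check `codex exec --help`):
  #   (cd "$base/$ticket" && codex exec "close $ticket")
done
git -C "$base/main" worktree list
```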
GPT-5.3-Codex is fast. At 89 tokens/sec versus Claude's 44-46 tokens/sec, you feel the difference on iterative loops where you're waiting on a diff to land. For plumbing work, test generation, or scripting, Codex is often done before Claude has finished reading.
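On a typical diff the gap is concrete. Assuming a 2,000-token patch (an illustrative size, not a benchmark) and the throughput figures above:

```shell
# Wall-clock seconds to stream a 2,000-token diff at each agent's quoted throughput.
awk 'BEGIN {
  tokens = 2000
  printf "Codex  @ 89 tok/s: %.1fs\n", tokens / 89
  printf "Claude @ 45 tok/s: %.1fs\n", tokens / 45
}'
```

About 22 seconds versus 44: one full round-trip saved every time you wait on a diff.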
Depth on long sessions. Codex will cheerfully edit a file it hasn't read, and on hour three of a complex refactor it starts drifting. Hooks and tool discipline are less mature than Claude Code's. For a greenfield script, no problem. For surgery on a 50k-line app, you feel it.
The "cloud runner" story is also uneven. When it works, it's magic. When it doesn't, debugging why the runner can't see your repo is its own side quest.
ChatGPT Plus is $20/mo, Pro is $200/mo, Business/Enterprise are seat-based. API pricing on GPT-5.3-Codex is $4.81/M tokens, GPT-5.4 is $5.63/M.
Parallel work. If your workflow is "spawn five agents, each takes a ticket, I review PRs," Codex is built for that.
## Cursor Agent

Download the IDE from cursor.com, or use the CLI:

```bash
curl -fsSL https://cursor.com/install | bash
cursor-agent -p "your prompt"
```
The IDE loop. Cursor's advantage is not the agent itself, it's that the agent lives inside the editor where you're already reading the code. Tab completion, inline diffs, and "agent mode" in the sidebar mean you're never copy-pasting between a terminal and a file. For front-end work especially, this is the tightest feedback loop in the lineup.
Model switching is the other win. You pick Claude Sonnet for one task, swap to GPT-5.4 for another, drop down to Gemini 3.1 for a cheap pass. The Pro plan at $20/mo includes a generous pool of "fast" requests across models.
Agent depth. Cursor's agent mode is improving fast, but it still behaves more like "smart autocomplete with a plan" than a true autonomous loop. It will ask for approval more often than Claude Code and lose context on longer runs. Headless CLI mode (cursor-agent -p) works but feels like an afterthought next to Claude or Codex native CLIs.
Pro is $20/mo, Business is $40/user/mo. Request quotas reset monthly and heavy users will hit them.
Editor-native work. If you live in your IDE and want an agent that augments your typing rather than replacing your session, Cursor is the fit.
## OpenCode

```bash
curl -fsSL https://opencode.ai/install | bash
opencode
```
Set OPENAI_API_KEY, ANTHROPIC_API_KEY, or any compatible endpoint in the config. It will pick up local Ollama, MiniMax, or OpenRouter without ceremony.
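For the local-model case, a config sketch. The key names below follow the opencode.json shape at the time of writing, but the schema moves quickly, so verify against the OpenCode docs before copying; the model IDs are placeholders:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "model": "anthropic/claude-sonnet-4-5",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": { "qwen2.5-coder:14b": {} }
    }
  }
}
```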
No lock-in. OpenCode is open source, model-agnostic, and self-hosted. You point it at whatever provider you want: Claude Sonnet one day, GLM-5 the next, Kimi K2.5 on the third. The UI is a respectable TUI that mirrors what you'd get from Claude Code or Codex without the subscription.
For teams with sensitive code that can't touch a vendor API, OpenCode plus a local model via Ollama is the only option in this lineup that runs fully offline. Pair it with a DGX Spark or a decent local GPU and you have an agent that never phones home.
Polish and skills. OpenCode gives you the loop, but you assemble the rest. No equivalent of Claude skills, no hook system as mature, no desktop app supervising a fleet. If you want "it just works," this isn't it. You're trading convenience for control.
Model quality is also your problem. Point it at a weak model and you'll get weak output, and no amount of prompt engineering fixes a 35-intelligence model trying to refactor a Next.js app.
Free. You pay for model API usage directly. At $1.20-1.55/M tokens on GLM-5 or Kimi K2.5, heavy usage can run under $20/mo total.
Tinkerers, self-hosters, and teams that refuse to be locked into a single vendor. Also a great third agent for when Claude and Codex are both rate-limited.
## How to pick

- If you're shipping one product and want the deepest single-agent experience: Claude Code with a Max plan.
- If your bottleneck is parallelism and you want more tickets closed per day: Codex CLI.
- If you live in an IDE and want the agent there with you: Cursor.
- If you hate lock-in, want to run local models, or just want to see how the sausage is made: OpenCode.
The real pro move: run two of them. My daily setup is Claude Code as the primary loop and Codex CLI for parallel side-quests. They complement more than they compete.
Every agent above is only as good as the model inside it. I built a comparison tool that tracks all 208 frontier models by quality score, speed, cost, and context window. Filter by "AI Coding" to see how Claude Opus 4.6, GPT-5.3-Codex, Gemini 3.1 Pro, and the open-weight alternatives actually stack up.
Head to subagent.developersdigest.tech for the live leaderboard, cost calculator, and task-based recommendations. Pick the model. Then pick the agent. In that order.