
TL;DR
Autocomplete wrote the line. Agents write the pull request. The shift from Copilot to Claude Code, Cursor Agent, and Devin - explained with links to the docs that prove every claim.
Four years ago GitHub Copilot finished your line of code. Today a tool called Devin opens pull requests while you sleep, and GitHub itself ships a cloud agent you can assign issues to like a coworker. Something changed between those two sentences, and the word for that change is agent. The buying question lives one layer up in the AI coding tools comparison matrix, but the mechanics start here.
Most writing about AI coding agents either oversells them as autonomous developers or dismisses them as fancy autocomplete. Neither is useful. This guide does the boring thing instead: it defines what a coding agent actually is in 2026, maps the categories, names the leaders with links to their own docs, and tells you what they still cannot do. Every capability claim points to a primary source. For the developer-tool buying view, keep the AI coding tools comparison matrix open next to this guide.
Use this as the beginner entry point, then branch by intent:
| If you want... | Read next |
|---|---|
| Tool selection | AI coding tools comparison matrix |
| Pricing and limits | AI coding tools pricing 2026 |
| The core agent shoot-out | Claude Code vs Codex vs Cursor vs OpenCode |
| Claude Code fundamentals | What is Claude Code |
| Cursor fundamentals | What is Cursor AI code editor |
| MCP and external tools | Complete MCP server guide |
Primary documentation for every agent covered in this guide. Verify capability claims and pricing against these before making decisions.
| Agent | Official Source | What It Covers |
|---|---|---|
| Claude Code | Claude Code Overview | Features, installation, MCP, skills, routines |
| Claude Code | Anthropic Pricing | Subscription tiers, usage limits |
| Cursor | Cursor Docs | Agent mode, cloud agents, composer |
| Cursor | Cloud Agents Docs | Sandboxed execution, MCP support |
| OpenAI Codex | Codex CLI GitHub | Installation, CLI usage, GitHub Action |
| OpenAI Codex | Using Codex with ChatGPT | Plan access, credit model |
| GitHub Copilot | Copilot Coding Agent | Cloud agent capabilities, issue assignment |
| GitHub Copilot | Copilot Plans | Free, Pro, Pro+, Business, Enterprise tiers |
| Devin | Devin Pricing | Pro, Max, Teams, Enterprise plans |
| Factory Droids | Factory AI | Multi-surface agents, pricing |
| Aider | Aider Documentation | Open-source CLI, model support |
| Replit Agent | Replit Agent | App building, parallel execution |
An AI coding agent is a program that takes a task in natural language, decides which actions to take, uses tools like file editors, shells, and browsers to carry them out, and keeps going across multiple steps until the task is done or it hits a limit.
The word that does the work is tools. Copilot suggested text. An agent suggests, then runs the tests, reads the output, edits the file again, runs the tests again, and opens a pull request. Anthropic's own description of Claude Code puts it this way: "Claude Code is an agentic coding tool that reads your codebase, edits files, runs commands, and integrates with your development tools" (docs). That tool loop is the same primitive behind how to build AI agents in TypeScript.
That sentence captures the three moves: read, edit, run. Stitch those together in a loop with a language model deciding what to do next, and you have an agent.
Not every agent does the same job. It helps to split the category into five shapes.
1. Terminal agents. Interactive CLIs that sit in your shell and drive your machine. Claude Code, OpenAI's Codex CLI, Aider, Factory's CLI droids. These are REPL-style tools you run in a repo and talk to. Codex CLI's GitHub repo tagline is "Lightweight coding agent that runs in your terminal" (github.com/openai/codex).
2. IDE agents. Embedded in your editor. Cursor Agent lives inside Cursor. The Claude Code VS Code extension adds inline diffs, plan review, and chat in the editor (docs). GitHub Copilot's agent mode runs inside VS Code and JetBrains. These agents see the file you have open and the selection you made.
3. Cloud/background agents. Spawn a sandboxed VM, hand it a task, close the tab, come back to a pull request. Cursor's cloud agents (formerly background agents) "leverage the same agent fundamentals but run in isolated environments in the cloud instead of on your local machine" (Cursor docs). GitHub Copilot's cloud agent "can work independently in the background to complete tasks, just like a human developer" (GitHub docs). Devin, Factory Droids, and Replit Agent all belong in this bucket too.
4. App builders. Agents that build a deployable app from a prompt, not a patch. Replit Agent 4 promotes "Parallel Task Execution" and "Multi-Format Building" for "web apps, mobile apps, landing pages, decks, and videos within a single project" (Replit). v0 sits here too. You type what you want, the agent ships a URL.
5. Managed agent platforms. The newest category. Infrastructure for running your own agents as a service. Anthropic launched Claude Managed Agents on April 8, 2026 with "sandboxed code execution, checkpointing, credential management, scoped permissions, and end-to-end tracing" (claude.com/blog/claude-managed-agents). OpenAI's Agents SDK and the platform layer underneath it play the same role for OpenAI's stack, which is why the OpenAI Codex and managed agents piece is the natural follow-up.
Most tools in 2026 blur these lines. Claude Code ships in terminal, VS Code, JetBrains, Desktop, Web, and iOS (docs). Factory Droids describe themselves as "The only software development agents that work everywhere you do" across IDE, terminal, desktop, web, CLI, and Slack (factory.ai).
An agent is three things stitched together: a model, a set of tools, and a loop.
The model is the brain. Usually a frontier model like Claude Opus 4.7, GPT-5.4, or Gemini 3. The tools are the hands. Read, Write, Edit, Bash, Grep, WebSearch, and anything else you expose. The loop is what makes it agentic: the model picks a tool, runs it, reads the result, decides what to do next, and repeats.
Claude Code's public documentation lists the kinds of work the loop handles: "writing tests for untested code, fixing lint errors across a project, resolving merge conflicts, updating dependencies, and writing release notes" (docs). None of those are single-shot completions. They are multi-step procedures that only work because the agent can read, act, check, and react.
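The read, act, check, react cycle can be sketched in a few lines. This is a minimal illustration, not how any vendor implements it: `run_agent`, `Action`, `scripted_decide`, and `fake_tools` are hypothetical names, and the "model" is a scripted stand-in rather than a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    tool: Optional[str]  # None means the model considers the task done
    arg: str = ""


def run_agent(
    task: str,
    decide: Callable[[str, List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> List[str]:
    """Pick a tool, run it, record the result, repeat until done or the step limit."""
    history: List[str] = []
    for _ in range(max_steps):
        action = decide(task, history)  # the model chooses the next move
        if action.tool is None:
            history.append("done")
            return history
        result = tools[action.tool](action.arg)  # the tool does the work
        history.append(f"{action.tool}({action.arg}) -> {result}")
    history.append("stopped: step limit reached")
    return history


# Stand-in "model" (assumption): a fixed plan that reads, edits, then runs tests.
def scripted_decide(task: str, history: List[str]) -> Action:
    plan = [Action("read", "app.py"), Action("edit", "app.py"), Action("run", "pytest")]
    return plan[len(history)] if len(history) < len(plan) else Action(None)


# Stand-in tools (assumption): real agents would touch files and a shell here.
fake_tools = {
    "read": lambda path: "file contents",
    "edit": lambda path: "edited",
    "run": lambda cmd: "1 passed",
}

trace = run_agent("add a test", scripted_decide, fake_tools)
```

The `max_steps` cap is the "hits a limit" half of the definition: without it, a model that never returns a finished action would loop forever.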
Two extensions matter in 2026. They are also where the further reading branches: MCP is covered in the MCP beginner guide, and sandboxed execution in the Codex cloud security playbook.
MCP (Model Context Protocol). An open standard for connecting agents to external data and tools. Postgres, GitHub, Linear, Figma, Playwright, whatever you have. Claude Code treats MCP as a first-class primitive: "With MCP, Claude Code can read your design docs in Google Drive, update tickets in Jira, pull data from Slack, or use your own custom tooling" (docs). Cursor's cloud agents support MCP servers for "access to external tools and data sources like databases, APIs, and third-party services" (docs). The complete MCP server guide goes deeper on how that protocol actually works.
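On the wire, MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below only builds a request and reads a text result; it does not open a connection or perform the initialization handshake. The helper names and the `query_db` tool are hypothetical, chosen for illustration.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


def extract_text(response_json: str) -> str:
    """Join the text parts of a tool result, or raise on a JSON-RPC error."""
    response = json.loads(response_json)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    parts = response["result"].get("content", [])
    return "\n".join(p["text"] for p in parts if p.get("type") == "text")


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(1, "query_db", {"sql": "select 1"})

# A hand-written response in the shape a server might return.
sample_response = json.dumps(
    {"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "1"}]}}
)
answer = extract_text(sample_response)
```

In practice an agent's MCP client handles this framing for you; the point is that every external tool, whatever it wraps, looks the same to the loop: a name, a JSON argument object, and a result.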
Sandboxed execution. Running an agent on your laptop is fine for solo work. Running it for a team means isolating every task in a fresh environment so a stuck loop cannot rm -rf your home directory. GitHub Copilot's cloud agent runs "in its own ephemeral development environment, powered by GitHub Actions" (docs). Claude Managed Agents ship this as infrastructure you rent by the session-hour (claude.com).
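To make the idea concrete, here is a rough local approximation: each command runs in a throwaway working directory with a hard timeout. This is an assumption-laden sketch, not a real sandbox. It provides no filesystem, network, or privilege isolation, so the command can still reach anything your user account can. `run_isolated` is a hypothetical helper name.

```python
import subprocess
import sys
import tempfile
from typing import List, Tuple


def run_isolated(cmd: List[str], timeout_s: float = 5.0) -> Tuple[int, str]:
    """Run cmd in a fresh temp directory; kill it if it exceeds timeout_s.

    NOT a security boundary: only the working directory is disposable.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                cmd,
                cwd=workdir,          # scratch directory, deleted afterwards
                capture_output=True,
                text=True,
                timeout=timeout_s,    # stops a stuck command
            )
        except subprocess.TimeoutExpired:
            return -1, "timed out"
        return proc.returncode, proc.stdout.strip()


code, out = run_isolated([sys.executable, "-c", "print('ok')"])
```

The hosted products quoted above go much further by giving each task its own ephemeral environment, which is what actually contains a destructive command.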
Pull this from the docs, not from hype.
Claude Code can "plan the approach, write the code across multiple files, and verify it works" and for bugs "traces the issue through your codebase, identifies the root cause, and implements a fix" (docs). It stages changes, writes commit messages, creates branches, and opens pull requests. It runs on a schedule via Routines on Anthropic infrastructure so "they keep running even when your computer is off" (docs).
GitHub Copilot's cloud agent "excels at low-to-medium complexity tasks in well-tested codebases, from adding features and fixing bugs to extending tests, refactoring code, and improving documentation" (docs). You assign a GitHub issue to it, and it opens a PR on a branch.
Cursor's cloud agents can "build, test, and interact with the changed software" and use desktop and browser control for UI-verifying changes (docs).
Aider describes itself as a tool that "lets you pair program with LLMs to start a new project or build on your existing codebase" with automatic git integration and a map of your whole codebase for context (aider.chat).
Factory Droids handle "complete tasks like refactors, incident response, and migrations" across IDE, terminal, desktop, web, CLI, and chat (factory.ai).
Replit Agent 4 runs "independent tasks simultaneously" and covers "auth, database, back-end functionality and front-end design all at once" (replit.com/agent).
That is the honest list. Writing code across files, fixing bugs, extending tests, handling migrations, running on a schedule, opening pull requests, deploying apps.
Every claim below points to the vendor's own page. Pricing and capability both.
Claude Code. The multi-surface agent platform. Runs in the terminal, VS Code, JetBrains, the Desktop app, Web, iOS, Slack, and Chrome (docs). Install with curl -fsSL https://claude.ai/install.sh | bash on macOS or Linux. Extensible through Skills, Subagents, Hooks, and MCP servers. Runs scheduled Routines on Anthropic infrastructure.
Pricing: included with Claude subscriptions (see claude.com/pricing).
Cursor Agent. Runs inside the Cursor editor. Cloud Agents run in isolated cloud environments, can control desktop and browser, and can be launched from Web, Desktop, Slack, GitHub, Linear, and API (docs). Cloud agents bill at "API pricing for the selected model" with user-defined spend limits (docs).
OpenAI Codex CLI. Open-source coding agent installed with npm install -g @openai/codex or brew install --cask codex. Integrates with ChatGPT plans (Plus, Pro, Business, Edu, Enterprise) when you sign in with your ChatGPT account (github.com/openai/codex). Built in Rust. Ships a GitHub Action at openai/codex-action for CI.
GitHub Copilot cloud agent. Assign a GitHub issue to @copilot, get a PR on a branch. Runs in an ephemeral GitHub Actions environment (docs).
Plans per GitHub's pricing page: Free, Pro, Pro+, Business, and Enterprise.
Devin. The original "autonomous software engineer" framing. Plans for 2026 per devin.ai/pricing: Pro, Max, Teams, and Enterprise.
Factory Droids. Multi-surface coding agents with five interfaces. Pricing is per-seat and token-based; see factory.ai/pricing for current plans.
Aider. Open-source terminal pair programmer. Works with Claude, DeepSeek, OpenAI, and "almost any LLM, including local models" (aider.chat). The project reports 42K GitHub stars and says 88% of its recent code was written by Aider itself.
Pricing: free. You pay the model provider.
Replit Agent. Cloud-first app builder. Pricing combines credits and seats; see replit.com/pricing for current plans.
Claude Managed Agents. Infrastructure layer for running agents as a service. Launched April 8, 2026. Includes "sandboxed code execution, checkpointing, credential management, scoped permissions, and end-to-end tracing" (claude.com/blog/claude-managed-agents). Costs "$0.08 per session hour in addition to the standard API Claude token prices." Public beta on the Claude Platform with Notion, Rakuten, and Asana as early adopters.
| Agent | Best for | Where it runs | Billing model |
|---|---|---|---|
| Claude Code | Multi-surface agent work, skills, plugins | Terminal, IDE, Desktop, Web, Cloud | Claude subscription |
| Cursor Agent + Cloud | IDE-native and parallel cloud tasks | Editor + isolated cloud VMs | API pricing per model |
| Copilot Cloud Agent | Issue-to-PR in GitHub-native teams | GitHub Actions ephemeral env | Per-seat + premium requests |
| Devin | Long-horizon autonomous coding | Devin cloud | Per-seat, pay-as-you-go overages |
| Factory Droids | Multi-surface with token transparency | IDE, CLI, Slack, Web, Desktop | Per-seat, token-based |
| Replit Agent | App building, zero to deployed | Replit cloud | Credits, seat-based |
| Aider | Local terminal pair programming, BYOM | Your terminal | Free + model costs |
| OpenAI Codex CLI | Terminal agent, ChatGPT-integrated | Your terminal | ChatGPT plan |
| Claude Managed Agents | Running agents as a service | Anthropic infra | Per session-hour + tokens |
Docs will not tell you this. Experience will.
Visual and spatial changes. "Move the button, resize the card, align the grid" still trips most agents. They can read the CSS and guess, but they cannot see the result without computer-use tooling, and even then their eyes are bad.
Cost predictability. A stuck agent burns tokens in a loop. Managed session-hour pricing (Anthropic's $0.08/hr + tokens, per claude.com) helps but does not cap the token side. GitHub sells premium requests in buckets of 300 (Pro) or 1,500 (Pro+) with overflow billed by usage (plans).
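The arithmetic is worth writing down, because the session-hour fee is usually the small part. In this sketch the $0.08 rate comes from the pricing quoted above; the per-million-token prices are placeholder parameters you must replace with your model's real rates, and `TokenBudget` is a hypothetical guard, not a feature of any platform.

```python
SESSION_HOUR_USD = 0.08  # session-hour rate quoted in this guide


def estimate_cost(
    hours: float,
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,   # placeholder: look up your model's actual price
    usd_per_m_output: float,  # placeholder: look up your model's actual price
) -> float:
    """Session-hour fee plus token charges, in USD."""
    runtime = hours * SESSION_HOUR_USD
    tokens = (
        input_tokens / 1_000_000 * usd_per_m_input
        + output_tokens / 1_000_000 * usd_per_m_output
    )
    return round(runtime + tokens, 4)


class TokenBudget:
    """Client-side cap: refuse further work once a token limit is spent."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens: int) -> bool:
        """Record usage; return False if this charge would exceed the budget."""
        if self.used + tokens > self.max_tokens:
            return False
        self.used += tokens
        return True


# Two session-hours, 1M input and 200K output tokens, at made-up prices.
total = estimate_cost(2, 1_000_000, 200_000, usd_per_m_input=1.0, usd_per_m_output=5.0)

budget = TokenBudget(max_tokens=1_000)
first_ok = budget.charge(600)
second_ok = budget.charge(600)  # would exceed the cap, so it is refused
```

With those made-up prices the runtime fee is $0.16 of the total and tokens are the rest, which is why a cap on tokens matters more than a cap on hours when a loop gets stuck.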
Large refactors that span shared state. An agent can rename a function in 40 files. It struggles when the rename implies a data-model change that ripples through types, migrations, and tests unevenly.
Model and tool picking. Most platforms expose five or six models. Picking the right one for the task is on you.
Review quality. GitHub's own docs call out that the cloud agent "excels at low-to-medium complexity tasks in well-tested codebases" (docs). Read that as: give it narrow, testable work. Architecture still belongs to a human.
Pick one and try a real task. Do not read another comparison article.
If you want the fastest local install: Claude Code.
curl -fsSL https://claude.ai/install.sh | bash
cd your-project
claude
Then ask it to write tests for one module and run them. Full install flow at code.claude.com.
If you want an open-source option with BYO model: Aider.
Install from aider.chat and run it in a repo with your Anthropic or OpenAI key. Works with local models too.
If you want the ChatGPT-integrated terminal agent: OpenAI Codex CLI.
npm install -g @openai/codex
codex
Sign in with your ChatGPT account per the repo README.
If you are already on GitHub and want issue-to-PR: assign a GitHub issue to Copilot. Requires Pro, Pro+, Business, or Enterprise (docs).
If you want a cloud VM to run long tasks: Cursor Cloud Agents or Devin. Cursor's are launched from Web, Desktop, Slack, GitHub, Linear, or API (docs).
Run one task. Watch what happens. Adjust.
An AI coding agent is a language model with tools and a loop. In 2026 it will write tests, open pull requests, trace bugs, and deploy web apps. It will not replace the engineer reading the diff. Pick the surface that matches where you already work. Start with narrow tasks. Measure what it costs you on real work, not on demo videos.
The frontier is moving faster than any blog post. Check the docs linked in the sources table above before quoting any number from this one, and use the AI coding tools pricing guide when the question shifts from capability to budget.
GitHub Copilot's autocomplete suggests code inline as you type - single-line or small-block completions. An AI coding agent like Claude Code, Cursor Agent, or GitHub Copilot's own cloud agent takes a task description, then runs a multi-step loop: reading files, editing code, running tests, checking results, and iterating until the task is done. The agent uses tools (shell, file system, browser) while autocomplete only suggests text. GitHub now offers both: Copilot completions for inline suggestions, and Copilot cloud agent for issue-to-PR workflows.
Agents will not replace developers. They excel at well-scoped, testable tasks like writing tests, fixing lint errors, handling migrations, and updating dependencies. GitHub's own documentation describes its cloud agent as handling "low-to-medium complexity tasks in well-tested codebases." Architecture decisions, debugging novel issues, understanding business context, and reviewing agent output still require human judgment. Agents multiply developer output on routine work but do not replace the developer reading the diff.
Pricing varies by vendor. Claude Code is included with Claude subscriptions. Cursor Agent uses API pricing per model with user-defined spend limits. GitHub Copilot Pro costs $10/month with cloud agent access. Devin Pro is $20/month with pay-as-you-go overages. Aider is free - you pay the model provider directly. Claude Managed Agents charge $0.08 per session-hour plus standard API token costs. The AI coding tools pricing guide has current numbers for all major tools.
Agents can work with an existing codebase. Terminal agents like Claude Code, Codex CLI, and Aider run directly in your repo. IDE agents like Cursor Agent and the Claude Code VS Code extension work with whatever project you have open. Cloud agents clone your repo into a sandboxed environment. All modern agents support git, can read your entire codebase for context, and write changes across multiple files. MCP (Model Context Protocol) extends agent reach to external tools like databases, issue trackers, and design files.
It depends on where you work. Claude Code covers the most surfaces - terminal, VS Code, JetBrains, Desktop, Web, iOS, Slack. Cursor Agent is best if you already use Cursor as your editor. GitHub Copilot cloud agent fits teams that want issue-to-PR automation within GitHub. Aider is the best open-source option with bring-your-own-model support. Devin and Factory Droids handle longer-horizon autonomous work. The AI coding tools comparison matrix breaks down capabilities by use case.
An agent is three components: a model (the brain), tools (the hands), and a loop. The model - usually a frontier model like Claude Opus 4.7 or GPT-5.4 - receives a task. It picks a tool (read file, edit code, run shell command), executes it, reads the result, decides what to do next, and repeats until the task is done or a limit is reached. MCP (Model Context Protocol) lets agents connect to external data sources and services. Sandboxed execution isolates agent work so a stuck loop cannot damage your system.
Agents can run without supervision, within limits. Cloud agents from Cursor, GitHub, and Devin run in isolated environments while you close the tab. Claude Code Routines run on a schedule on Anthropic infrastructure. But "without supervision" does not mean "without review." Agents still produce pull requests that need human review. The safer pattern is narrow, testable tasks with clear success criteria, not open-ended autonomous coding. GitHub's cloud agent documentation explicitly recommends well-tested codebases with good CI coverage.
Visual and spatial UI changes - moving buttons, aligning grids, resizing cards - still trip most agents because they cannot see the rendered result. Large refactors that span shared state are risky when changes ripple unevenly through types, migrations, and tests. Cost predictability is weak when a stuck agent burns tokens in a loop. Architecture decisions and novel debugging require human judgment. Use agents for narrow, testable work and review everything they produce.