TL;DR
An ops guide to managing a fleet of Claude agents: spawning patterns, worktree isolation, build gates, orphaned-agent failure modes, and OpenTelemetry monitoring.
Last updated: June 11, 2026
One Claude Code session is a tool. Ten concurrent sessions are infrastructure, and infrastructure needs operations: spawning discipline, isolation, gates, monitoring, and a plan for the ways it falls over. Addy Osmani put the shift bluntly in The Code Agent Orchestra: "You used to pair with one AI. Now you manage an agent team." This guide covers the operational layer - what breaks when you run many agents at once, and the patterns that keep a fleet productive.
Everything here is grounded in the official Claude Code docs as of June 2026, plus practitioner cost data. For the conceptual comparison of the primitives, start with subagents vs agent teams vs workflows and come back for the ops.
Claude Code now ships four parallelism primitives, and the cleanest way to tell them apart is asking who holds the plan. The workflows documentation draws most of these lines; the background-session column comes from the agent view docs:
| | Subagents | Agent teams | Workflows | Background sessions |
|---|---|---|---|---|
| Who decides what runs next | Claude, turn by turn | The lead agent | The script | You, per dispatch |
| Intermediate results live in | Claude's context | A shared task list | Script variables | Each session's own context |
| Scale | A few per turn | A handful of peers | Up to 16 concurrent, 1,000 per run | As many as your quota survives |
| Survives terminal close | No | No | No (resumable in-session only) | Yes, via supervisor process |
That last row is the one fleet operators get burned by, so let's start there.
Subagents are synchronous children. They live inside the parent session's turn, do their work in an isolated context window, and report a summary back. When the parent dies, they die. The same is true of workflow runs: they are resumable within the same session, but the docs are explicit that if you exit Claude Code mid-run, the next session starts the workflow fresh.
Background sessions are the opposite. They run under a separate supervisor process, so they survive closing the terminal, machine sleep, and even Claude Code auto-updates. Dispatch them with claude --bg "prompt", move a live session to the background with /bg, or dispatch from the claude agents view itself.
The operational rule: never build an orchestrator that backgrounds its own children and then exits. Agents tied to a parent's lifetime become orphans the moment the parent finishes. Either run children synchronously and wait, or dispatch independent background sessions through the supervisor and let it own their lifecycle.
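The "run children synchronously and wait" half of that rule can be sketched in a few lines of Python. This is a minimal illustration, not part of Claude Code: `run_children_and_wait` is a hypothetical helper, and the commands you pass it are up to you (for example `["claude", "-p", "<prompt>"]`).

```python
import subprocess
import sys


def run_children_and_wait(commands, timeout=None):
    """Start every child, then block until all of them have exited.

    Returns the exit codes in the same order as `commands`.
    """
    children = [subprocess.Popen(cmd) for cmd in commands]
    try:
        return [child.wait(timeout=timeout) for child in children]
    finally:
        # If we leave early (timeout, Ctrl-C), reap whatever is still
        # running so nothing outlives the orchestrator as an orphan.
        for child in children:
            if child.poll() is None:
                child.kill()
                child.wait()
```

Calling `run_children_and_wait([[sys.executable, "-c", "print('child done')"]] * 3)` returns one exit code per child, and only after all three have finished. The point of the `finally` block is that the orchestrator never returns while a child it started is still alive.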
Agent teams have their own orphan variant. The agent teams docs note that /resume does not restore in-process teammates - after resuming, the lead may try to message teammates that no longer exist. Split-pane mode can also leave orphaned tmux sessions behind (tmux ls, then tmux kill-session). And cleanup must always run from the lead; teammates running cleanup can leave shared resources in an inconsistent state.
A worked example: say you want to audit 30 API route files for missing auth checks overnight.
Pattern 1, subagent fan-out. Ask the main session to spawn a few audit subagents per batch. Cheap and simple, but everything funnels through one context window, and the whole run dies with the session. Fine for 5 files. Wrong for 30.
Pattern 2, a workflow. Include ultracode in the prompt and Claude writes a JavaScript orchestration script that a runtime executes in the background. The runtime enforces hard caps - 16 concurrent agents, 1,000 agents total per run - and intermediate results stay in script variables instead of polluting context. Every run's script lands under ~/.claude/projects/, and good ones can be saved to .claude/workflows/ as reusable slash commands. This is the right shape for the 30-file audit: bounded, repeatable, observable in /workflows.
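To make the shape of a bounded run concrete, here is a Python analogue of the two ideas above: a hard concurrency cap and results that live in plain variables. It is not the workflow runtime's API (those scripts are JavaScript and are not reproduced in this guide); `fan_out` and `worker` are hypothetical names, and only the two cap values come from the text above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 16   # concurrency cap quoted above
MAX_TOTAL = 1000      # per-run agent cap quoted above


def fan_out(items, worker, max_concurrent=MAX_CONCURRENT, max_total=MAX_TOTAL):
    """Run `worker` over `items` with a concurrency cap; return {item: result}."""
    items = list(items)
    if len(items) > max_total:
        raise ValueError(f"{len(items)} tasks exceeds the per-run cap of {max_total}")
    # The pool never runs more than `max_concurrent` workers at once.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        results = list(pool.map(worker, items))
    # Intermediate results stay in an ordinary variable, not in a chat context.
    return dict(zip(items, results))
```

For the 30-file audit, `items` would be the route file paths and `worker` would be whatever function dispatches one audit and returns its findings.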
Pattern 3, an agent team. Reserve teams for work that needs peer discussion - competing debugging hypotheses, multi-lens code review. Teams are experimental, enabled with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1, and coordinate through a shared task list where claiming uses file locking to prevent two teammates grabbing the same task. The official sizing guidance: 3-5 teammates, 5-6 tasks per teammate, because "three focused teammates often outperform five scattered ones."
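The docs say claiming uses file locking but this guide does not cover the exact mechanism, so the following is only a sketch of one common way to get that guarantee: exclusive file creation, which either succeeds for exactly one caller or fails. `try_claim` and the `.lock` file layout are hypothetical, not Claude Code internals.

```python
import os
import tempfile


def try_claim(task_id, owner, lock_dir):
    """Return True if `owner` claimed the task, False if it was already taken."""
    path = os.path.join(lock_dir, f"{task_id}.lock")
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so two teammates cannot both win the same task.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(owner)  # record who holds the task
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(try_claim("task-1", "teammate-a", d))  # first claim wins
        print(try_claim("task-1", "teammate-b", d))  # second claim is refused
```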
Pattern 4, background session dispatch. For independent tasks across repos - a bug fix here, a PR review there - dispatch each as its own background session and monitor from claude agents. The docs explicitly warn that each session draws subscription quota independently, so check your limits before dispatching many at once.
The classic fleet failure: two agents share one checkout, one switches branches mid-task, and both produce garbage. The fix is git worktrees, and Claude Code now applies it automatically in two places.
First, background sessions move themselves into a worktree under .claude/worktrees/ before editing files, so parallel sessions share one repository but each writes to its own working directory. You can opt out per repo with worktree.bgIsolation: "none" if you want direct edits. Second, subagent definitions accept isolation: worktree in frontmatter, which runs the subagent in a temporary worktree branched from your default branch - not the parent's HEAD - and auto-cleans it if the subagent makes no changes.
One sharp edge from the agent view docs: deleting a session from the view also deletes a Claude-created worktree, including any uncommitted changes in it. Commit or push before you prune the fleet. For the broader workflow, see the git worktrees parallel agents guide.
Isolation solves write collisions but creates a new problem: N agents finish with N branches, and something has to merge them. Two gates make consolidation survivable.
Gate one: serialize expensive shared resources. Worktrees isolate file writes, not CPU. Ten agents each running a full framework build on one machine will contend for cores and disk until everything slows to a crawl or flakes. A simple flock-style lock script that agents call before building - so only one build runs at a time while everything else proceeds in parallel - removes a whole class of phantom failures. Boring, effective.
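A flock-style gate of this kind might look like the sketch below. It assumes a Unix-like system (Python's `fcntl` module is not available on Windows) and that every agent is told to build through this wrapper; `gated_build` and the lock file location are hypothetical choices, not something Claude Code provides.

```python
import fcntl
import os
import subprocess
import sys
import tempfile
from contextlib import contextmanager

# Hypothetical shared location; any path visible to all agents works.
DEFAULT_LOCK = os.path.join(tempfile.gettempdir(), "fleet-build.lock")


@contextmanager
def build_lock(lock_path=DEFAULT_LOCK):
    """Hold an exclusive advisory lock for the duration of the with-block."""
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def gated_build(build_cmd, lock_path=DEFAULT_LOCK):
    """Run `build_cmd` while holding the lock; return its exit code."""
    with build_lock(lock_path):
        return subprocess.run(build_cmd).returncode
```

Only the build waits in line. Editing, searching, and everything else an agent does outside `gated_build` still runs in parallel, and the kernel releases the lock if a process holding it dies.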
Gate two: machine-enforced quality checks. Agent teams expose three hooks for this - TeammateIdle, TaskCreated, and TaskCompleted - and exiting with code 2 blocks the action and sends feedback to the agent. That means "tests must pass before a task is marked complete" can be a hard gate rather than a polite request. The same hooks philosophy applies across the fleet; Claude Code hooks explained covers the mechanics.
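As a rough sketch, a "tests must pass" gate reduces to: run the test command, and turn any failure into exit code 2. The `gate` function is a hypothetical helper, and the test command is whatever your project uses. The sketch also assumes the hook's stderr is what reaches the agent as feedback, so verify that against the hooks reference before relying on it.

```python
import subprocess
import sys

BLOCK = 2  # exit code that blocks the action, as described above


def gate(test_cmd):
    """Run the test command; return 0 to allow the action or 2 to block it."""
    result = subprocess.run(test_cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return 0
    # Assumption: stderr is relayed to the agent, so put the evidence there.
    tail = (result.stdout + result.stderr)[-2000:]
    print(
        f"Tests failed (exit {result.returncode}); fix them before completing:\n{tail}",
        file=sys.stderr,
    )
    return BLOCK
```

A hook script built on this would end with `sys.exit(gate([...]))`, passing your own test command. Trimming the output to the last 2,000 characters is an arbitrary choice that keeps the feedback short enough to be useful.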
Osmani's data point on context files shapes consolidation quality upstream: per the ETH Zurich research he cites (Gloaguen et al.), developer-written AGENTS.md files improved agent success by roughly 4 percent, while LLM-generated ones reduced success by about 3 percent and increased inference costs over 20 percent. The spec a fleet runs on is the highest-leverage artifact you own; generating it with the same model that consumes it is a trap. His larger point stands too: "The bottleneck is no longer generation. It's verification." Plan-approval gates for teammates, hooks at task boundaries, and a human merge review are the minimum viable verification stack - and the merge step deserves its own discipline, covered in parallel coding agents merge discipline.
Three layers, in increasing order of effort:
The built-in dashboards. claude agents groups every background session into Needs input, Working, and Completed. /workflows shows per-phase agent counts, token totals, and elapsed time, with keys to pause (p), stop (x), and restart (r) individual agents mid-run.
OpenTelemetry metrics. Set CLAUDE_CODE_ENABLE_TELEMETRY=1 and Claude Code exports claude_code.cost.usage (in USD), claude_code.token.usage, session counts, and commit counts to any OTLP backend, per the monitoring docs. Token usage can be broken down by agent.name for per-agent-type cost attribution - with one caveat: built-in and official-marketplace agent names appear verbatim, but user-defined agent names are reported as custom.
Distributed traces (beta). Add CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1 and an OTEL_TRACES_EXPORTER, and every prompt becomes a claude_code.interaction root span whose children carry agent_id and parent_agent_id attributes - you can see exactly which subagent or teammate issued every request. Better still for orchestrators: claude -p and Agent SDK sessions read TRACEPARENT from their environment, so a script that dispatches a fleet can parent every agent's spans under one trace. Interactive sessions deliberately ignore inbound trace context.
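For the orchestrator case, the environment handed to each dispatched agent could be assembled roughly like this. The `traceparent` layout follows the W3C Trace Context format (version, 32-hex trace ID, 16-hex parent span ID, flags). The helper names are hypothetical, and `otlp` is the standard OpenTelemetry exporter value rather than something this guide's sources specify.

```python
import os
import re
import secrets

TRACEPARENT_RE = re.compile(r"00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}")


def new_traceparent():
    """Build a W3C traceparent value: version-traceid-parentid-flags."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"  # 01 = sampled


def is_valid_traceparent(value):
    return TRACEPARENT_RE.fullmatch(value) is not None


def agent_env(traceparent, base=None):
    """Copy the environment and add the telemetry settings named above."""
    env = dict(os.environ if base is None else base)
    env.update({
        "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
        "CLAUDE_CODE_ENHANCED_TELEMETRY_BETA": "1",
        "OTEL_TRACES_EXPORTER": "otlp",  # assumption: standard OTLP exporter
        "TRACEPARENT": traceparent,
    })
    return env
```

Generate one `traceparent` per fleet run and pass the same `agent_env(...)` to every child process, so that all agents share a trace ID. Unless the orchestrator also exports the root span it names, the backend will group the agents by trace but show that parent span as missing.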
There is no separate billing for agents. Every subagent, teammate, workflow agent, and background session draws from the same plan quota, so ten parallel agents burn it ten times faster. CloudZero's May 2026 analysis estimates roughly $13/day for a solo dev's normal single-session workflow, $30-40/day with three parallel agents, and $50-130/day at five to ten - vendor estimates, not Anthropic pricing, but directionally consistent with the official guidance that team token costs scale linearly with teammate count.
The biggest lever is model tiering: CloudZero found a tiered fleet - strong model for orchestration, cheaper models for workers - runs about 40 percent cheaper than defaulting everything to the top tier, and you can enforce the tiers in subagent YAML with the model field so nobody forgets. For the full breakdown, see what parallel Claude agents actually cost.
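The arithmetic behind those figures is simple enough to write down. The sketch below uses only the CloudZero numbers quoted above and treats strict linear scaling as a planning ceiling. It is illustrative budgeting math, not a pricing model, and the function names are made up for the example.

```python
SINGLE_SESSION_USD_PER_DAY = 13.0  # CloudZero estimate quoted above
TIERED_SAVINGS = 0.40              # "about 40 percent cheaper", quoted above


def linear_daily_cost(agents, per_agent=SINGLE_SESSION_USD_PER_DAY):
    """Ceiling estimate: every agent burns like a full solo session."""
    return round(agents * per_agent, 2)


def tiered_daily_cost(agents, savings=TIERED_SAVINGS):
    """Same ceiling with the quoted tiering discount applied."""
    return round(linear_daily_cost(agents) * (1 - savings), 2)
```

Three agents come out at $39 and ten at $130, which sit at the top of the quoted $30-40 and $50-130 ranges, so strict linear scaling works as a conservative budget line. Applying the 40 percent tiering figure to the ten-agent ceiling gives $78, on the assumption that the baseline fleet was running everything on the top tier.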
| Failure | Cause | Fix |
|---|---|---|
| Orphaned agents | Parent exits while children depend on it | Sync children, or supervisor-hosted background sessions |
| Ghost teammates | /resume does not restore in-process teammates | Tell the lead to spawn fresh teammates |
| Branch collisions | Agents sharing one checkout | Worktree isolation (automatic for background sessions) |
| Lost work on cleanup | Deleting a session deletes its Claude-created worktree | Commit or push before pruning |
| Build contention | N agents building simultaneously | Lock file gate around expensive builds |
| Silent quality decay | No verification between generation and merge | Hooks with exit code 2, plan approval, human merge review |
| Runaway spend | Linear quota burn nobody is watching | OTel cost metrics, /workflows token view, small pilot runs first |
Run fewer agents at once than you think you need. The official agent teams guidance is 3-5 teammates with 5-6 tasks each, and workflows cap at 16 concurrent agents regardless. Scale comes from bounded, repeatable runs - not from maximizing live agent count.
Closing the terminal does not stop background sessions dispatched via claude --bg, /bg, or agent view: a separate supervisor process hosts them, and they also persist through machine sleep and auto-updates. Subagents and in-flight workflow runs do not survive their session ending.
Background sessions automatically move into git worktrees under .claude/worktrees/ before editing, and subagents can opt in with isolation: worktree. Each agent writes to its own branch in its own directory; consolidation happens at merge time.
No separate fee, but every agent draws from the same quota, so cost scales linearly with the fleet. CloudZero estimates $50-130/day at five to ten parallel agents. Tier your models per agent and watch /usage and the OTel cost metric.
Subagents can nest: as of Claude Code v2.1.172 they can spawn their own subagents up to 5 levels deep, per the official changelog (the subagents docs page still describes the older no-nesting behavior, so expect the docs to catch up). Use nesting sparingly - every level adds cost and removes visibility.