
## TL;DR
From single-agent baselines to multi-level hierarchies, these are the seven patterns for wiring AI agents together in production. Each with a decision rule, an implementation sketch, and the tradeoffs that actually matter.
One of the quieter frustrations of 2026 AI development is that everyone is building multi-agent systems and no two people are using the same words for the shapes. Someone says "I spawned a swarm." Another says "I ran a supervisor." A third says "I pipelined it." The shapes are real. The vocabulary is not yet shared.
This post catalogs the seven patterns that keep showing up when developers wire AI agents together in production. It is not an exhaustive taxonomy. It is the minimum shared vocabulary that makes architecture conversations productive. Pick the pattern that matches your problem, understand the tradeoffs, combine them when needed.
## Pattern 1: Single Agent

The baseline. One model, one system prompt, one task. No orchestration.
When it works: the task is self-contained, the context fits in a single window, no specialized tools or hand-offs are needed. Write a function, summarize a doc, answer a question.
When it breaks: the task requires multiple areas of expertise, the output needs verification from a different perspective, or the work exceeds a single context window.
Implementation:
```
claude -p "Refactor this function to use async/await"
```
That is the entire pattern. The moment you reach for a second agent, you have left this pattern.
The mistake here: jumping to orchestration too early. Most tasks do not need a supervisor. Most tasks do not need a swarm. Start with one agent, prove you need more, then reach for a heavier pattern.
## Pattern 2: Supervisor

One coordinator agent receives the task, decomposes it, routes subtasks to specialists, and synthesizes results.
```
User → Supervisor
            │
     ┌──────┼──────┐
  Agent A Agent B Agent C
(research) (code) (review)
            │
Supervisor → Response
```
When to use: two to five distinct subtasks requiring different expertise, you want a single point of control and quality gating, subtasks can run in parallel but results need synthesis.
Implementation in Claude Code: the supervisor uses the Task tool to spawn subagents. Each subagent gets a focused system prompt and a scoped set of tools. The supervisor collects results and decides what to do next.
Real example: the DevDigest /devdigest:research skill spawns parallel research agents, one per source, then a supervisor synthesizes findings into a structured brief.
Tradeoffs: clean separation of concerns and easy to debug, but the supervisor is a bottleneck. If it misroutes, everything fails. Quality of the final output is roughly proportional to the quality of the supervisor's task decomposition.
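Inside Claude Code the supervisor is the Task tool; outside it, the same shape can be sketched in plain Python. Here `run_agent` is a hypothetical stub standing in for whatever backend you use (for example `claude -p` via `subprocess`), and the fixed three-way decomposition is illustrative only — a real supervisor would produce the split itself:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(role: str, prompt: str) -> str:
    # Stand-in for a real call, e.g.:
    #   subprocess.run(["claude", "-p", prompt], capture_output=True, text=True)
    # Stubbed so the sketch runs without any CLI installed.
    return f"[{role}] {prompt}"

def supervise(task: str) -> str:
    # 1. Decompose: in a real system the supervisor model produces this split.
    subtasks = {
        "research": f"Find prior art and constraints for: {task}",
        "code": f"Draft an implementation for: {task}",
        "review": f"List risks and edge cases for: {task}",
    }
    # 2. Route: each specialist runs in parallel with a focused prompt.
    with ThreadPoolExecutor() as pool:
        futures = {role: pool.submit(run_agent, role, p) for role, p in subtasks.items()}
        results = {role: f.result() for role, f in futures.items()}
    # 3. Synthesize: the supervisor merges specialist output into one answer.
    combined = "\n".join(results[r] for r in ("research", "code", "review"))
    return run_agent("supervisor", f"Synthesize these results:\n{combined}")

answer = supervise("Add rate limiting to the API")
```

The decompose/route/synthesize split is the whole pattern; everything else is plumbing.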
## Pattern 3: Pipeline

Sequential processing where each stage's output becomes the next stage's input. Unix pipes for agents.
```
Input → Stage 1 → Stage 2 → Stage 3 → Output
        (scrape)  (analyze)  (format)
```
When to use: work that is naturally sequential (research, then write, then review), each stage needing different tools or models, a desire for checkpoints between stages.
Implementation: each stage is a separate claude -p call. Intermediate results written to files on disk. The next stage reads from disk, processes, writes its output. A shell script or cron orchestrates.
Real example: a video production pipeline. Firecrawl scrapes sources to /tmp/research/. A research agent reads those files, produces research/summary.md. A script agent reads the summary, produces script.md. A production agent reads the script, produces a YouTube description and thumbnail brief.
Tradeoffs: simple, debuggable, resumable from any stage, but no parallelism. If your stages are actually independent, pipeline is slower than it needs to be.
## Pattern 4: Swarm

Fan out N independent tasks to parallel workers. All run simultaneously. Results collected and merged.
```
       Coordinator
      /     │     \
  Worker  Worker  Worker
(task 1) (task 2) (task 3)
      \     │     /
        Collector
```
When to use: you have N independent, similar tasks. Audit N repos. Research N topics. Process N files. Each task is self-contained and does not depend on others. Speed matters.
Implementation in Claude Code: use the Task tool to spawn multiple agents in a single message. Each gets the same instructions with different input data. Results are collected when all complete.
Real example: an email triage skill that spawns ten parallel agents, one per Gmail label, to analyze the inbox concurrently. Ten minutes of serial work becomes one minute of parallel work.
Tradeoffs: linear speedup with worker count and trivially parallelizable, but no inter-worker communication. If tasks actually depend on each other, swarm produces inconsistent output.
The practical rule: if you cannot write down each worker's instructions before spawning any of them, swarm is not the pattern.
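The fan-out/collect shape is a thread pool over identical instructions. A minimal sketch, with `run_agent` again a hypothetical stub for the real backend call and the Gmail-label inputs taken from the triage example above:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(prompt: str) -> str:
    # Stand-in for: subprocess.run(["claude", "-p", prompt], ...)
    return f"triaged: {prompt.rsplit(':', 1)[-1].strip()}"

def swarm(instructions: str, inputs: list[str], max_workers: int = 10) -> dict[str, str]:
    """Fan out one worker per input; every worker gets the same instructions."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(run_agent, [f"{instructions} Label: {i}" for i in inputs])
    # Collect: merge results back, keyed by input, once all workers finish.
    return dict(zip(inputs, results))

labels = ["inbox", "newsletters", "receipts"]
report = swarm("Analyze this Gmail label and summarize what needs action.", labels)
```

Note that the workers never see each other's output, which is exactly why the pattern only fits independent tasks.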
## Pattern 5: Debate

Two or more agents take opposing positions on a question. A judge agent evaluates and picks a winner, or synthesizes a balanced answer.
```
           ┌→ Agent A (pro) ─┐
Question ──┤                 ├→ Judge → Answer
           └→ Agent B (con) ─┘
```
When to use: decision-making with real tradeoffs. Technology choice, architecture decision, scope cut. You want to surface arguments you might not have considered.
Implementation: spawn two agents with opposing system prompts ("argue for X", "argue against X"). Feed both responses to a judge agent. The judge synthesizes or picks a winner with reasoning.
Real example: evaluating "should we use Convex or Neon for this feature?" One agent argues real-time. The other argues relational. The judge decides based on the specific requirements the feature actually has.
Tradeoffs: surfaces blind spots and produces higher-quality decisions, but costs three times the tokens of a single-agent answer. Overkill for straightforward tasks. Worth the cost for decisions you will live with for a year.
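The two-advocates-plus-judge wiring fits in one function. `run_agent` is a hypothetical stub whose first argument plays the role of a system prompt; the Convex-vs-Neon question is the example from above:

```python
def run_agent(system: str, prompt: str) -> str:
    # Stand-in for a real model call with `system` as the system prompt.
    return f"({system}) {prompt}"

def debate(question: str) -> str:
    # Opposing system prompts force each agent into one side of the argument.
    pro = run_agent(f"Argue FOR: {question}", "Make the strongest case, with evidence.")
    con = run_agent(f"Argue AGAINST: {question}", "Make the strongest case, with evidence.")
    # The judge sees both positions and must pick or synthesize, with reasoning.
    return run_agent(
        "You are an impartial judge. Decide and explain your reasoning.",
        f"Question: {question}\n\nPro:\n{pro}\n\nCon:\n{con}",
    )

verdict = debate("Should we use Convex or Neon for this feature?")
```

The three-calls cost is visible right in the structure: two advocates plus one judge per decision.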
## Pattern 6: Hierarchical

Multi-level delegation tree. A director sets strategy, managers decompose into tasks, workers execute.
```
          Director
         /        \
   Manager A    Manager B
    /    \       /    \
  Wkr    Wkr   Wkr    Wkr
```
When to use: large, complex projects that naturally decompose into teams with different expertise. Build an entire app. Audit a complete codebase. Produce an end-to-end deliverable.
Implementation: a director agent plans the overall approach and creates manager-level tasks. Each manager decomposes its area and spawns workers. Workers execute and report up. Managers synthesize. Director receives the final rollup.
Real example: a full-stack app build delegated to auth, database, UI, and deployment managers. Each manager owns its sub-tree. The director only sees final rollups from each area.
Tradeoffs: scales to large projects with clean responsibility boundaries, but communication overhead between levels grows quickly. Debugging is painful when a bug hides under three layers of delegation. Expensive. Reserve for tasks that genuinely need it.
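Structurally, the pattern is recursion over a delegation tree: leaves execute, internal nodes synthesize rollups from their children. A sketch under the usual assumption that `run_agent` stands in for a real model call, with a toy two-manager tree:

```python
from dataclasses import dataclass, field

def run_agent(role: str, prompt: str) -> str:
    # Stand-in for a real model call.
    return f"[{role}] {prompt}"

@dataclass
class Node:
    """One level of the delegation tree: a role, its task, and its delegates."""
    role: str
    task: str
    children: list["Node"] = field(default_factory=list)

def execute(node: Node) -> str:
    if not node.children:
        # Leaf = worker: just do the task.
        return run_agent(node.role, node.task)
    # Internal node = manager or director: delegate down, synthesize the rollup.
    reports = [execute(child) for child in node.children]
    return run_agent(node.role, f"Synthesize reports for '{node.task}':\n" + "\n".join(reports))

tree = Node("director", "Build the app", [
    Node("manager-auth", "Auth area", [Node("worker", "Implement login")]),
    Node("manager-db", "Database area", [Node("worker", "Design schema")]),
])
rollup = execute(tree)
```

The debugging pain the tradeoffs mention is visible here too: a bad worker result is wrapped twice before the director ever sees it.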
## Pattern 7: Harness

Not a one-shot orchestration pattern but a persistent environment that agents live in. The harness compounds across sessions through memory, hooks, skills, and cron.
```
Inputs (email, git, web, meetings)
              │
       ┌──────▼──────┐
       │   HARNESS   │
       │  Memory     │ ← persists across sessions
       │  Hooks      │ ← event-driven automation
       │  Skills     │ ← packaged capabilities
       │  Subagents  │ ← parallel workers
       │  Cron       │ ← scheduled tasks
       └──────┬──────┘
              │
Outputs (code, emails, docs, deploys)
              │
       ┌──────▼──────┐
       │ Learn loop  │ ← harness improves itself
       └─────────────┘
```
The six harness primitives you actually compose:

- CLAUDE.md - identity and rules, loaded every session
- Hooks - event-driven automation
- Skills - packaged capabilities
- Subagents - parallel workers
- Cron - scheduled tasks
- claude -p - headless mode for cron, CI, and scripting

When to use: you want an agent system that gets better over time. You have recurring workflows (daily email triage, weekly reporting). You want to encode institutional knowledge that persists across sessions.
Tradeoffs: compounds over time (every session builds on the last), but requires setup investment and file organization discipline. The first week feels like overhead. The third month feels like a superpower.
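To make the cron primitive concrete, here is a sketch of crontab entries driving headless runs. The directory, prompts, and log paths are assumptions, not from any real setup:

```shell
# Hypothetical crontab entries driving headless Claude Code runs.
# Paths and prompts are illustrative.

# Daily email triage at 7am, output appended to a dated log.
0 7 * * * cd /home/me/harness && claude -p "Run the daily email triage workflow" >> logs/triage-$(date +\%F).log 2>&1

# Weekly report every Monday at 9am.
0 9 * * 1 cd /home/me/harness && claude -p "Draft the weekly status report from this week's notes" >> logs/weekly.log 2>&1
```

Note the escaped `\%` in the date format: cron treats a bare `%` as a newline, which is a classic silent failure in scheduled entries.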
The decision tree in one paragraph: is the task a single scoped question? Single Agent. Can it split into independent chunks? Swarm. Is it a sequence where each step feeds the next? Pipeline. Does it need multiple areas of expertise with synthesis? Supervisor. Is it a decision with real tradeoffs? Debate. Is it a large project with teams and sub-teams? Hierarchical. Do you want an always-on system that improves? Harness.
In practice, the interesting work happens at the seams.
Most production systems are actually a Harness with one or two of the other patterns layered inside. The Harness provides persistence and scheduling. The inner pattern handles the task shape.
| Tool | Best fit | Notes |
|---|---|---|
| Claude Code (Task tool) | Supervisor, Swarm, Hierarchical | Native subagent support |
| Claude Code (headless) | Pipeline, Harness, Cron | claude -p for scripting |
| Claude Managed Agents | Long-running supervised agents | Cloud-hosted |
| LangGraph | Complex state machines | Good for Debate, Hierarchical |
| CrewAI | Role-based teams | Opinionated Supervisor |
| AutoGen | Multi-agent conversations | Good for Debate |
Most developers I know start with Claude Code because it supports most patterns natively, then reach for LangGraph or AutoGen when they need a state machine complex enough that markdown files and shell scripts are no longer enough.
Pattern literacy is the shortcut that lets you build the right system the first time. If you know your problem is a Swarm, you will not accidentally build a Supervisor and wonder why it is slow. If you know you want a Harness, you will invest in the primitives early rather than rebuilding them for every new script.
The seven patterns are not academic. They are what show up after a year of shipping agents. The faster you learn to name the shape you are building, the faster you stop fighting the tools.