
## TL;DR
From single-agent baselines to multi-level hierarchies, these are the seven patterns for wiring AI agents together in production. Each with a decision rule, an implementation sketch, and the tradeoffs that actually matter.
One of the quieter frustrations of 2026 AI development is that everyone is building multi-agent systems and no two people are using the same words for the shapes. Someone says "I spawned a swarm." Another says "I ran a supervisor." A third says "I pipelined it." The shapes are real. The vocabulary is not yet shared.
This post catalogs the seven patterns that keep showing up when developers wire AI agents together in production. It is not an exhaustive taxonomy. It is the minimum shared vocabulary that makes architecture conversations productive. Pick the pattern that matches your problem, understand the tradeoffs, combine them when needed.
## Pattern 1: Single Agent

The baseline. One model, one system prompt, one task. No orchestration.
When it works: the task is self-contained, the context fits in a single window, no specialized tools or hand-offs are needed. Write a function, summarize a doc, answer a question.
When it breaks: the task requires multiple areas of expertise, the output needs verification from a different perspective, or the work exceeds a single context window.
Implementation:
```
claude -p "Refactor this function to use async/await"
```
That is the entire pattern. The moment you reach for a second agent, you have left this pattern.
The mistake here: jumping to orchestration too early. Most tasks do not need a supervisor. Most tasks do not need a swarm. Start with one agent, prove you need more, then reach for a heavier pattern.
## Pattern 2: Supervisor

One coordinator agent receives the task, decomposes it, routes subtasks to specialists, and synthesizes results.
```
User → Supervisor
            │
     ┌──────┼──────┐
  Agent A Agent B Agent C
(research) (code) (review)
            │
Supervisor → Response
```
When to use: two to five distinct subtasks requiring different expertise, you want a single point of control and quality gating, subtasks can run in parallel but results need synthesis.
Implementation in Claude Code: the supervisor uses the Task tool to spawn subagents. Each subagent gets a focused system prompt and a scoped set of tools. The supervisor collects results and decides what to do next.
Real example: the DevDigest /devdigest:research skill spawns parallel research agents, one per source, then a supervisor synthesizes findings into a structured brief.
Tradeoffs: clean separation of concerns and easy to debug, but the supervisor is a bottleneck. If it misroutes, everything fails. Quality of the final output is roughly proportional to the quality of the supervisor's task decomposition.
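Inside Claude Code the supervisor is the Task tool; outside it, the same shape can be sketched in plain Python. Here `run_agent` is a hypothetical stub standing in for whatever backend you use (for example `claude -p` via `subprocess`), and the fixed three-way decomposition is illustrative only — a real supervisor would produce the split itself:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(role: str, prompt: str) -> str:
    # Stand-in for a real call, e.g.:
    #   subprocess.run(["claude", "-p", prompt], capture_output=True, text=True)
    # Stubbed so the sketch runs without any CLI installed.
    return f"[{role}] {prompt}"

def supervise(task: str) -> str:
    # 1. Decompose: in a real system the supervisor model produces this split.
    subtasks = {
        "research": f"Find prior art and constraints for: {task}",
        "code": f"Draft an implementation for: {task}",
        "review": f"List risks and edge cases for: {task}",
    }
    # 2. Route: each specialist runs in parallel with a focused prompt.
    with ThreadPoolExecutor() as pool:
        futures = {role: pool.submit(run_agent, role, p) for role, p in subtasks.items()}
        results = {role: f.result() for role, f in futures.items()}
    # 3. Synthesize: the supervisor merges specialist output into one answer.
    combined = "\n".join(results[r] for r in ("research", "code", "review"))
    return run_agent("supervisor", f"Synthesize these results:\n{combined}")

answer = supervise("Add rate limiting to the API")
```

The decompose/route/synthesize split is the whole pattern; everything else is plumbing.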
## Pattern 3: Pipeline

Sequential processing where each stage's output becomes the next stage's input. Unix pipes for agents.
```
Input → Stage 1 → Stage 2 → Stage 3 → Output
        (scrape)  (analyze)  (format)
```
When to use: work that is naturally sequential (research, then write, then review), each stage needing different tools or models, a desire for checkpoints between stages.
Implementation: each stage is a separate claude -p call. Intermediate results written to files on disk. The next stage reads from disk, processes, writes its output. A shell script or cron orchestrates.
Real example: a video production pipeline. Firecrawl scrapes sources to /tmp/research/. A research agent reads those files, produces research/summary.md. A script agent reads the summary, produces script.md. A production agent reads the script, produces a YouTube description and thumbnail brief.
Tradeoffs: simple, debuggable, resumable from any stage, but no parallelism. If your stages are actually independent, pipeline is slower than it needs to be.
## Pattern 4: Swarm

Fan out N independent tasks to parallel workers. All run simultaneously. Results collected and merged.
```
       Coordinator
      /     │     \
  Worker  Worker  Worker
(task 1) (task 2) (task 3)
      \     │     /
        Collector
```
When to use: you have N independent, similar tasks. Audit N repos. Research N topics. Process N files. Each task is self-contained and does not depend on others. Speed matters.
Implementation in Claude Code: use the Task tool to spawn multiple agents in a single message. Each gets the same instructions with different input data. Results are collected when all complete.
Real example: an email triage skill that spawns ten parallel agents, one per Gmail label, to analyze the inbox concurrently. Ten minutes of serial work becomes one minute of parallel work.
Tradeoffs: linear speedup with worker count and trivially parallelizable, but no inter-worker communication. If tasks actually depend on each other, swarm produces inconsistent output.
The practical rule: if you cannot write down each worker's instructions before spawning any of them, swarm is not the pattern.
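The fan-out/collect shape is a thread pool over identical instructions. A minimal sketch, with `run_agent` again a hypothetical stub for the real backend call and the Gmail-label inputs taken from the triage example above:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(prompt: str) -> str:
    # Stand-in for: subprocess.run(["claude", "-p", prompt], ...)
    return f"triaged: {prompt.rsplit(':', 1)[-1].strip()}"

def swarm(instructions: str, inputs: list[str], max_workers: int = 10) -> dict[str, str]:
    """Fan out one worker per input; every worker gets the same instructions."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(run_agent, [f"{instructions} Label: {i}" for i in inputs])
    # Collect: merge results back, keyed by input, once all workers finish.
    return dict(zip(inputs, results))

labels = ["inbox", "newsletters", "receipts"]
report = swarm("Analyze this Gmail label and summarize what needs action.", labels)
```

Note that the workers never see each other's output, which is exactly why the pattern only fits independent tasks.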
## Pattern 5: Debate

Two or more agents take opposing positions on a question. A judge agent evaluates and picks a winner, or synthesizes a balanced answer.
```
           ┌→ Agent A (pro) ─┐
Question ──┤                 ├→ Judge → Answer
           └→ Agent B (con) ─┘
```
When to use: decision-making with real tradeoffs. Technology choice, architecture decision, scope cut. You want to surface arguments you might not have considered.
Implementation: spawn two agents with opposing system prompts ("argue for X", "argue against X"). Feed both responses to a judge agent. The judge synthesizes or picks a winner with reasoning.
Real example: evaluating "should we use Convex or Neon for this feature?" One agent argues real-time. The other argues relational. The judge decides based on the specific requirements the feature actually has.
Tradeoffs: surfaces blind spots and produces higher-quality decisions, but costs three times the tokens of a single-agent answer. Overkill for straightforward tasks. Worth the cost for decisions you will live with for a year.
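The two-advocates-plus-judge wiring fits in one function. `run_agent` is a hypothetical stub whose first argument plays the role of a system prompt; the Convex-vs-Neon question is the example from above:

```python
def run_agent(system: str, prompt: str) -> str:
    # Stand-in for a real model call with `system` as the system prompt.
    return f"({system}) {prompt}"

def debate(question: str) -> str:
    # Opposing system prompts force each agent into one side of the argument.
    pro = run_agent(f"Argue FOR: {question}", "Make the strongest case, with evidence.")
    con = run_agent(f"Argue AGAINST: {question}", "Make the strongest case, with evidence.")
    # The judge sees both positions and must pick or synthesize, with reasoning.
    return run_agent(
        "You are an impartial judge. Decide and explain your reasoning.",
        f"Question: {question}\n\nPro:\n{pro}\n\nCon:\n{con}",
    )

verdict = debate("Should we use Convex or Neon for this feature?")
```

The three-calls cost is visible right in the structure: two advocates plus one judge per decision.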
## Pattern 6: Hierarchical

Multi-level delegation tree. A director sets strategy, managers decompose into tasks, workers execute.
```
          Director
         /        \
   Manager A    Manager B
    /    \       /    \
  Wkr    Wkr   Wkr    Wkr
```
When to use: large, complex projects that naturally decompose into teams with different expertise. Build an entire app. Audit a complete codebase. Produce an end-to-end deliverable.
Implementation: a director agent plans the overall approach and creates manager-level tasks. Each manager decomposes its area and spawns workers. Workers execute and report up. Managers synthesize. Director receives the final rollup.
Real example: a full-stack app build delegated to auth, database, UI, and deployment managers. Each manager owns its sub-tree. The director only sees final rollups from each area.
Tradeoffs: scales to large projects with clean responsibility boundaries, but communication overhead between levels grows quickly. Debugging is painful when a bug hides under three layers of delegation. Expensive. Reserve for tasks that genuinely need it.
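Structurally, the pattern is recursion over a delegation tree: leaves execute, internal nodes synthesize rollups from their children. A sketch under the usual assumption that `run_agent` stands in for a real model call, with a toy two-manager tree:

```python
from dataclasses import dataclass, field

def run_agent(role: str, prompt: str) -> str:
    # Stand-in for a real model call.
    return f"[{role}] {prompt}"

@dataclass
class Node:
    """One level of the delegation tree: a role, its task, and its delegates."""
    role: str
    task: str
    children: list["Node"] = field(default_factory=list)

def execute(node: Node) -> str:
    if not node.children:
        # Leaf = worker: just do the task.
        return run_agent(node.role, node.task)
    # Internal node = manager or director: delegate down, synthesize the rollup.
    reports = [execute(child) for child in node.children]
    return run_agent(node.role, f"Synthesize reports for '{node.task}':\n" + "\n".join(reports))

tree = Node("director", "Build the app", [
    Node("manager-auth", "Auth area", [Node("worker", "Implement login")]),
    Node("manager-db", "Database area", [Node("worker", "Design schema")]),
])
rollup = execute(tree)
```

The debugging pain the tradeoffs mention is visible here too: a bad worker result is wrapped twice before the director ever sees it.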
## Pattern 7: Harness

Not a one-shot orchestration pattern but a persistent environment that agents live in. The harness compounds across sessions through memory, hooks, skills, and cron.
```
Inputs (email, git, web, meetings)
              │
       ┌──────▼──────┐
       │   HARNESS   │
       │  Memory     │ ← persists across sessions
       │  Hooks      │ ← event-driven automation
       │  Skills     │ ← packaged capabilities
       │  Subagents  │ ← parallel workers
       │  Cron       │ ← scheduled tasks
       └──────┬──────┘
              │
Outputs (code, emails, docs, deploys)
              │
       ┌──────▼──────┐
       │ Learn loop  │ ← harness improves itself
       └─────────────┘
```
The six harness primitives you actually compose:

- CLAUDE.md - identity and rules, loaded every session
- Hooks - event-driven automation
- Skills - packaged capabilities
- Subagents - parallel workers
- Cron - scheduled tasks
- claude -p - headless mode for cron, CI, and scripting

When to use: you want an agent system that gets better over time. You have recurring workflows (daily email triage, weekly reporting). You want to encode institutional knowledge that persists across sessions.
Tradeoffs: compounds over time (every session builds on the last), but requires setup investment and file organization discipline. The first week feels like overhead. The third month feels like a superpower.
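To make the cron primitive concrete, here is a sketch of crontab entries driving headless runs. The directory, prompts, and log paths are assumptions, not from any real setup:

```shell
# Hypothetical crontab entries driving headless Claude Code runs.
# Paths and prompts are illustrative.

# Daily email triage at 7am, output appended to a dated log.
0 7 * * * cd /home/me/harness && claude -p "Run the daily email triage workflow" >> logs/triage-$(date +\%F).log 2>&1

# Weekly report every Monday at 9am.
0 9 * * 1 cd /home/me/harness && claude -p "Draft the weekly status report from this week's notes" >> logs/weekly.log 2>&1
```

Note the escaped `\%` in the date format: cron treats a bare `%` as a newline, which is a classic silent failure in scheduled entries.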
The decision tree in one paragraph: is the task a single scoped question? Single Agent. Can it split into independent chunks? Swarm. Is it a sequence where each step feeds the next? Pipeline. Does it need multiple areas of expertise with synthesis? Supervisor. Is it a decision with real tradeoffs? Debate. Is it a large project with teams and sub-teams? Hierarchical. Do you want an always-on system that improves? Harness.
In practice, the interesting work happens at the seams.
Most production systems are actually a Harness with one or two of the other patterns layered inside. The Harness provides persistence and scheduling. The inner pattern handles the task shape.
| Tool | Best fit | Notes |
|---|---|---|
| Claude Code (Task tool) | Supervisor, Swarm, Hierarchical | Native subagent support |
| Claude Code (headless) | Pipeline, Harness, Cron | claude -p for scripting |
| Claude Managed Agents | Long-running supervised agents | Cloud-hosted |
| LangGraph | Complex state machines | Good for Debate, Hierarchical |
| CrewAI | Role-based teams | Opinionated Supervisor |
| AutoGen | Multi-agent conversations | Good for Debate |
Most developers I know start with Claude Code because it supports most patterns natively, then reach for LangGraph or AutoGen when they need a state machine complex enough that markdown files and shell scripts are no longer enough.
Pattern literacy is the shortcut that lets you build the right system the first time. If you know your problem is a Swarm, you will not accidentally build a Supervisor and wonder why it is slow. If you know you want a Harness, you will invest in the primitives early rather than rebuilding them for every new script.
The seven patterns are not academic. They are what show up after a year of shipping agents. The faster you learn to name the shape you are building, the faster you stop fighting the tools.