
Building in Public
TL;DR
We rebuilt and replatformed this site in a day by running a fleet of AI agents in parallel. Here is the honest operating model - the ownership rules, the verification gate on every handoff, and the failure modes we hit, with the guardrail each one produced.
We rebuilt and replatformed this site in a single day by running a fleet of AI agents in parallel. The design story lives in its own post: why we retired the old cream-and-pink system for a hard-edged neutral contract, and what we chose. This post is about the other half, the part that is harder to see and much easier to get wrong: the orchestration. How do you point dozens of agent runs at one codebase in one day and end up with a coherent site instead of a pile of conflicting commits?
The short version is that the model is not clever. It is boring, and boring is the point. Coordination fails in exciting ways and succeeds in dull ones. What follows is the operating model that held, the guardrails that made it safe, and the specific failure modes we hit that day with the fix each one produced. None of it is theoretical. All of it cost us something to learn.
If you want the framework-level vocabulary first - fan-out, pipeline, hierarchical delegation, blackboard - read the definitive guide to coordinating multiple AI agents. This post assumes you already know the patterns and want the field notes.
Seven rules did most of the work. Each one exists because the alternative bit us at some point, either that day or before it.
The first rule is the one everything else rests on: never two writers per file. Every agent gets a scope, and scopes do not overlap at the file level. One agent owns the homepage. Another owns the blog templates. A third owns the global tokens. When two agents both need to touch a shared file, that is a signal to serialize them, not to let them both edit and merge later.
This sounds obvious and is constantly violated in practice, because the natural decomposition of a task ("redesign the site") does not respect file boundaries ("both the nav and the footer import the same tokens file"). The discipline is to decompose along ownership lines, not feature lines. If a change spans a shared file, one agent lands the shared change first, and the others build on top of it.
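One way to enforce the no-overlap rule before any agent starts is to check the planned scopes mechanically. The sketch below is illustrative, not the tooling we ran: it assumes each agent's scope is written down as a set of repo-relative file paths, and the agent names and paths are hypothetical.

```python
from itertools import combinations


def find_overlaps(scopes: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return every pair of agents whose write scopes share at least one file."""
    overlaps: dict[tuple[str, str], set[str]] = {}
    # Compare each pair of agents once, in a stable (alphabetical) order.
    for (a, files_a), (b, files_b) in combinations(sorted(scopes.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


# Hypothetical scope map: agent name -> files that agent is allowed to write.
scopes = {
    "homepage": {"app/page.tsx", "styles/tokens.css"},
    "blog": {"app/blog/layout.tsx"},
    "tokens": {"styles/tokens.css"},
}

# A non-empty result means two agents would write the same file,
# so those two runs should be serialized instead of run in parallel.
conflicts = find_overlaps(scopes)
```

Here `conflicts` flags the homepage and tokens agents for sharing the tokens file, which is exactly the case where one agent should land the shared change first.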
Package management is a shared-file problem with extra teeth. Two agents running pnpm add at the same time race on the lockfile and package.json, and the loser's install silently vanishes or corrupts the tree. So one agent owns package.json at a time. Dependency installs are serialized through a single owner, full stop. Component-library installs are the same: one agent runs the install, adapts the component to the design contract, commits, and only then does downstream work start.
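Serializing installs through a single owner can be as simple as an exclusive lock file. This is a minimal sketch under assumptions: the lock path and owner names are hypothetical, and it relies on exclusive file creation rather than anything the package manager provides.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_owner(lock_path: str, owner: str):
    """Hold an exclusive lock file for the duration of a dependency change."""
    try:
        # O_EXCL makes creation fail if the lock already exists,
        # so only one agent can hold it at a time.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"{lock_path} is held; {owner} must wait its turn") from None
    try:
        # Record who holds the lock so a stuck owner is easy to identify.
        with os.fdopen(fd, "w") as handle:
            handle.write(owner)
        yield
    finally:
        # Release the lock whether the install succeeded or failed.
        os.remove(lock_path)
```

An agent would wrap its install, verification, and commit in `with single_owner(".deps.lock", "agent-tokens"):` and treat a `RuntimeError` as the signal to queue behind the current owner, not to retry in a loop.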
The verification gate is the load-bearing rule. The orchestrator runs the same gate on every single handoff, not at the end of the day. The gate is: typecheck, style check, build, and one more step that catches the failure the other three miss.
That extra step is the committed-tree-in-isolated-worktree trick. Agents leave in-progress files in the working tree. A commit can pass every local check while importing a file that was never staged, because the file exists on disk but not in the commit. Local tooling sees the file; the CI runner, which only has the commit, does not. So the gate checks out the actual commit into a throwaway worktree and typechecks that, in isolation from the working tree. If the commit imports something it did not include, this catches it before it reaches the deploy pipeline. Nothing else does.
The principle generalizes past our specific stack: verify the artifact you are about to ship, not the environment you built it in. The working directory lies. The commit does not.
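The isolated-worktree step can be sketched with `git worktree`. This is a hedged illustration, not our exact gate: the function name and the `check_cmd` parameter are assumptions, and whatever command you pass must be able to run in a fresh checkout (for a typecheck, that usually means installing dependencies there first).

```python
import subprocess
import tempfile
from pathlib import Path


def check_commit_in_isolation(repo: str, commit: str, check_cmd: list[str]) -> bool:
    """Run check_cmd against a clean checkout of commit, not the working tree."""
    with tempfile.TemporaryDirectory() as tmp:
        worktree = Path(tmp) / "gate"
        # Detached checkout of the exact commit. Files that exist on disk
        # in the main working tree but were never committed do not appear here.
        subprocess.run(
            ["git", "-C", repo, "worktree", "add", "--detach", str(worktree), commit],
            check=True,
            capture_output=True,
        )
        try:
            result = subprocess.run(check_cmd, cwd=worktree, capture_output=True)
            return result.returncode == 0
        finally:
            # Always unregister the throwaway worktree, pass or fail.
            subprocess.run(
                ["git", "-C", repo, "worktree", "remove", "--force", str(worktree)],
                check=True,
                capture_output=True,
            )
```

The design choice that matters is that the check runs with the worktree as its working directory, so an import of an unstaged file fails here the same way it would fail on a CI runner.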
Anything that leaves the building starts as a draft for review. Content, public copy, anything a reader or a customer would see. The agent produces it, a human or a review pass approves it, and only then does it ship. This is not about distrust of the model. It is that the cost of a bad externally-visible change is asymmetric, and the cost of a review pass is small. When the downside is public and hard to reverse, you pay the small tax every time.
Some rules are not task-specific; they apply to every agent regardless of scope. Banned topics. The design contract - square corners, hairline borders, no gradients, no em dashes. These are broadcast to every agent as standing constraints, and they are codified in the project instructions so they are inherited, not remembered. A constraint you have to remember is a constraint you will eventually break. A constraint the codebase and the brief both enforce stays enforced. This is what let agents working on different pages produce work that looked like it came from one hand.
Any action that spends money or touches a live external system defaults to off. If an agent is unsure whether it is authorized to make a paid call, provision infrastructure, or hit a production endpoint, the default is to stop and ask, not to proceed and apologize. Fail-closed is the only safe default for irreversible or costly actions, because the failure mode of asking is a few seconds of latency and the failure mode of proceeding is a bill or an outage.
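Fail-closed is easy to state and easy to get backwards in code, so here is a minimal sketch. The class, function, and action names are hypothetical; the only point is that an action nobody has approved is denied by default.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ActionPolicy:
    # Hypothetical action names, e.g. "paid_api_call" or "deploy_production".
    approved: set[str] = field(default_factory=set)

    def is_allowed(self, action: str) -> bool:
        """Only explicitly approved actions pass; anything unknown is denied."""
        return action in self.approved


def run_action(policy: ActionPolicy, action: str, effect: Callable[[], str]) -> str:
    """Run effect only if the action was approved; otherwise stop and report."""
    if not policy.is_allowed(action):
        # Stop and ask: the side effect is never executed.
        return f"blocked: {action} needs explicit approval"
    return effect()
```

Because the policy starts with an empty approved set, a new kind of costly action is blocked until a human adds it, which is the "stop and ask" behavior described above.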
The last rule is a rhythm: verify, commit, push, per increment. Never let a day of parallel work pile up into one giant unreviewed merge. Each increment goes through the gate and ships on its own. Batching feels efficient and is a trap: it hides which change broke what, it makes the verification gate slower and scarier, and it turns a small revert into a large one. Small, continuous, verified increments keep the blast radius of any single mistake tiny.
Rules read as clean in a list. They were not clean to learn. Here are the actual failures from the day and the guardrail each one produced. This is the part worth reading twice, because the failures are more transferable than the successes.
Agents assuming unshipped sibling exports. An agent imported a function it expected a sibling agent to have exported, because the plan said that function would exist. But the sibling had not shipped it yet, or had named it differently. The code looked correct in isolation and broke at integration. Guardrail: agents build against what is committed, not against what is promised. If an export does not exist in the tree yet, you do not import it; you serialize behind the agent that owns it.
Mid-write files breaking global CSS. An agent was partway through editing the global stylesheet when a downstream build picked up the half-written file, and the broken CSS cascaded across every page at once. A shared global file in a mid-write state is a site-wide outage waiting to happen. Guardrail: shared global files get a single owner who lands complete, verified changes, and downstream work does not build against a global file that is mid-edit.
Silent idles with no reports. An agent went quiet. Not failed, not finished, just idle, and it produced no report, so the orchestrator did not know whether it was working, stuck, or done. Silence is ambiguous and ambiguity stalls the whole fleet. Guardrail: every agent reports on handoff. A run that goes silent without a report is treated as stalled and gets checked, not assumed to be making progress.
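Treating silence as a stall needs only a record of when each agent last reported. A small sketch, assuming the orchestrator keeps a timestamp per agent; the function name and the threshold are illustrative, not values we used.

```python
from datetime import datetime, timedelta


def stalled_agents(
    last_report: dict[str, datetime],
    now: datetime,
    max_silence: timedelta,
) -> list[str]:
    """Return agents whose last report is older than max_silence, sorted by name."""
    return sorted(
        name for name, seen in last_report.items() if now - seen > max_silence
    )
```

Anything this returns gets checked by the orchestrator. It is a prompt to look, not proof of failure, since a long-running step can be legitimately quiet.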
Env files clobbered by a tool. A tool overwrote an environment file, wiping configuration that other work depended on. Guardrail: treat env and other shared config files as owned, single-writer surfaces exactly like source files, and never let a tool rewrite them as a side effect without that write going through the same ownership and verification path as any other change.
Deploy pipeline broken by a package-manager default. The deploy failed on a package-manager default we did not know had changed: the newer major version hard-blocks dependency build scripts unless they are explicitly approved, in a config key that moved between versions. Installs that worked locally failed in CI. Guardrail: test dependency changes with a clean, frozen-lockfile install that mirrors CI, not just the warm local install, because the warm local environment hides exactly the failures the cold CI environment will hit.
The through-line across every one of these: the failure was never the model being dumb. It was two pieces of work making incompatible assumptions about shared state - a file, an export, an env var, a lockfile, a build default - and the fix was always the same shape. Make the shared thing owned, verify the real artifact, and never assume a sibling's promise is a sibling's commit.
If you are about to point a fleet of agents at your own codebase, start here. This is the shortest version of what took us a day of mistakes to internalize.
None of these are exotic. That is the lesson. Coordinating a fleet of agents is mostly the same discipline as coordinating a team of people: clear ownership, honest verification, small reversible increments, and safe defaults when the stakes are high. The agents move faster, so the cost of skipping the discipline arrives faster too. Get the operating model right and the speed is a gift. Skip it and the speed just multiplies your surface area for silent breakage.
If you want the next layer down, the orchestration patterns guide covers the mechanics of each coordination shape, and managing a fleet of Claude agents goes deeper on the day-to-day of running one. To go from patterns to a durable skill set, follow a learning path or browse the agents library.
Agents stay out of each other's way through single-owner file scopes. Every agent gets a scope, and no two scopes touch the same file. Decompose the work along ownership lines rather than feature lines, because features naturally span shared files while ownership does not. When a change genuinely needs a shared file, one agent lands that change first and the others build on top of it. Dependency installs and env files are the highest-risk shared surfaces, so they always go through a single owner.
The verification gate runs four checks: a typecheck, a style check for banned patterns, a full build, and an isolated-worktree check. That last one checks out the actual commit into a throwaway worktree and typechecks it in isolation from the working directory. It catches the case where a commit imports a file that exists on disk but was never staged, which passes every local check and then fails in CI. The rule is to verify the artifact you are shipping, not the environment you built it in.
None of this requires a multi-agent framework. The operating model here is about discipline, not tooling: ownership boundaries, a verification gate, draft-first review, fail-closed defaults, and continuous shipping. Those apply regardless of whether you use a framework or raw orchestration. Frameworks help once you need explicit loops or shared state, which the coordination guide covers, but the guardrails that keep a fleet safe are process, not library choice.