
TL;DR
I told an agent to improve the site every 10 minutes and went to sleep. Here is what 12 new repos, 60 PRs, and three goofs taught me about overnight orchestration.
Overnight orchestration relies on Claude Code's subagent, loop, and skill features. Verify current behavior against the official docs.
| Resource | What it covers |
|---|---|
| Claude Code Overview | Core agent architecture, parallel subagents, and execution model |
| Claude Code Sub-Agents | How subagents spawn parallel workers and control concurrency |
| Claude Code Skills | Skill definitions, SKILL.md format, and reusable command patterns |
| Claude Code Memory | CLAUDE.md, project rules, and cross-session context persistence |
| GitHub CLI Reference | The gh commands used for PR creation, auth, and billing checks |
For the broader agentic coding map, read Claude Code Agent Teams, Subagents, and MCP: The 2026 Playbook and Why Skills Beat Prompts for Coding Agents in 2026; they connect this article to the surrounding tool and workflow decisions.
At 11:47pm on April 28 I typed eight words into Claude Code:
/loop 10m improve the Developers Digest website overnight
Then I closed the laptop and went to bed.
By 1:53am the orchestrator session had spawned dozens of subagents, opened 59 pull requests across 21 repositories, scaffolded 12 new private repos, drafted 6 blog tutorials, written 2 video scripts, generated 8 distribution packages, and shipped a 5-PR backend migration end to end. By the time I woke up, the morning brief sitting in my repo was the longest tally I have ever seen from a single prompt.
This is not a Claude Code ad. The system did real work, but it also got things wrong, hit billing walls I had not budgeted for, and at one point fabricated a PR number that did not exist. The interesting story is the mix. So this is the candid version: what worked, what broke, and three lessons for anyone considering doing the same.
The shape of the run was simple enough to describe in a paragraph and complicated enough that I am still untangling it.
A parent orchestrator session held the loop. Every 10 minutes a cron-style tick fired off a planning step. The planner read the current state of the empire (24 apps under developersdigest, plus my standing rules), picked a batch of independent goals, and fanned out subagents in parallel to execute them. Some agents wrote code. Some scaffolded new repos. Some drafted blog posts. Some audited cross-repo consistency and filed reports. Each agent worked on its own branch in its own repo, opened a PR, tagged @devin-ai-integration for review, and exited.
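The tick described above can be sketched in a few lines. This is a minimal illustration, not how Claude Code implements /loop or subagents: `Goal`, `run_subagent`, and `run_tick` are hypothetical names, and the subagent body is a stub that only returns a self-report.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    repo: str    # one repo per goal keeps the blast radius tight
    branch: str  # each goal works on its own branch
    task: str    # plain-language description handed to the subagent


def run_subagent(goal: Goal) -> dict:
    """Hypothetical stand-in for one subagent run; returns its self-report."""
    return {"repo": goal.repo, "branch": goal.branch, "status": "pr-opened"}


def run_tick(goals: list[Goal], max_workers: int = 8) -> list[dict]:
    """Fan out one batch of independent goals and collect the reports."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so reports line up with goals.
        return list(pool.map(run_subagent, goals))
```

Note that the parent only ever sees what `run_subagent` returns. The report is a claim, not evidence, which matters later in this story.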
The parent never merged. That was deliberate. My standing rule is: branch, PR, tag Devin, never direct-push to main. The overnight session inherited that rule and held to it across 60 PRs without exception.
The other rule it inherited was equally non-negotiable: nothing public on GitHub without my explicit say-so. Every one of the 12 new repos was created as private (gh repo create --private). I checked all of them in the morning. None had slipped.
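The morning check can be automated rather than eyeballed. A minimal sketch, assuming you capture the output of `gh repo list <owner> --json name,visibility --limit 200` as a string; the function name and sample data are hypothetical.

```python
import json


def non_private_repos(gh_json: str) -> list[str]:
    """Return names of repos whose visibility is anything other than private.

    gh_json is the captured stdout of:
        gh repo list <owner> --json name,visibility --limit 200
    """
    repos = json.loads(gh_json)
    # Compare case-insensitively so the check does not depend on output casing.
    return sorted(r["name"] for r in repos if r["visibility"].upper() != "PRIVATE")


# Hypothetical sample, shaped like the gh JSON output.
sample = '[{"name": "mcp-lens", "visibility": "PRIVATE"}, {"name": "demo", "visibility": "PUBLIC"}]'
print(non_private_repos(sample))  # ['demo']
```

An empty list is the pass condition. Anything else should stop the loop before the next tick.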
Parallel fan-out scales further than I expected. A single tick would routinely have 5 to 8 subagents running concurrently. One cycle scaffolded mcp-lens, tracetrail, and cost-tape in parallel while a separate group of agents added Sentry observability to four production apps and a third group enriched 817 detail pages with generateMetadata and JSON-LD across four directory sites. The bottleneck was never compute. It was always coordination, and most coordination was avoided by keeping each agent's blast radius tight: one repo, one branch, one PR.
The dogfood loop closed itself. Three separate moments stood out. dd-content-engine PR #5 shipped a real Markdown to X / LinkedIn / newsletter fanout. By the next cycle, distribution agents were drafting their packages with the same fanout. tracetrail, scaffolded around 02:30 UTC, was wired into overnight-agents PR #4 within the hour as an "Open in TraceTrail" button on the runs page. repo-postcard, also new tonight, generated the 12 card PNGs that landed in developers-digest-site PR #47 for the new /apps entries. The system was building tools and using them in the same session.
Voice rules held under load. My DevDigest voice rules are explicit: no em dashes, no emojis, no superlatives, no gradients, no "blazing fast." Across 6 blog drafts, 8 distribution packages, and 2 video scripts, the consistency was genuinely strong. I spot-checked 14 markdown files this morning and found zero em dashes and zero emojis. Whatever is in the system prompt for tone is sticking.
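The spot check is easy to mechanize for the rules that are purely textual. A rough sketch, assuming a simple substring and regex scan is good enough; it covers em dashes, emojis, and the banned phrase only, and the emoji ranges are an approximation that will also flag some plain symbols.

```python
import re

# Rules taken from the text: no em dashes, no emojis, no "blazing fast".
EM_DASH = "\u2014"
BANNED_PHRASES = ("blazing fast",)
# Approximate emoji coverage (an assumption): pictograph and symbol blocks.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def voice_violations(text: str) -> list[str]:
    """Return a list of human-readable rule violations found in text."""
    problems = []
    if EM_DASH in text:
        problems.append("em dash")
    if EMOJI_RE.search(text):
        problems.append("emoji")
    lowered = text.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            problems.append(f"banned phrase: {phrase}")
    return problems
```

Superlatives and gradients need more than string matching, so they are left out here on purpose.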
Reports were honest. I asked for cross-repo audits across the empire and got four written deliverables, not code: PRODUCT-IDEAS-2026-04-28.md, agent-ecosystem-2026-04-28.md, APPS-TIGHTEN-STATUS-clerk-neon-2026-04-28-v2.md, GA-IDEAS-2026-04-28.md. The GA audit caught 18 apps hardcoding the same Google Analytics ID, which scaffolded the dd-ga repo to fix it. That is the loop I want from this kind of session: audit produces report, report seeds product, product fixes audit.
The Convex to Neon migration shipped end to end. This was the most ambitious unit of work. Five sequential PRs in dd-clipper (#4 jobs storage, #6 apiKeys, #7 apiCredits, #8 apiUsageLog, #9 clips) walking the schema across one table at a time. The agent that owned this thread held the dependency order, rebased when it hit conflicts on #4, and produced a docs: convex surface + neon migration plan companion PR (#5) so the next person could audit the cutover. That sequence is documented separately in PR #49.
The GitHub Actions billing wall. Sometime around 03:15 UTC, every CI run on every open PR started failing with "The job was not started because recent account payments have failed or your spending limit needs to be increased." The org card had a billing failure. I did not know about it until I woke up. The agent kept opening PRs anyway because that was the right move, but it meant Devin had no CI signal to review against. Every one of the 60 PRs is currently red for a reason that has nothing to do with the code. Lesson: the orchestrator needs a billing check at the top of every loop, the same way it checks for gh auth status.
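Short of a real billing check, the orchestrator can at least recognize this failure when it sees it. A minimal sketch, assuming you already have the failure message for each red PR from wherever you collect it; the function names and the dict shape are hypothetical, and the marker strings come from the error quoted above.

```python
from __future__ import annotations

# Fragments of the error message quoted in the text.
BILLING_MARKERS = (
    "recent account payments have failed",
    "spending limit needs to be increased",
)


def is_billing_block(message: str) -> bool:
    """True if a CI failure message looks like the account billing error."""
    lowered = message.lower()
    return any(marker in lowered for marker in BILLING_MARKERS)


def triage(failures: dict[str, str]) -> dict[str, list[str]]:
    """Split failed checks (PR label -> message) into billing blocks and real failures."""
    result: dict[str, list[str]] = {"billing": [], "code": []}
    for pr, message in failures.items():
        result["billing" if is_billing_block(message) else "code"].append(pr)
    return result
```

If every failure in a cycle lands in the billing bucket, the loop has an account problem, not a code problem, and should say so.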
The gradient violation. One agent, drafting a redesigned /pro waitlist landing in PR #37, introduced a hero with a gradient background. My rules say no gradients, full stop. A subsequent QA agent on a later cycle caught it, opened a follow-up commit on the same branch, and replaced the gradient with the solid bg-cream and a pink offset card. The system self-corrected, but only because I happen to run a recurring QA agent. Without that, the violation would have shipped to review and waited on Devin to flag.
The fake PR number. This one is the most uncomfortable. Mid-run, an agent reported that it had opened "PR #51" against developers-digest-site for a sitemap improvement. The morning brief picked up the report. When I went to look, there was no PR #51. There was a branch with the work on it, but no PR had been opened from it. The agent had described an outcome that had not happened yet, the parent had taken the report at face value, and the brief had repeated it. I caught it because the PR table in the brief sorted by number and #51 was missing between #50 and #52. The actual PR was opened by hand once I confirmed the branch existed. I do not know yet whether the agent hallucinated the action or whether the gh pr create call failed silently and was misreported. Either way: trust nothing the orchestrator says about a PR number until you have seen it in gh pr list.
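The check I did by eye can be a set difference. A minimal sketch, assuming the agents' claimed PR numbers are collected per repo and compared with the captured output of `gh pr list --json number,title --limit 100`; the `reconcile` name and return shape are hypothetical.

```python
from __future__ import annotations

import json


def reconcile(claimed: set[int], gh_json: str) -> dict[str, list[int]]:
    """Compare agent-claimed PR numbers with the PRs GitHub actually lists."""
    actual = {pr["number"] for pr in json.loads(gh_json)}
    return {
        "missing": sorted(claimed - actual),     # claimed, but not on GitHub
        "unreported": sorted(actual - claimed),  # on GitHub, but never claimed
    }


# Hypothetical sample shaped like the gh JSON output.
gh_out = '[{"number": 50, "title": "a"}, {"number": 52, "title": "b"}]'
print(reconcile({50, 51, 52}, gh_out))  # {'missing': [51], 'unreported': []}
```

Diffing against claims is more reliable than looking for gaps in the numbering, because issues and pull requests share one number sequence per repository and a gap can simply be an issue.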
The rebase cascade. dd-clipper PR #4 hit conflicts because two earlier cycles had touched the same convex/schema.ts region. The owning agent flagged it, but the rebase took a separate agent and a full cycle to resolve. During that window the four downstream PRs (#6, #7, #8, #9) were blocked. Sequential migrations and fan-out parallelism do not mix as cleanly as I thought.
1. Decompose for independence, not for parallelism. The work that paralleled cleanly was work that touched separate repos or separate files. The work that did not (sequential migrations, schema changes, anything with implicit ordering) created queues and rebases. Before a loop starts, ask the planner to draw the dependency graph, then only fan out the leaves.
2. Verify every claim against the system of record. Agents will report what they meant to do, what they think they did, and what they actually did, and these three are not always the same. Run a reconciliation pass at the end of every cycle: gh pr list --json number,title --limit 100 and diff against the agent's claims. The fake PR #51 would have been caught instantly by this.
3. Pre-flight your invariants. Billing was the one I missed. Other ones to check before starting an overnight: disk space on the host, gh rate limit budget, model context budget, any required secrets for the tasks the planner might pick, and whether main is already broken on any repo (one of mine, dd-cron, had a pre-existing /api/health build failure that masked a perfectly good favicon PR). If any invariant is red, the loop should pause and tell me, not push through.
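The pre-flight in lesson 3 can be sketched as a list of named checks that must all be green before a tick runs. This is an illustration under assumptions: the thresholds, check names, and helper functions are hypothetical, and billing is deliberately left out because I do not have a check for it yet.

```python
from __future__ import annotations

import shutil
import subprocess
from typing import Callable, Mapping


def disk_ok(path: str = ".", min_free_gb: float = 5.0) -> bool:
    """True if the filesystem holding path has at least min_free_gb free."""
    return shutil.disk_usage(path).free / 1e9 >= min_free_gb


def secrets_ok(required: list[str], env: Mapping[str, str]) -> bool:
    """True if every required variable is present and non-empty."""
    return all(env.get(name) for name in required)


def gh_auth_ok() -> bool:
    """True if `gh auth status` exits with code 0."""
    return subprocess.run(["gh", "auth", "status"], capture_output=True).returncode == 0


def run_preflight(checks: dict[str, Callable[[], bool]]) -> list[str]:
    """Run each named check and return the names that came back red."""
    red = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that crashes counts as red
        if not ok:
            red.append(name)
    return red
```

A caller would build something like `{"disk": disk_ok, "gh-auth": gh_auth_ok}` and refuse to start the tick if `run_preflight` returns anything. Treating a crashed check as red is the conservative choice: an unknown state should pause the loop, not let it push through.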
The output is real. 12 private repos, each with a working scaffold and a README. 60 open PRs, each branched, tagged, and reviewable. 6 blog drafts at draft-true so I can edit before publishing. 817 newly-enriched SEO pages across the directory sites. A backend migration shipped in a single night that I had been dragging my feet on for two weeks. If I had to do this with my hands it would have taken a working week.
The cost is not just dollars (the dollars I will know when the bill lands). The cost is the morning I am spending right now reconciling what was claimed against what is real, fixing the billing block, merging the boring PRs first, deciding which of the 12 new repos are worth keeping versus archiving. The agent did the producing. I have to do the curating, and curating 60 PRs is its own non-trivial day.
The cost is also trust calibration. After tonight I trust the system more on bounded tasks (one repo, one PR, clearly scoped) and less on multi-step claims about its own outputs. I will run another loop next week, but with a reconciliation step inside the loop and a billing pre-flight at the top.
If you want to see what came out of it, the /apps page lists the 12 new tools as coming-soon entries, the comparison hub was reorganized in PR #36, and the 10 tools announcement draft sits behind PR #42.
For anyone trying this themselves: the loop works. It works better when you treat the agent like a junior engineer who is genuinely fast, occasionally wrong, and structurally incapable of admitting which is which without help. Build the help in. Then go to bed.
What is the /loop command in Claude Code?
The /loop command tells Claude Code to run a task repeatedly at a specified interval. The syntax is /loop <interval> <prompt>, where interval can be in minutes (e.g., 10m), hours (2h), or other time formats. The orchestrator session stays open and re-evaluates the prompt on each tick, spawning subagents in parallel to execute independent work. This is different from a one-shot prompt because the agent maintains context across cycles and can build on previous work.
In this overnight session, 5 to 8 subagents ran concurrently per tick without issues. The theoretical limit depends on your machine's resources and the model's context budget, but the practical bottleneck is coordination, not compute. Each subagent should work on a separate repo or file set to avoid merge conflicts. If agents need to touch the same files, run them sequentially or use a dependency graph to order execution.
Before starting a long-running agent loop, verify: (1) gh auth status passes and your GitHub token has not expired, (2) your GitHub Actions billing is current with no payment failures, (3) disk space on the host is sufficient, (4) rate limit budgets for GitHub API and model providers are adequate, (5) required secrets and environment variables are set for all repos the agent might touch, and (6) main is passing CI on all target repos so you can distinguish agent failures from pre-existing breaks.
Always run a reconciliation pass after each cycle. Compare the agent's claims against the system of record: gh pr list --json number,title --limit 100 diffed against what the orchestrator reported. In this session, an agent claimed to open PR #51, but only the branch existed; no PR had been opened. The branch was valid, and it is still unclear whether the agent hallucinated the action or the gh pr create call failed silently. Trust outputs you can verify independently, not self-reported success messages.
Decompose tasks so that each subagent can complete its work without waiting on or conflicting with others. Work that touches separate repos or separate files parallelizes cleanly. Work with implicit ordering (sequential migrations, schema changes, files shared across agents) creates rebases and blocking queues. Before the loop starts, ask the planner to draw the dependency graph and only fan out the leaves that have no upstream blockers.
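"Only fan out the leaves" can be made concrete with a small dependency map. A minimal sketch, assuming the planner can express its plan as task name to upstream task names; the `ready_tasks` name and the example graph are hypothetical, loosely modeled on the table-by-table migration chain.

```python
from __future__ import annotations


def ready_tasks(deps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Tasks not yet done whose upstream dependencies are all done."""
    return sorted(t for t, ups in deps.items() if t not in done and ups <= done)


# Hypothetical plan: a sequential migration chain plus one independent scaffold.
plan = {
    "jobs": set(),
    "apiKeys": {"jobs"},
    "apiCredits": {"apiKeys"},
    "mcp-lens": set(),
}
print(ready_tasks(plan, done=set()))     # ['jobs', 'mcp-lens']
print(ready_tasks(plan, done={"jobs"}))  # ['apiKeys', 'mcp-lens']
```

Each tick fans out only what `ready_tasks` returns, so a sequential chain advances one step per cycle while independent work runs beside it.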
Build self-correction into the loop. In this session, a QA agent caught a gradient violation that an earlier agent had introduced, then opened a follow-up commit on the same branch to fix it. Without the recurring QA pass, the violation would have reached review. Consider running a validation agent every N cycles that checks code against your style rules, lints for common errors, and flags PRs that need human attention before merge.
Dollar cost depends on the model, tokens processed, and run duration. The hidden cost is curation time: reviewing 60 PRs, reconciling claims against reality, fixing billing blocks, and deciding which scaffolded repos to keep versus archive. The agent does the producing; you do the curating. Budget time the next morning for triage. A 6-hour overnight session can easily generate a full day of review work.
Use overnight orchestration for parallelizable, low-stakes work where you trust the agent to branch and PR without your live supervision: scaffolding repos, enriching metadata, drafting content, running audits. Use interactive sessions for high-stakes decisions, work that requires human judgment mid-flight, or anything with tight sequential dependencies. The loop is a productivity multiplier, not a replacement for judgment-intensive work.