
TL;DR
DeepSeek, Kimi, and GLM are cheap enough to run as sidecar subagents for drafts and exploration. The catch is that cheap work you cannot inspect is just expensive noise. A shared canvas makes the output reviewable.
| Resource | Description |
|---|---|
| DeepSeek pricing | Current DeepSeek V4 Flash and V4 Pro per-token rates |
| Claude Code subagents | Subagents run in their own context window with restricted tools |
| Codex CLI subagents | Codex subagent workflows for parallelizing larger tasks |
| AgentCanvas | The board cheap subagents write to |
The economics of subagents flipped in 2026. DeepSeek V4 Flash is $0.14 per million input tokens and $0.28 per million output tokens. GLM-5.2 is open-weights, so if you host it yourself there is no per-token fee, only the cost of the hardware. Kimi is in the same band. At those prices you can afford to spin up a dozen sidecar subagents to draft, explore, and sketch - work you would never pay frontier-model prices for.
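To make the arithmetic concrete, here is a minimal sketch that prices a batch of sidecar runs at the V4 Flash rates quoted above. The token counts per run (20,000 in, 4,000 out) are illustrative assumptions rather than measurements, and `run_cost` is a hypothetical helper, not part of any SDK.

```python
# Rates quoted in this article for DeepSeek V4 Flash, in USD per million tokens.
INPUT_RATE_PER_M = 0.14
OUTPUT_RATE_PER_M = 0.28


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one subagent run at the rates above."""
    total = input_tokens * INPUT_RATE_PER_M + output_tokens * OUTPUT_RATE_PER_M
    return total / 1_000_000


# Assumed workload: each exploratory run reads 20k tokens and writes 4k.
per_run = run_cost(20_000, 4_000)
batch = 12 * per_run  # a dozen sidecar subagents

print(f"one run: ${per_run:.4f}, a dozen runs: ${batch:.4f}")
```

Under these assumptions a single run costs about $0.004 and a dozen runs stay under five cents. Swap in your own token counts to see where the total stops being negligible.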
The reason most people do not do this is not cost. It is that the output of a cheap subagent is usually invisible. It lives in a transcript nobody opens, in a context window that closes when the subagent returns, and the only thing that survives is a one-line summary. Cheap work you cannot inspect is just expensive noise.
Subagents are designed to isolate context. Each one runs in its own fresh conversation, does its work, and returns a single text result to the parent. The intermediate tool calls and outputs stay inside the subagent. That is the feature: the parent's context stays clean.
It is also the trap. When the subagent is cheap and exploratory, the interesting part is the exploration - the drafts it tried, the options it sketched, the dead ends it hit. All of that gets thrown away by design. You paid $0.004 for a subagent to explore five approaches and you get back "approach 3 looks best" with no evidence.
This is the same dynamic covered in the agent teams playbook: specialization is good, but specialization without a shared surface means every handoff is lossy. The fix for cheap subagents is the same as the fix for expensive ones: give them a place to put the work where a human or another agent can look at it.
The move is to point every cheap subagent at the same AgentCanvas board. Instead of returning a summary, the subagent calls create_html_asset to pin its drafts, create_image_asset to attach sketches, and append_html to stream its reasoning as it goes.
Now the economics work the way they are supposed to.
The subagent still runs in its own context window, so your main agent's context stays clean. The difference is that the output is on a board instead of trapped in a transcript. When the work is visible, cheap subagents stop being a gamble and start being a pipeline.
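The difference between a returned summary and pinned work can be sketched with a toy model. `StandInBoard` below is an in-memory stand-in, not the real AgentCanvas client: the method names `create_html_asset` and `append_html` come from this article, but their parameters and return values are assumptions made for illustration. `create_image_asset` is left out to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class StandInBoard:
    """In-memory stand-in for a shared board; NOT the real AgentCanvas API."""

    assets: list = field(default_factory=list)

    def create_html_asset(self, lane: str, title: str, html: str) -> int:
        # Hypothetical signature: pins a draft and returns its index on the board.
        self.assets.append({"lane": lane, "title": title, "html": html})
        return len(self.assets) - 1

    def append_html(self, asset_id: int, html: str) -> None:
        # Hypothetical signature: appends more content to an existing asset.
        self.assets[asset_id]["html"] += html


def explore(board: StandInBoard, lane: str, approaches: list) -> str:
    """Toy subagent: pins every draft, then returns only a one-line summary."""
    notes_id = board.create_html_asset(lane, "reasoning", "")
    for name in approaches:
        board.create_html_asset(lane, f"draft: {name}", f"<p>Sketch of {name}</p>")
        board.append_html(notes_id, f"<li>tried {name}</li>")
    # Placeholder choice: a real subagent would evaluate the drafts first.
    return f"{approaches[-1]} looks best"


board = StandInBoard()
summary = explore(board, "deepseek-1", ["approach 1", "approach 2", "approach 3"])
print(summary)            # the only thing the parent's context receives
print(len(board.assets))  # the work a reviewer can still open afterwards
```

The parent still gets a single line back, so its context stays as clean as before. The drafts and the running notes outlive the call because they were written to the board rather than held in the subagent's local state.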
Not every task belongs on a cheap model.
The decision is not really about which model is best. It is about which model is cheap enough that you can run it speculatively without flinching. For the budget end, the DeepSeek V4 budget coding agents guide and the GLM-5.2 cost math walk through the numbers.
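One way to encode that split is a small routing function. This is a minimal sketch: the task kinds are taken from this article, while the tier labels, the `pick_tier` name, and the fallback rule for unknown kinds are assumptions.

```python
# Task kinds named in this article; the tier labels are hypothetical.
SPECULATIVE = {"draft", "exploration", "sketch"}
MUST_BE_RIGHT = {"final implementation", "security review", "ships directly"}


def pick_tier(task_kind: str) -> str:
    """Route speculative work to a cheap model and critical work to a frontier one."""
    if task_kind in MUST_BE_RIGHT:
        return "frontier"
    if task_kind in SPECULATIVE:
        return "cheap"
    # Unknown kinds fall back to the safer tier (an assumption, not a rule from the article).
    return "frontier"


for kind in ["draft", "security review", "migration plan"]:
    print(f"{kind!r} -> {pick_tier(kind)}")
```

Defaulting unknown work to the frontier tier trades some cost for safety. Flip the fallback if most of your unlabeled tasks are throwaway exploration.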
You could argue the same thing is achievable by having subagents write files to a directory. You can. The difference is that a directory is a flat list and a canvas is a layout. When three subagents each produce two drafts, a directory gives you six files with no relationship. A canvas gives you three columns, each with its drafts stacked, and you can see at a glance which lane is winning.
That spatial structure is the whole point of AgentCanvas. It is what turns cheap speculative subagents from a pile of files into a reviewable workspace.
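The directory-versus-layout contrast can be shown in a few lines. The sketch below takes six flat file names and regroups them into one column per subagent. The `<lane>__<draft>.html` naming scheme and the `to_lanes` helper are assumptions for this example and say nothing about how AgentCanvas stores or lays out assets.

```python
from collections import defaultdict

# Six drafts as a directory would show them: a flat list with no relationships.
# The "<lane>__<draft>.html" naming scheme is an assumption for this example.
files = [
    "glm__draft-a.html", "kimi__draft-a.html", "deepseek__draft-b.html",
    "kimi__draft-b.html", "deepseek__draft-a.html", "glm__draft-b.html",
]


def to_lanes(names):
    """Group flat file names into one column per subagent, drafts sorted."""
    lanes = defaultdict(list)
    for name in names:
        lane, draft = name.removesuffix(".html").split("__", 1)
        lanes[lane].append(draft)
    return {lane: sorted(drafts) for lane, drafts in sorted(lanes.items())}


lanes = to_lanes(files)
for lane, drafts in lanes.items():
    print(f"{lane:<10}", " | ".join(drafts))
```

The grouping itself is trivial. What matters is that a flat directory forces every reviewer to redo it by eye, while a layout does it once for everyone.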
A cheap subagent is a subagent running on a low-cost model such as DeepSeek V4 Flash, GLM-5.2, or Kimi, used for drafts, exploration, and speculative work where the cost is low enough to run several in parallel.
Cheap subagents need a visible surface because their value is in the exploration, not the summary. Subagents return only a single text result to the parent, so the drafts and sketches they produced are lost unless they are written somewhere persistent.
AgentCanvas gives subagents MCP tools to pin HTML docs, images, and video to a shared board. The subagent's full output stays visible to humans and to other agents instead of being discarded with the subagent's context window.
This works with Claude Code. Its subagents inherit MCP tools from the parent by default, so a subagent can call the AgentCanvas tools to write its work to the board.
Reserve frontier models for final implementation, security review, and anything that ships directly. Use cheap subagents for the speculative first passes and frontier models for the work that has to be right.