
TL;DR
Vercel just named its agent stack: AI Gateway, Sandbox, Flags, and Microfrontends. Here is how the four primitives compose, with code, and where each one actually fits in a real product.
For two years the conversation about what an "agent stack" actually means has been a list of vendors and a vague hand-wave. LangChain for orchestration, OpenAI for inference, some vector DB, some queue, some place to run untrusted code, some flagging system bolted on after the first incident. Every team rebuilt the same plumbing.
For the larger agent workflow map, read AI Agents Explained: A TypeScript Developer's Guide and How to Build AI Agents in TypeScript; they give the architecture and implementation context this piece assumes.
Vercel's agentic infrastructure announcement is the first time a major platform has named the primitives explicitly and shipped them as a coherent stack. Four pieces: AI Gateway for model routing, Sandbox for code execution, Flags for runtime control, and Microfrontends for composing UIs that agents render into. None of these are individually novel. The bet is that you want them as one platform with one auth model, one observability surface, and one billing line.
This post is a working developer's read of what was announced, what it looks like in code, and where the seams are. I am skeptical of platform consolidation as a default, but I think Vercel got the abstractions mostly right, and the parts they got wrong are the parts you can route around.
AI Gateway is a model router with a single OpenAI-compatible endpoint. Point your SDK at https://gateway.ai.vercel.com/v1, pass a model identifier like anthropic/claude-opus-4.7 or openai/gpt-5.3, and get back a response. The gateway handles failover, caching, rate limit smoothing across keys, and per-request cost tracking. You can also define routing policies - for example, "route reasoning-heavy prompts to opus, route summarization to haiku" - without rewriting your application code.
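To make the single-endpoint claim concrete: any OpenAI-compatible client should work by swapping the base URL. A minimal sketch - the endpoint and model ID come from the announcement, while the AI_GATEWAY_API_KEY variable name is my assumption, not a documented one.

import OpenAI from "openai";

// Any OpenAI-compatible SDK works once the base URL points at the gateway.
// AI_GATEWAY_API_KEY is an assumed env var name, not a documented one.
const client = new OpenAI({
  baseURL: "https://gateway.ai.vercel.com/v1",
  apiKey: process.env.AI_GATEWAY_API_KEY,
});

const completion = await client.chat.completions.create({
  // provider/model identifier; the gateway resolves routing and failover
  model: "anthropic/claude-opus-4.7",
  messages: [{ role: "user", content: "Summarize this diff in one line." }],
});

console.log(completion.choices[0].message.content);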
Sandbox is a microVM-backed code execution environment with a Node-like API. You hand it a snippet of Python, JavaScript, or shell, and it runs in an isolated VM with file system, network egress controls, and a 60-second to 30-minute lifetime depending on plan. This is the primitive every coding agent has been hand-rolling on top of Firecracker, E2B, or Modal. Vercel collapsed it.
Flags is the lightweight feature flag service Vercel has been quietly building for two years, now positioned as the runtime control plane for agents. Toggle which model an agent uses, which tools it can call, which prompt template applies to which user, all from a dashboard, all evaluated at the edge. There is no SDK weight beyond a tree-shakeable function call.
Microfrontends lets you compose UI from independently deployed apps. The agent angle is that an agent can render a generated UI fragment from a separate deployment without taking over the whole page. Think generative UI scoped to a region of your existing product.
Here is a minimal agent that uses three of the four primitives: AI Gateway for the model call, Sandbox for tool execution, and Flags for the kill switch.
import { generateText, tool } from "ai";
import { z } from "zod";
import { gateway } from "@vercel/ai-gateway";
import { Sandbox } from "@vercel/sandbox";
import { flag } from "@vercel/flags";

// Kill switch, evaluated per user at the edge. Defaults closed.
const codeExecEnabled = flag({
  key: "agent_code_exec",
  defaultValue: false,
});

export async function runAgent(userPrompt: string, userId: string) {
  // Model selection is a flag too, so a canary rollout or an
  // emergency downgrade is a dashboard change, not a deploy.
  const model = await flag({
    key: "agent_model",
    defaultValue: "anthropic/claude-sonnet-4.7",
  })();

  const result = await generateText({
    model: gateway(model),
    system: "You are a code agent. Use the run_code tool when needed.",
    prompt: userPrompt,
    tools: {
      run_code: tool({
        description: "Execute Python in a sandbox",
        parameters: z.object({ code: z.string() }),
        execute: async ({ code }) => {
          if (!(await codeExecEnabled({ user: userId }))) {
            return { error: "Code execution disabled for this user" };
          }
          // Explicit lifecycle: create, exec, destroy. The try/finally
          // guarantees the microVM is torn down even if exec throws.
          const sandbox = await Sandbox.create({ runtime: "python3.12" });
          try {
            return await sandbox.exec(code, { timeout: 30_000 });
          } finally {
            await sandbox.destroy();
          }
        },
      }),
    },
  });

  return result.text;
}
A few things worth pointing out.
The gateway(model) call returns a model object that the Vercel AI SDK already understands. There is no separate fetch client to manage. If the upstream provider 503s, the gateway transparently fails over to your configured fallback, which is set in the dashboard rather than in code. That is the right place for it because failover policy is an ops concern, not a code concern.
The flag evaluation happens at the edge with a typical latency of single-digit milliseconds. You can target by user ID, geography, cohort, or anything in the request. The agent_model flag in the example lets you do canary rollouts of new model versions to 5% of users without a deploy.
The Sandbox lifecycle is explicit. You create, you exec, you destroy. There is no ambient pool, which is good for predictability and bad if you are running thousands of short executions per second. For high-volume cases there is a Sandbox.persistent API that keeps a warm pool, but you pay for it.
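For the warm-pool case, the shape is roughly the following. Sandbox.persistent is the name from the announcement; the pool options and the acquire/release pair are my guesses at the surface, not documented API.

import { Sandbox } from "@vercel/sandbox";

// Hedged sketch: size, acquire, and release are assumed names.
const pool = await Sandbox.persistent({
  runtime: "python3.12",
  size: 8, // warm microVMs held ready; idle capacity is billed
});

const vm = await pool.acquire(); // skips the cold start
try {
  const out = await vm.exec("print(2 + 2)", { timeout: 5_000 });
  console.log(out);
} finally {
  await pool.release(vm); // back to the pool, not destroyed
}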
Routing is not free. AI Gateway adds a small markup on token costs and a per-request fee on top. For a high-volume product the math can flip back in favor of direct provider keys. Run the numbers before you migrate everything.
Sandbox cold starts are real. First execution of a runtime image is 600-900 ms. Subsequent executions in the same sandbox are sub-100 ms. If your agent calls a tool once per turn and turns are infrequent, you eat the cold start every time. The persistent pool is worth it when execution rate exceeds about 1 per minute per user.
Flags evaluated client-side leak. If you ship a Next.js page that reads a flag in the browser, the flag value is in the response. Use server-side evaluation for anything sensitive - model selection, tool gating, anything cost-related. The SDK supports both modes; pick the right one consciously.
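A minimal sketch of the server-side pattern, reusing the flag helper from the earlier example in a Next.js App Router page. The resolved value stays on the server; only derived, non-sensitive output ships to the browser.

// app/agent/page.tsx - a server component, so the flag resolves on
// the server and never appears in the client bundle.
import { flag } from "@vercel/flags";

const agentModel = flag({
  key: "agent_model",
  defaultValue: "anthropic/claude-sonnet-4.7",
});

export default async function AgentPage() {
  const model = await agentModel(); // evaluated per request, server-side
  // Pass only derived, non-sensitive state down to the client.
  const tier = model.includes("opus") ? "pro" : "standard";
  return <p data-tier={tier}>Agent ready.</p>;
}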
Microfrontends has the smallest agent story right now. It is genuinely useful for partitioning team ownership of a UI but the "agent renders a fragment" use case is more hype than substance today. There is no first-class generative UI primitive in the announcement; you can build one on top, but you are building it.
The right way to read this announcement is as a redrawing of the seams. Vercel is saying: model calls go through Gateway, code goes through Sandbox, behavior is controlled by Flags, UI is composed by Microfrontends. Everything else - orchestration, memory, evals, observability - is left to other tools or to your application code.
That is a defensible split. Orchestration frameworks (LangGraph, Mastra, the AI SDK itself) sit on top of Gateway. Memory layers (Mem0, Letta, your own pgvector) live alongside. Evals (Braintrust, Langfuse) consume the gateway's request logs. The platform takes the parts that have to be infrastructure and leaves the parts that benefit from competition.
Where I think it gets interesting for indie devs is the MCP angle. The natural complement to AI Gateway is a hosted MCP server registry - a place where you publish tools your agents can use, with auth and rate limits and observability. That is exactly what we built MCPaaS for: deploy an MCP server in one command, get a public endpoint, plug it into any agent runtime including Vercel's. The two stacks compose cleanly because Gateway treats MCP tools the same as any other tool call.
The other adjacent need is filesystem state for agents. Sandbox gives you ephemeral compute, but agents that work on files for hours need persistence and addressability. AgentFS is the DD product for that - a virtual filesystem with versioning that any sandbox or agent can mount. Vercel does not solve this problem and arguably should not; it is a different shape than what Sandbox is for.
I walked through the full stack composition on the Developers Digest YouTube channel, including a live build of an agent that uses all four primitives plus an external MCP server.
Here are the working patterns that have held up across three production agents I have shipped.
Define your model selection as a flag from day one. Even if you only have one model, wrap it. The day you want to A/B a new release or fail over to a cheaper model during a billing emergency you will be glad you did.
Treat Sandbox as the only place untrusted code runs, including code generated by your own agents. The temptation to "just eval this small Python snippet" in your Node process is the temptation that ends careers. The Sandbox primitive is cheap enough that there is no excuse.
Log every Gateway request to your own store, not just Vercel's. The gateway has good observability but vendor logs are not your logs. Pipe them into whatever you use for production telemetry. We pipe ours into a Postgres table partitioned by day and it is the single most useful debugging tool we have.
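A sketch of the write path, assuming a gateway_requests table range-partitioned by day on requested_at. The column names are ours, not anything Vercel exposes; map them from whatever metadata the gateway's logs give you.

import { Pool } from "pg";

const db = new Pool({ connectionString: process.env.DATABASE_URL });

// Mirror each gateway request into your own store.
export async function logGatewayRequest(entry: {
  userId: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  costUsd: number;
  latencyMs: number;
}) {
  // gateway_requests is range-partitioned by day on requested_at,
  // so this insert routes to the current day's partition.
  await db.query(
    `INSERT INTO gateway_requests
       (requested_at, user_id, model, prompt_tokens,
        completion_tokens, cost_usd, latency_ms)
     VALUES (now(), $1, $2, $3, $4, $5, $6)`,
    [
      entry.userId,
      entry.model,
      entry.promptTokens,
      entry.completionTokens,
      entry.costUsd,
      entry.latencyMs,
    ]
  );
}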
Use Flags for incident response, not just feature releases. When a model provider is degraded at 3am, the move is to flip a flag that routes around it, not to push a deploy. Build that muscle when nothing is on fire.
The open question is whether Vercel adds an opinionated orchestration layer. Right now they are deliberately neutral - the AI SDK is provider-agnostic, the gateway is router-only, and the rest is up to you. That is the right call for adoption but it leaves a gap that someone will fill. Either Vercel ships a Mastra-style framework as a first-party product or a third party becomes the default on top of the stack.
The other thing to watch is pricing on Sandbox at scale. MicroVM execution is genuinely expensive infrastructure. Either the price comes down as utilization improves or the high-volume case migrates to Modal, E2B, or self-hosted Firecracker. Vercel's bet is that the convenience of one platform will hold most workloads inside it.
Either way, the abstraction is now named. If you are designing an agent stack in 2026, these are the four boxes to start from. You can swap the implementations later. You cannot easily swap the architecture.