
TL;DR
AgentKit gives you Agent Builder, Connector Registry, and ChatKit. I rebuilt my newsletter-research agent on it. Here is where the visual canvas wins and where I bailed back to code.
| Resource | Link |
|---|---|
| OpenAI Platform Documentation | platform.openai.com/docs |
| OpenAI API Reference | platform.openai.com/docs/api-reference |
| OpenAI Agents Overview | platform.openai.com/docs/guides/agents |
| OpenAI Responses API | platform.openai.com/docs/api-reference/responses |
| OpenAI Pricing | openai.com/api/pricing |
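OpenAI's AgentKit launch was three products dressed up as one announcement. If you treat it as a single thing, you will be confused. If you split it apart, each piece has a clear job:
- Agent Builder: a visual canvas for designing agent workflows.
- Connector Registry: managed OAuth integrations for Gmail, Slack, GitHub, Notion, and more.
- ChatKit: an embeddable React component for agent UIs.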
For the design side of the same problem, read "OpenAI Codex: Cloud AI Coding With GPT-5.3" and "OpenAI vs Anthropic in 2026 - Models, Tools, and Developer Experience"; they show how agent-generated interfaces fail and how to give coding agents better visual constraints.
The shared value prop is: stop writing the boring 60% of every agent app - auth, UI, glue code - and concentrate on the actual logic. The question is whether the visual canvas is a feature or a tax.
I rebuilt my newsletter-research agent on AgentKit over a weekend. Below is what worked, what did not, and the decision tree I now use.
My newsletter agent does four things: pull RSS feeds, scrape new articles, cluster by topic, draft a digest. Here is the Agent Builder flow I ended up with:
[Trigger: Webhook] -> [Tool: RSS Fetch] -> [LLM: Filter Relevance, gpt-5.5]
  -> [Branch: relevance_score > 0.7?]
    -> yes -> [Tool: Firecrawl Scrape] -> [LLM: Summarize, gpt-5.3]
           -> [Tool: Embedding] -> [Tool: Cluster] -> [Human Approval]
           -> [LLM: Draft Newsletter] -> [Tool: Send via Resend]
    -> no -> [End]
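For comparison with the branch node in the flow above, here is a minimal sketch of the same decision in plain TypeScript. The `ScoredArticle` shape and the function name are hypothetical, chosen for illustration; only the `> 0.7` threshold comes from the flow itself.

```typescript
// Hypothetical shape for an article that has passed the relevance LLM step.
interface ScoredArticle {
  url: string;
  relevanceScore: number; // assumed to be in the range 0..1
}

// Threshold taken from the branch node in the flow.
const RELEVANCE_THRESHOLD = 0.7;

// Split scored articles into the "yes" path (scrape, summarize, cluster)
// and the "no" path (end of run).
function branchOnRelevance(articles: ScoredArticle[]): {
  keep: ScoredArticle[];
  drop: ScoredArticle[];
} {
  const keep: ScoredArticle[] = [];
  const drop: ScoredArticle[] = [];
  for (const article of articles) {
    // Strictly greater than, matching `relevance_score > 0.7`.
    if (article.relevanceScore > RELEVANCE_THRESHOLD) {
      keep.push(article);
    } else {
      drop.push(article);
    }
  }
  return { keep, drop };
}
```

Note that an article scoring exactly 0.7 takes the "no" path, because the comparison is strict.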
Three things became immediately obvious in the visual canvas:
Branching is dramatically clearer than code. When I had this in TypeScript with nested ifs, the relevance branch was buried 80 lines deep. On the canvas it is a single yellow diamond. New collaborators understand the flow in 30 seconds.
Versioning is built-in. Every save creates a numbered version. I can fork v12 to test a new prompt, run it side-by-side with prod v11, and promote when evals pass. Doing this in code means git branches plus a feature-flag system. Builder gives it to you free.
Debugging is a timeline, not a log file. When a run fails, you click the failed node and see the exact prompt, the model response, the token count, and the tool I/O. No more console.log archaeology.
For a side-by-side comparison of how this looks in a Claude Code-flavored designer, see Subagent Studio - same visual-first thesis, different model ecosystem.
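After two weeks I have a clear pattern. The canvas wins when:
- the workflow has multiple branches;
- non-engineers need to review the flow;
- you version prompts frequently;
- you need human approval steps;
- you need an OAuth-backed integration that the Connector Registry already covers.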
That last one is the sleeper feature. I was about to write Gmail OAuth for the newsletter agent. I deleted that ticket and used the Connector Registry's Gmail node. Token refresh, scope upgrade flow, error handling - all done.
Three places I bailed:
Custom embedding logic. My clustering uses a non-OpenAI embedding model (Voyage) plus a custom HDBSCAN. AgentKit's "custom tool" node lets you call an HTTP endpoint, but the round trip added 400ms per call and cost me a node on the canvas for what was a 20-line function. I exposed a single /cluster endpoint on my existing API and called it as one node. Canvas stayed clean, performance stayed good.
Tight loops. AgentKit nodes have per-execution overhead - roughly 100-200ms - that adds up if you are looping 50 times per run. My RSS fetch processes ~80 feeds. Doing that as 80 canvas iterations was wasteful. I batched the entire fetch into one custom-tool call and let my own code handle the loop.
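To put numbers on the loop overhead: 80 iterations at 100-200ms each is roughly 8 to 16 seconds of orchestration overhead per run, against a single node's overhead for one batched call. The sketch below shows that arithmetic and one way the batched handler could look. The function names and the injected `fetchFeed` callback are hypothetical, not part of AgentKit.

```typescript
// Per-node overhead range quoted in the text, in milliseconds.
const NODE_OVERHEAD_MS = { min: 100, max: 200 };

// Estimated orchestration overhead when every item is its own canvas iteration.
function loopOverheadMs(iterations: number): { min: number; max: number } {
  return {
    min: iterations * NODE_OVERHEAD_MS.min,
    max: iterations * NODE_OVERHEAD_MS.max,
  };
}

// Hypothetical batch handler behind a single custom-tool call: it receives
// every feed URL at once and runs the loop in our own code.
// `fetchFeed` is injected so the sketch stays independent of any RSS library.
async function fetchAllFeeds<T>(
  feedUrls: string[],
  fetchFeed: (url: string) => Promise<T>,
): Promise<{ ok: T[]; failed: string[] }> {
  const results = await Promise.all(
    feedUrls.map(async (url) => {
      try {
        return { url, value: await fetchFeed(url), ok: true as const };
      } catch {
        // One bad feed should not fail the whole batch.
        return { url, ok: false as const };
      }
    }),
  );
  const ok: T[] = [];
  const failed: string[] = [];
  for (const result of results) {
    if (result.ok) ok.push(result.value);
    else failed.push(result.url);
  }
  return { ok, failed };
}
```

`loopOverheadMs(80)` returns `{ min: 8000, max: 16000 }`, which is the cost the batching avoids.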
Streaming token-level logic. If you need to react to tokens as they stream (e.g. to cut off generation early on a stop sequence), AgentKit's node abstraction hides that. Drop to the Responses API directly for those.
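As an illustration of the kind of token-level control meant here, this is a minimal sketch of cutting off a stream when a stop sequence appears. It works on any async iterable of events, so it has no SDK dependency. The event shape (`type: "response.output_text.delta"` with a `delta` string) is modeled on Responses API streaming events and should be treated as an assumption to check against the API reference; `collectUntilStop` is a hypothetical helper name.

```typescript
// Minimal event shape, assumed to match streamed text-delta events.
interface TextDeltaEvent {
  type: string;
  delta?: string;
}

// Accumulate streamed text and stop as soon as `stopSequence` appears.
// Searching the accumulated text (not just the latest chunk) catches stop
// sequences that are split across two deltas.
async function collectUntilStop(
  events: AsyncIterable<TextDeltaEvent>,
  stopSequence: string,
): Promise<{ text: string; stoppedEarly: boolean }> {
  let text = "";
  for await (const event of events) {
    // Ignore lifecycle events and anything without text.
    if (event.type !== "response.output_text.delta" || !event.delta) continue;
    text += event.delta;
    const at = text.indexOf(stopSequence);
    if (at !== -1) {
      // Returning from inside `for await` calls the iterator's return(),
      // which gives the source a chance to close the underlying stream.
      return { text: text.slice(0, at), stoppedEarly: true };
    }
  }
  return { text, stoppedEarly: false };
}
```

In real use, the `events` argument would be the stream object returned by a streaming Responses API call, provided its events match the assumed shape.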
The pattern: Builder for the workflow, code for the hot loops and custom math. Same instinct as React server components - render the structure visually, push the heavy compute to a function.
ChatKit is the one I expected the least and got the most from. The basic embed:
import { ChatKit } from "@openai/chatkit-react";

export function NewsletterChat() {
  return (
    <ChatKit
      agentId="agent_abc123"
      apiKey={process.env.NEXT_PUBLIC_OPENAI_CHATKIT_KEY!}
      theme={{
        primary: "#FF4F8B",
        background: "#FFF8EE",
        font: "Geist",
      }}
      onToolCall={(call) => console.log("tool:", call.name)}
    />
  );
}
That is the full integration. You get streaming, tool-call rendering, file upload, message history, and a polished UI that matches your brand tokens. Before/after on my newsletter agent: the "before" was a 600-line custom React chat component with three streaming bugs. "After" is the snippet above plus 40 lines of theme config.
The one gotcha: ChatKit's API key is a publishable key scoped to a single agent. Do not paste your standard OPENAI_API_KEY in the browser. Generate a ChatKit-specific key in the dashboard.
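One cheap safeguard against that mistake is a guard that refuses anything that looks like a standard secret key before it reaches the component. This is a sketch under one assumption: standard OpenAI secret keys start with `sk-`, and ChatKit-specific keys do not. `assertNotSecretKey` is a hypothetical helper, not part of any SDK.

```typescript
// Reject values that look like a standard OpenAI secret key.
// Assumption: ChatKit-specific keys do not use the "sk-" prefix.
function assertNotSecretKey(key: string | undefined): string {
  if (!key) {
    throw new Error("ChatKit key is missing");
  }
  if (key.startsWith("sk-")) {
    throw new Error("Refusing to use a standard secret key in the browser");
  }
  return key;
}
```

It can wrap the environment lookup directly, for example `assertNotSecretKey(process.env.NEXT_PUBLIC_OPENAI_CHATKIT_KEY)`, so a misconfigured deploy fails loudly instead of leaking a key.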
Is this a one-off internal automation?
  -> Yes: AgentKit. The connector and approval nodes alone pay for themselves.
  -> No: continue.
Will non-engineers review or edit the flow?
  -> Yes: AgentKit. The canvas is the artifact they read.
  -> No: continue.
Do you need bare-metal control over streaming or model parameters?
  -> Yes: roll your own with the Responses API.
  -> No: AgentKit, drop to code only for hot paths.
Is your orchestration multi-tenant, multi-region, or > 100 RPS?
  -> Probably your own infra. AgentKit is fine for the first 90% - see
     [DD Orchestrator](https://orchestrator.developersdigest.tech) for when
     you need to own the runtime.
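The same tree can be written as a small function, which is handy for a team checklist or a planning script. This is one reading of the tree, not a definitive encoding: the first three questions pick the starting point, and the scale question is modeled as a separate flag because it qualifies the choice instead of replacing it. All type and field names are hypothetical.

```typescript
type Choice = "agentkit" | "responses-api" | "agentkit-with-code-hot-paths";

interface ProjectTraits {
  oneOffInternalAutomation: boolean;
  nonEngineersReviewFlow: boolean;
  needsLowLevelStreamingControl: boolean;
  // Multi-tenant, multi-region, or more than 100 requests per second.
  highScale: boolean;
}

function recommend(traits: ProjectTraits): {
  choice: Choice;
  considerOwnRuntime: boolean;
} {
  // The scale question applies regardless of which branch is taken.
  const considerOwnRuntime = traits.highScale;

  if (traits.oneOffInternalAutomation) {
    return { choice: "agentkit", considerOwnRuntime };
  }
  if (traits.nonEngineersReviewFlow) {
    return { choice: "agentkit", considerOwnRuntime };
  }
  if (traits.needsLowLevelStreamingControl) {
    return { choice: "responses-api", considerOwnRuntime };
  }
  return { choice: "agentkit-with-code-hot-paths", considerOwnRuntime };
}
```

The questions are checked in the order the tree lists them, so an internal one-off that also needs streaming control still lands on AgentKit.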
The honest answer for most builders shipping agent features in 2026: start in AgentKit, escape to code where it hurts. The "all visual" maximalists will hit walls; the "all code" purists are leaving days of OAuth plumbing on the table. The blended pattern wins.
For the full screen-recording walkthrough of building this newsletter agent on the canvas, the DevDigest YouTube channel has the AgentKit deep-dive. The canvas is one of those things where seeing it move beats reading about it.
AgentKit will not replace your code. It will replace the boring 60% of your code. That is enough.
Frequently asked questions
What is AgentKit?
AgentKit is three products in one: Agent Builder (a visual canvas for designing agent workflows), Connector Registry (managed OAuth integrations for Gmail, Slack, GitHub, Notion, and more), and ChatKit (an embeddable React component for agent UIs). Together they eliminate the repetitive 60% of agent development - auth, UI, and orchestration glue code.
How much does AgentKit cost?
AgentKit uses your existing OpenAI API credits. You pay per token for LLM calls (same rates as the Responses API), plus execution overhead for managed connectors. There is no separate AgentKit subscription. The Connector Registry connectors have rate limits tied to your API tier.
When should I use Agent Builder instead of code?
Use Agent Builder when your workflow has multiple branches, when non-engineers need to review the flow, when you version prompts frequently, or when you need human approval steps. Drop to code for tight loops (50+ iterations), custom embedding logic, streaming token-level control, or when per-node latency (100-200ms overhead) matters.
How does versioning work in Agent Builder?
Every save creates a numbered version automatically. You can fork any version to test changes, run multiple versions side-by-side, and promote a tested version to production. This replaces the need for git branches plus feature flags for workflow iteration.
Which connectors does the Connector Registry include?
The launch registry includes Gmail, Slack, GitHub, Notion, Linear, Google Drive, and Calendar. Each connector handles OAuth, token refresh, and scope management. You configure credentials once in the dashboard and reference the connector in your flow nodes.
How do I embed ChatKit in my app?
Import ChatKit from @openai/chatkit-react, pass your agent ID and a publishable ChatKit-specific API key (not your standard OPENAI_API_KEY), and customize the theme with your brand colors. A basic integration is under 20 lines of code and includes streaming, tool-call rendering, file upload, and message history.
Can I call my own code from Agent Builder?
Yes. Agent Builder has a "custom tool" node that calls any HTTP endpoint. For complex logic, expose a single endpoint on your own API and call it as one node. This keeps the canvas clean while letting your code handle heavy compute, custom models, or third-party integrations.
What are AgentKit's performance limits?
Each node adds 100-200ms execution overhead. For high-frequency loops, batch operations into a single custom-tool call. For multi-tenant or high-RPS deployments (100+ requests per second), you may need your own orchestration infrastructure. AgentKit works well for internal automations and moderate-scale production use.