
TL;DR
OpenAI's June deprecations put Agent Builder, hosted Evals, and reusable prompts on a November 30 shutdown path. Here is the practical migration plan: Agents SDK, repo-owned prompts, and eval receipts.
OpenAI just made the agent-builder lesson explicit: production agent workflows need to live closer to code.
The official deprecations page now lists three June 3, 2026 deprecations that agent teams should not ignore:
Agent Builder, the hosted Evals platform, and the reusable prompts (v1/prompts) API are scheduled to shut down on November 30, 2026. ChatKit remains available, and OpenAI points Agent Builder users toward the Agents SDK or ChatGPT Workspace Agents. For builders, the direction is clear: visual builders are useful for exploration, but the durable production surface is code, versioned prompts, and eval runs you can replay.
That does not make visual builders useless. It does mean you should not let your production agent logic live only in a hosted canvas.
Last updated: June 23, 2026
OpenAI's deprecations page is the source to read first. The relevant timeline:
| Surface | Deprecation announced | Read-only date | Shutdown date | Migration direction |
|---|---|---|---|---|
| Agent Builder | June 3, 2026 | Not listed | November 30, 2026 | Agents SDK or ChatGPT Workspace Agents |
| Evals platform | June 3, 2026 | October 31, 2026 | November 30, 2026 | Promptfoo or repo-owned eval workflows |
| Reusable prompts API | June 3, 2026 | Not listed | November 30, 2026 | Move prompt content into application code |
The Evals docs repeat the same point: the hosted Evals platform is being deprecated, existing eval content stays available during the transition window, and teams should look at alternatives if they are new to evaluations or want a more iterative environment.
This is not only a product cleanup. It changes the advice I would give any team building agents on OpenAI.
For the original builder-side take, read OpenAI AgentKit in Production. This post is the migration follow-up.
The lesson is not "never use visual tools."
The lesson is: do not let the only copy of your agent logic live in a hosted visual tool.
Production agents need the same boring properties as production code:
Hosted builders can accelerate discovery. They are great when a PM, designer, support lead, or ops teammate needs to see the shape of a workflow. They are less durable when they become the sole source of truth for branching logic, prompts, tool permissions, or evaluation criteria.
That is why this deprecation matters. It pushes the ecosystem toward a healthier split:
Before moving anything, list the state that currently lives outside the repo.
For Agent Builder, that usually means:
For Evals, it means:
For reusable prompt objects, it means:
Treat this like an API migration, not a copy-paste exercise. If a production service calls a prompt object by ID, the migration is not finished until that service reads a versioned prompt from code or config and has a rollback path.
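One way to picture "a versioned prompt from code or config with a rollback path" is the minimal sketch below. Everything in it is hypothetical: the `PromptConfig` shape, the `v1`/`v2` labels, and the prompt text are placeholders, not anything OpenAI prescribes. In a real service the config would likely come from your deploy system rather than a literal.

```typescript
// Hypothetical sketch: prompt versions live in code, a small config picks the active one.
type PromptVersions = Readonly<Record<string, string>>;

interface PromptConfig {
  active: string; // version currently served in production
  previous?: string; // version to fall back to if the active one misbehaves
}

interface ResolvedPrompt {
  version: string;
  text: string;
}

// Placeholder prompt text; real prompts would be reviewed alongside the code.
const supportSystemPrompt: PromptVersions = {
  v1: "You are a support agent. Escalate billing disputes to a human.",
  v2: "You are a support agent. Escalate billing disputes and legal threats to a human.",
};

// Replaces a hosted prompt-ID lookup with a lookup against repo-owned versions.
function resolvePrompt(versions: PromptVersions, config: PromptConfig): ResolvedPrompt {
  const text = versions[config.active];
  if (text === undefined) {
    throw new Error(`Unknown prompt version: ${config.active}`);
  }
  return { version: config.active, text };
}

// The rollback path: return a config that points at the previous version.
function rollback(config: PromptConfig): PromptConfig {
  if (config.previous === undefined) {
    throw new Error("No previous prompt version to roll back to");
  }
  return { active: config.previous };
}
```

Logging the resolved `version` with each run makes it possible to tie a behavior change back to a specific prompt change.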
OpenAI's deprecation guidance for reusable prompt objects is blunt: move reusable prompt content into your application code.
That does not mean scattering giant strings through handlers.
Use a small repo convention:
    agents/
      support/
        agent.ts
        prompts/
          system.md
          escalation.md
        evals/
          fixtures.jsonl
          rubric.md
The important part is not the exact folder name. The important part is that prompts get reviewed with the code that depends on them.
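As a rough sketch of how code might consume that convention, the loader below builds a path from the agent and prompt name and reads the file through an injected reader. The `ReadText` type, `promptPath`, and `loadPrompt` are hypothetical helpers, and the in-memory map stands in for the filesystem so the example runs on its own; in a real service the reader would presumably wrap a file read.

```typescript
// Hypothetical helper names; the path layout is the agents/<agent>/prompts/<name>.md convention.
type ReadText = (path: string) => string;

function promptPath(agent: string, name: string): string {
  return `agents/${agent}/prompts/${name}.md`;
}

// Loads a prompt through the injected reader and rejects empty files early.
function loadPrompt(read: ReadText, agent: string, name: string): string {
  const path = promptPath(agent, name);
  const text = read(path).trim();
  if (text.length === 0) {
    throw new Error(`Empty prompt file: ${path}`);
  }
  return text;
}

// In-memory stand-in for the repo; the prompt text is a placeholder.
const files: Record<string, string> = {
  "agents/support/prompts/system.md": "You are a support agent.\n",
};

const readFromMemory: ReadText = (path) => {
  const text = files[path];
  if (text === undefined) {
    throw new Error(`Missing prompt file: ${path}`);
  }
  return text;
};

const systemPrompt = loadPrompt(readFromMemory, "support", "system");
```

Injecting the reader keeps the loader testable without touching disk, which also makes it easy to reuse inside an eval harness.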
Good prompt files should include:
That pairs naturally with OpenAI Agents SDK for TypeScript, where agent definitions, tools, handoffs, guardrails, and structured outputs already live in code.
The Agents SDK is the obvious destination when the workflow is owned by engineers.
The current SDK docs emphasize a few primitives that map well from visual builders:
| Builder concept | Code-first replacement |
|---|---|
| node | function, tool, agent step, or handoff |
| branch | normal control flow or guardrail |
| human approval | human-in-the-loop checkpoint |
| connector | MCP server, hosted tool, or typed integration |
| visual run trace | SDK tracing and saved receipts |
| workflow version | git commit and deployment version |
This is not always a one-to-one migration. A visual canvas often has too many tiny nodes because each node is easy to add. Code lets you collapse low-value nodes into one function and expose only the actual decision points.
The migration rule:
Keep branch decisions explicit.
Batch mechanical steps into code.
Preserve approval gates.
Preserve tool permission boundaries.
Preserve trace receipts.
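The five rules above can be sketched in plain TypeScript. This is deliberately not the Agents SDK API: the refund scenario, the type names, and the `runSupportFlow` function are all hypothetical, chosen only to show where each rule lands once a canvas becomes code.

```typescript
// Hypothetical sketch in plain TypeScript; none of these names come from the Agents SDK.
type ToolName = "lookupOrder" | "issueRefund";
type Outcome = "answered" | "refunded" | "awaiting_approval" | "tool_denied";

interface Ticket {
  id: string;
  text: string;
  refundAmount: number;
}

interface TraceEvent {
  step: string;
  detail: string;
}

interface RunResult {
  outcome: Outcome;
  trace: TraceEvent[];
}

// Batch mechanical steps: trim, lowercase, and classify collapse into one function.
function wantsRefund(ticket: Ticket): boolean {
  return ticket.text.trim().toLowerCase().includes("refund");
}

function runSupportFlow(
  ticket: Ticket,
  allowedTools: ReadonlySet<ToolName>,
  approved: boolean,
  approvalThreshold = 100,
): RunResult {
  const trace: TraceEvent[] = [];
  const log = (step: string, detail: string): void => {
    trace.push({ step, detail });
  };

  // Keep branch decisions explicit.
  if (!wantsRefund(ticket)) {
    log("branch", "no refund requested");
    return { outcome: "answered", trace };
  }
  log("branch", "refund requested");

  // Preserve tool permission boundaries.
  if (!allowedTools.has("issueRefund")) {
    log("permission", "issueRefund is not allowed for this agent");
    return { outcome: "tool_denied", trace };
  }

  // Preserve approval gates.
  if (ticket.refundAmount > approvalThreshold && !approved) {
    log("approval", `refund of ${ticket.refundAmount} awaits human approval`);
    return { outcome: "awaiting_approval", trace };
  }

  // Preserve trace receipts: the tool call is recorded rather than executed here.
  log("tool", `issueRefund(${ticket.id}, ${ticket.refundAmount})`);
  return { outcome: "refunded", trace };
}
```

The returned `trace` is the piece worth keeping after the run, since it records which branch, permission check, and gate produced the outcome.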
For the SDK-side architecture details, read Agents SDK Evolution and Managed Agents vs LangGraph vs DIY.
The Evals platform deprecation is the more important deadline for serious teams.
An agent without evals is just a workflow you hope still works.
When you move evals out of the hosted dashboard, do not only move the final score. Move the evidence. A useful eval receipt should include:
| Receipt field | Why it matters |
|---|---|
| fixture ID | ties the run to a stable test case |
| baseline version | prevents comparing against a moving target |
| candidate version | maps behavior to a branch or commit |
| model and tool config | explains why behavior changed |
| inputs and expected behavior | keeps the task reviewable |
| run trace | shows tool calls, retries, and decisions |
| score and rubric notes | separates correctness, safety, cost, and style |
| cost and latency | prevents expensive "wins" from hiding |
That is the point of baseline receipts. You are not trying to recreate a pretty dashboard first. You are trying to preserve enough evidence that a developer can replay the important claim.
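A receipt along those lines might be typed as in the sketch below. The field names, the JSONL storage choice, and the example values (including the `example-model` placeholder and the version labels) are assumptions for illustration, not a format OpenAI or Promptfoo defines.

```typescript
// Hypothetical receipt shape mirroring the fields in the table above.
interface EvalReceipt {
  fixtureId: string;
  baselineVersion: string;
  candidateVersion: string;
  config: { model: string; tools: string[] };
  input: string;
  expectedBehavior: string;
  trace: string[]; // tool calls, retries, and decisions in order
  score: number;
  rubricNotes: string;
  costUsd: number;
  latencyMs: number;
}

// One receipt per line, so receipts can be appended to and diffed in the repo.
function toJsonl(receipts: EvalReceipt[]): string {
  return receipts.map((receipt) => JSON.stringify(receipt)).join("\n");
}

// Parses receipts back; a real harness would also validate each parsed object.
function fromJsonl(text: string): EvalReceipt[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as EvalReceipt);
}

// Placeholder values only; nothing here is a measured result.
const exampleReceipt: EvalReceipt = {
  fixtureId: "refund-over-limit-001",
  baselineVersion: "builder-export",
  candidateVersion: "candidate-branch",
  config: { model: "example-model", tools: ["lookupOrder", "issueRefund"] },
  input: "Please refund my order",
  expectedBehavior: "Pause for human approval before issuing the refund",
  trace: ["branch: refund requested", "approval: awaiting human"],
  score: 1,
  rubricNotes: "Correct gate; tone acceptable",
  costUsd: 0.01,
  latencyMs: 1200,
};
```

Storing receipts as plain text next to `fixtures.jsonl` keeps the evidence reviewable in the same pull request as the change that produced it.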
OpenAI's docs point to Promptfoo as one migration path. That is reasonable if your evals are prompt and output focused. If your agent uses tools, files, browsers, sandboxes, or multi-step state, you may need a custom harness around the SDK so the eval can capture the whole run.
OpenAI's deprecations page says Agent Builder users can continue with the Agents SDK or ChatGPT Workspace Agents.
That split matters.
Use code-first Agents SDK when:
Use Workspace Agents when:
The mistake is treating those as interchangeable. They are not. One is a developer runtime. The other is a workspace automation surface.
There is a fair counterargument: code-first systems exclude the people who understand the process.
Support leaders, product managers, sales engineers, data analysts, and operations teams often know the workflow better than the engineer implementing it. A visual builder gives them a shared artifact. A TypeScript file does not.
That is why the answer is not "delete the canvas."
The better pattern is dual-surface:
The visual artifact explains the workflow. The repo owns the workflow.
That distinction becomes more important as agent systems grow. A diagram can show intent. Code and evals prove what actually runs.
Use the deprecation clock to force discipline:
Day 1: Inventory. Export every Agent Builder flow, hosted eval, prompt object, and service call that references those IDs.
Day 2: Freeze baselines. Save representative successful and failed runs before changing anything. Capture inputs, outputs, tool calls, cost, latency, and human notes.
Day 3: Move prompts. Put prompts in the repo with owners, review rules, and version history. Replace prompt-ID lookups with file or config loading.
Day 4: Rebuild the loop. Implement the agent in the Agents SDK, LangGraph, or your own loop. Preserve approval gates and tool boundaries first, then optimize.
Day 5: Replay evals. Run the old baseline against the new implementation. Do not ship until the candidate beats or matches baseline behavior on the cases that matter.
That is the practical standard. Not "we copied the graph." The standard is "we can prove the migrated agent behaves at least as well as the old one."
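The Day 5 gate can be expressed as a small check, sketched here under assumptions: scores are numbers where higher is better, each fixture carries a hypothetical `critical` flag for "the cases that matter", and a missing candidate result counts as a regression.

```typescript
// Hypothetical score shape; higher scores are assumed to be better.
interface FixtureScore {
  fixtureId: string;
  score: number;
  critical: boolean; // marks "the cases that matter"
}

// Returns the critical fixtures where the candidate is missing or scores below baseline.
function findRegressions(
  baseline: FixtureScore[],
  candidate: FixtureScore[],
  tolerance = 0,
): string[] {
  const candidateScores = new Map<string, number>();
  for (const result of candidate) {
    candidateScores.set(result.fixtureId, result.score);
  }

  const regressed: string[] = [];
  for (const base of baseline) {
    if (!base.critical) continue;
    const score = candidateScores.get(base.fixtureId);
    if (score === undefined || score + tolerance < base.score) {
      regressed.push(base.fixtureId);
    }
  }
  return regressed;
}

// Ship only when no critical fixture regressed.
function canShip(baseline: FixtureScore[], candidate: FixtureScore[], tolerance = 0): boolean {
  return findRegressions(baseline, candidate, tolerance).length === 0;
}
```

Wiring `canShip` into CI turns "matches or beats baseline" from a judgment call into a check that fails the build and names the regressed fixtures.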
Yes, Agent Builder is being deprecated. OpenAI's deprecations page says the deprecation was announced on June 3, 2026, and Agent Builder is scheduled to shut down on November 30, 2026. ChatKit remains available.
The hosted Evals platform is being deprecated. OpenAI's docs say existing evals become read-only on October 31, 2026, and the platform is scheduled to shut down on November 30, 2026.
For product-owned workflows, move the production loop into the OpenAI Agents SDK, LangGraph, or a repo-owned agent loop. For internal workspace automations where non-engineer editing matters, evaluate ChatGPT Workspace Agents.
Move reusable prompt content into application code or repo-managed prompt files. Keep prompts versioned, reviewed, and tied to the eval fixtures that protect their behavior.
No, this does not make visual builders obsolete. They remain useful for prototyping, design reviews, and non-engineer collaboration. The change is where production truth should live: in code, reviewed prompts, deploy history, and replayable eval receipts.