
TL;DR
A deep comparison of Codex's new /goal loop and Claude managed agents outcomes, with practical workflow examples, control tradeoffs, and migration guidance for long-running tasks.
Best for
Developers comparing real tool tradeoffs before choosing a stack.
Covers
Verdict, tradeoffs, pricing signals, workflow fit, and related alternatives.
Last updated: June 11, 2026. Every version, availability, and plan-gating claim below was re-verified against the official sources in this table on June 11, 2026.
| Source | What to verify |
|---|---|
| OpenAI Codex changelog | Release notes for Codex CLI including /goal and persisted workflow behavior |
| OpenAI Codex docs hub | Current Codex product surfaces (web, CLI, cloud tasks, integrations) |
| Codex CLI slash commands | /goal command reference and related CLI control commands |
| OpenAI Codex pricing | Current pricing and plan details |
| Claude Managed Agents overview | Beta status, required beta header, and availability for managed agents |
| Claude outcomes docs | Outcomes definitions, rubric shape, iteration caps, and result states |
| Anthropic pricing | Current pricing and plan details |
There are two similar-sounding approaches to making long-running agents less flaky.
- Codex introduced /goal in version 0.128.0, and it is now a documented core slash command (Codex CLI is at 0.139.0 as of June 9, 2026, verified June 11, 2026 via the Codex changelog).
- Claude Managed Agents ships outcomes as part of its public beta, where you define what "done" looks like and a grader loop works toward it.
They are both about keeping the loop going until quality is actually acceptable, but they solve it at different layers.
If you are here to choose a workflow, the short answer is:
- Use Codex /goal when you want a coding agent to keep making progress inside a terminal session, especially across repo edits, tests, retries, and interruptions.
- Use Claude outcomes when the task needs an explicit acceptance rubric or audit trail.
For adjacent decisions, read the broader Codex vs Claude Code comparison, the Claude Code vs Codex side-by-side page, the April Codex changelog analysis, and the AI agent frameworks guide. If you are asking whether Codex can handle tasks beyond code, read Codex as a general-purpose AI agent. If cost is the deciding factor, start with the pricing hub.
If you want to practice the execution side instead of just comparing control loops, the Agentic Coding course covers decomposition, multi-agent workflows, and production agentic development patterns.
/goal
Codex's own changelog says 0.128.0 added persisted /goal workflows with app-server APIs, model tools, runtime continuation, and TUI controls to create, pause, resume, and clear goals (OpenAI Codex changelog).
Since then, /goal has stabilized into a documented core command. The slash commands reference lists /goal <objective>, /goal pause, /goal resume, and /goal clear, with goal objectives capped at 4,000 characters (verified June 11, 2026). Releases 0.137.0 and 0.138.0 kept tuning the loop with templates for goal steering prompts, idle continuation, and fixes that stop goals auto-continuing after terminal turn failures.
That sounds simple in a headline, but the interesting part is the implementation shape:
- Goal state has explicit lifecycle controls (create, pause, resume, clear), and the CLI can continue work without you typing follow-up prompts each turn.
The older pattern was: send a goal, let the model act a bit, stop, send the next command. /goal tries to invert that pattern so the agent keeps iterating in one execution envelope until stop criteria are met.
The old problem is usually one of loop boundary leakage: every time the loop stops, a human has to step back in and restart it.
A persisted goal narrows this by formalizing loop continuation and reducing "human re-entry overhead."
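To make the continuation idea concrete, here is a minimal sketch of a persisted goal as a small state machine. It is a toy model under stated assumptions, not the Codex implementation: `PersistedGoal`, `GoalStatus`, `run_goal`, and the `max_turns` budget are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class GoalStatus(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    DONE = "done"


@dataclass
class PersistedGoal:
    """Hypothetical model of goal state; not how Codex stores goals."""
    objective: str
    status: GoalStatus = GoalStatus.ACTIVE
    turns: int = 0

    def pause(self) -> None:
        if self.status is GoalStatus.ACTIVE:
            self.status = GoalStatus.PAUSED

    def resume(self) -> None:
        if self.status is GoalStatus.PAUSED:
            self.status = GoalStatus.ACTIVE


def run_goal(goal: PersistedGoal,
             take_turn: Callable[[int], bool],
             max_turns: int = 10) -> PersistedGoal:
    """Keep taking turns with no human prompt in between.

    Stops when take_turn reports the stop criteria are met, when the
    goal is paused, or when the (assumed) turn budget runs out.
    """
    while goal.status is GoalStatus.ACTIVE and goal.turns < max_turns:
        goal.turns += 1
        if take_turn(goal.turns):  # True means the stop criteria are met
            goal.status = GoalStatus.DONE
    return goal


# Example: the work is finished on the third turn.
finished = run_goal(PersistedGoal("fix failing tests"), lambda turn: turn >= 3)
print(finished.status.value, finished.turns)
```

The point of the sketch is where the state lives: the loop reads status from the goal object rather than from a human typing the next prompt, so pausing and resuming do not lose the objective or the turn count.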
/goal likely shines
From the release context and existing Codex command model:
- /statusline and /title editing during active turns, which points to a richer TUI-centered workflow control plane.
- /goal ships in the standard CLI command set rather than behind a plan gate, while Codex plan gating applies to cloud surfaces (cloud tasks, code reviews, and integrations require ChatGPT Plus or above per the pricing page, verified June 11, 2026).
I read that as: /goal is primarily a tooling-loop enhancement around coding agent endurance.
Claude Managed Agents ships outcomes as part of its public beta, where you define what "done" looks like and the system works toward that target with a grader loop.
The managed agents documentation says outcomes let you tell the agent what "done" looks like; a grader then scores the work per criterion in a separate context window until the outcome is satisfied or max iterations are hit (Claude managed agents outcomes, verified June 11, 2026).
Key details on that page:
- Requests require the managed-agents-2026-04-01 beta header (the SDK sets it automatically).
- max_iterations is optional, defaulting to 3 with a cap of 20.
- The result is one of five states (satisfied, needs_revision, max_iterations_reached, failed, or interrupted).
This is not just "keep looping." It is closed-loop evaluation with explicit quality criteria.
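The grader loop described above can be sketched as a plain function. This is a local simulation under stated assumptions, not the Managed Agents API: `run_outcome`, `work`, and `grade` are hypothetical names, the lower bound of 1 on `max_iterations` is an assumption, and only two of the five documented result states are modeled.

```python
from typing import Callable, Dict, List, Tuple

DEFAULT_MAX_ITERATIONS = 3  # documented default
MAX_ITERATIONS_CAP = 20     # documented cap


def run_outcome(work: Callable[[List[str]], dict],
                grade: Callable[[dict], Dict[str, bool]],
                max_iterations: int = DEFAULT_MAX_ITERATIONS) -> Tuple[str, int]:
    """Iterate until every rubric criterion passes or iterations run out.

    work(failing) produces a draft, given the criteria that failed last time.
    grade(draft) returns a pass/fail verdict per criterion (hypothetical shape).
    """
    if not 1 <= max_iterations <= MAX_ITERATIONS_CAP:
        raise ValueError("max_iterations must be between 1 and 20")
    failing: List[str] = []
    for iteration in range(1, max_iterations + 1):
        draft = work(failing)
        scores = grade(draft)
        failing = [name for name, ok in scores.items() if not ok]
        if not failing:
            return "satisfied", iteration
    return "max_iterations_reached", max_iterations


def example_work(failing: List[str]) -> dict:
    # Adds the regression test only after the grader flags it as missing.
    return {"builds": True,
            "has_regression_test": "has_regression_test" in failing}


# The draft itself doubles as the per-criterion verdict in this toy example.
print(run_outcome(example_work, dict))
```

The design choice worth noticing is that termination depends on the grader's verdict, not on the worker deciding it is finished, which is the difference between rubric-graded stopping and plain continuation.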
Let's compare from a design perspective.
- /goal (Codex): runtime/command-oriented termination via agent loop and manual controls (pause, clear, budget limits in feature context). It sounds like loop completion is driven by model judgment plus command state.
- outcome (Claude): outcome status is externally graded against rubric criteria in a separate context. That makes termination a function of measured rubric satisfaction.
- /goal quality is implicit, shaped by your prompt and agent context.
- /goal integrates with existing Codex sessions and CLI continuity (especially useful when you already live inside a terminal loop).
- Outcomes surfaces evaluation events (span.outcome_evaluation_*) that are useful for observability and audit.
- /goal is a command feature and likely lighter to adopt if you already standardize around Codex in a repo.
Use /goal when: you need persistent CLI execution with many shell passes.
- pause, status, final diff.
This is the right shape when your objective is operational execution and tool orchestration speed.
Use outcomes when: you need objective quality checks.
This is the right shape when acceptance is judgment-heavy and you need repeatability.
Hybrid approach:
- goal: first pass that extracts stack trace clusters and prepares candidate fixes.
- outcome: second pass with a rubric requiring reproduction steps, a regression test, and evidence artifact links.
This gives the endurance of /goal plus rubric-level correctness from outcomes.
goal as output quality control
/goal is excellent for keeping work moving, but without explicit criteria it can optimize for forward motion over quality nuance.
Outcomes still depend on rubric design. Bad rubric = bad stopping decision.
Codex has token/continuation limits in feature work; outcomes has max iterations (default 3, max 20) and explicit failed/max_iterations_reached result states.
Managed Agents is still in beta, and Anthropic notes behaviors may be refined between releases. Plan for fallback runbooks.
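If you are currently on Codex-only loops, start with:
- Test /goal in a staging workspace and measure iteration count, interruption frequency, and budget exhaustion.
- Add manual checkpoint artifacts after each loop.
If you then add managed-agent workloads:
- Define rubric templates as versioned files: one minimal safety/format rubric and one full business quality rubric.
- Emit outcome IDs and evaluation summaries to your telemetry store.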
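For managed-agent workloads, versioned rubric templates and telemetry records could look roughly like the sketch below. Everything here is an assumption for illustration: the `RubricTemplate` shape, the criteria strings, the `name@version.json` file naming, and the telemetry fields are hypothetical and do not reflect the rubric format the outcomes docs define.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RubricTemplate:
    """Hypothetical rubric shape; check the outcomes docs for the real one."""
    name: str
    version: str
    criteria: List[str]


# One minimal safety/format rubric and one full business quality rubric.
MINIMAL = RubricTemplate(
    name="minimal-safety-format",
    version="1.0.0",
    criteria=["no secrets in output", "output matches the expected format"],
)
FULL = RubricTemplate(
    name="full-business-quality",
    version="1.0.0",
    criteria=MINIMAL.criteria + [
        "reproduction steps included",
        "regression test included",
        "evidence artifact links included",
    ],
)


def to_versioned_file(template: RubricTemplate) -> Tuple[str, str]:
    """Return (filename, JSON body) so the rubric can be committed to a repo."""
    filename = f"{template.name}@{template.version}.json"
    return filename, json.dumps(asdict(template), indent=2, sort_keys=True)


def telemetry_record(outcome_id: str, rubric: RubricTemplate,
                     result: str, summary: str) -> Dict[str, str]:
    """Build the record you would send to your own telemetry store."""
    return {
        "outcome_id": outcome_id,
        "rubric": f"{rubric.name}@{rubric.version}",
        "result": result,
        "summary": summary,
    }


filename, body = to_versioned_file(MINIMAL)
print(filename)
```

Pinning the rubric name and version into each telemetry record is what lets you later explain why a given run stopped where it did, even after the rubric has changed.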
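If you are choosing where to start right now:
- For terminal-native development work such as migrations, test fixes, or repo refactoring, start with /goal.
- If the task needs an explicit acceptance rubric or audit trail, start with Claude outcomes.
This is the sharp distinction:
- /goal = loop state as runtime control.
- outcomes = output quality as rubric-graded control.
They are converging, but they are not redundant yet.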
For teams building production automations, the highest-leverage stack is often both:
- Codex /goal for "keep going and recover from interruption."
- Claude outcomes when handoff quality must be measurable and rubric-traceable.

/goal graduated from a 0.128.0 feature rollout into a documented core slash command with pause, resume, and clear plus a 4,000 character objective cap, and Codex CLI reached 0.139.0 on June 9, 2026 (verified June 11, 2026 via the Codex changelog and slash commands reference).
Recent releases fixed /goal edit multiline paste and stopped goals auto-continuing after terminal turn failures. The June 2026 Codex changelog breakdown covers the full release train.
Claude Managed Agents is in public beta behind the managed-agents-2026-04-01 header, outcomes documents a default of 3 and max of 20 iterations plus a fifth interrupted result, and only MCP tunnels and dreaming remain gated research previews (verified June 11, 2026 via the Managed Agents overview).

Codex /goal is a runtime control feature built around persisted workflows, runtime continuation, and TUI controls for creating, pausing, resuming, and clearing goal state. Claude managed outcomes is a quality control feature that uses explicit rubrics to grade whether work meets acceptance criteria before stopping. Use /goal for persistent execution, outcomes for measurable deliverables.
Use Codex /goal when your task is terminal-native development work like migrations, test fixes, or repo refactoring where the primary need is durable continuation through repo edits and shell-driven repair cycles. If your task needs an explicit acceptance rubric or audit trail, use Claude outcomes instead.
Yes, a hybrid approach is often the best choice for long-running, high-stakes work. Use Codex /goal for the execution phase where the agent needs to keep making progress across shell commands and test cycles. Then use Claude outcomes as a final quality gate with explicit rubric criteria. This gives you both execution endurance and measurable correctness.
Start by testing /goal in a staging workspace to measure iteration count, interruption frequency, and budget exhaustion. Add manual checkpoint artifacts after each loop. When adding managed-agent workloads, define rubric templates as versioned files: one minimal safety/format rubric and one full business quality rubric. Emit outcome IDs and evaluation summaries to your telemetry store.
The main limitation is that /goal optimizes for forward motion, not output quality. Without explicit acceptance criteria, it can complete work that passes tests but misses quality nuance. It also has token and continuation limits that may halt complex tasks, and goal objectives are capped at 4,000 characters (point the goal at a file for longer instructions). For quality-critical workflows, pair /goal with a rubric-based quality check or use Claude outcomes.
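The 4,000 character cap and the "point the goal at a file" workaround can be sketched as a small helper. This is an illustration, not part of the Codex CLI: `goal_command`, the `GOAL.md` filename, and the wording of the pointer objective are hypothetical, and the sketch assumes the cap is inclusive.

```python
from typing import Optional, Tuple

GOAL_OBJECTIVE_CAP = 4000  # characters, per the slash commands reference


def goal_command(objective: str,
                 spill_file: str = "GOAL.md") -> Tuple[str, Optional[str]]:
    """Return (command, file_body).

    Short objectives go inline and file_body is None. Longer ones are
    returned as file_body, to be saved at spill_file, and the command
    only points at that file.
    """
    objective = objective.strip()
    if len(objective) <= GOAL_OBJECTIVE_CAP:
        return f"/goal {objective}", None
    return f"/goal Follow the instructions in {spill_file}", objective


command, file_body = goal_command("Fix the flaky integration tests")
print(command)
```

Returning the file body instead of writing it keeps the helper free of side effects, so the caller decides where the instructions file lives in the repo.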
Claude managed outcomes depends entirely on rubric design - a bad rubric leads to bad stopping decisions. The platform is also still in beta, so you should plan for fallback runbooks. The managed-agent API requires the managed-agents-2026-04-01 beta header and has max iteration limits (default 3, max 20). When the grader returns max_iterations_reached or failed, you need a recovery path.
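A recovery path can start as an explicit mapping from result state to next step. The five state names come from the outcomes docs as quoted above; the action names and the choice of action per state are hypothetical, and treating needs_revision as "still iterating" is an assumption.

```python
RECOVERY_ACTIONS = {
    "satisfied": "accept",
    "needs_revision": "keep_iterating",          # assumption: not a terminal state
    "max_iterations_reached": "escalate_to_human",
    "failed": "run_fallback_runbook",
    "interrupted": "resume_or_retry",
}


def next_action(result: str) -> str:
    """Map an outcome result state to a hypothetical recovery action."""
    try:
        return RECOVERY_ACTIONS[result]
    except KeyError:
        # Unknown states should be loud: the platform is in beta and
        # behaviors may change between releases.
        raise ValueError(f"unrecognized outcome result: {result!r}") from None


print(next_action("max_iterations_reached"))
```

Failing loudly on an unknown state is deliberate here: a silent default would hide exactly the kind of behavior change a beta platform can introduce.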
For production, the highest-leverage stack is often both: use Codex /goal for "keep going and recover from interruption" during execution, then use Claude outcomes when handoff quality must be measurable and rubric-traceable. This combines the execution resilience of /goal with the quality assurance of rubric-graded outcomes.
The public docs do not provide a clean apples-to-apples price formula. Codex is bundled with ChatGPT plans (Go at $8/month, Plus at $20/month, Pro from $100/month, Business pay as you go), with heavier usage drawing from credits, so treat Codex /goal cost as the underlying session and model usage on your plan and check OpenAI Codex pricing before estimating. Treat Claude outcomes as managed-agent API usage plus outcome evaluation usage, then check Claude API pricing. For budget-sensitive work, start with the AI coding tools pricing comparison and the pricing hub.