
Codex /goal vs. Claude Managed Agents Outcomes
There are two similar-sounding directions for making long-running agents less flaky.
Codex shipped persisted /goal workflows in version 0.128.0; Claude managed agents shipped outcomes as a research preview. They are both about keeping the loop going until quality is actually acceptable, but they solve it at different layers.
Codex's /goal

Codex's own changelog says 0.128.0 added persisted /goal workflows with app-server APIs, model tools, runtime continuation, and TUI controls to create, pause, resume, and clear goals (OpenAI Codex changelog).
That sounds simple in a headline, but the interesting part is the implementation shape:
- Goal state is persisted via app-server APIs, so the goal survives across turns.
- The model gets goal-management tools and TUI controls (create, pause, resume, clear), and the CLI can continue work without you typing follow-up prompts each turn.

The older pattern was: send a goal, let the model act a bit, stop, send the next command. /goal inverts that pattern so the loop keeps iterating in one execution envelope until stop criteria are met.
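To make the shape concrete, here is a minimal Python sketch of the inverted pattern. Every name in it (GoalState, run_step, run_goal) is hypothetical illustration, not Codex internals; the point is the control flow, where the goal persists and the loop owns continuation rather than the human.

```python
from dataclasses import dataclass

# Hypothetical sketch, not Codex internals. The goal persists across
# iterations instead of ending at each turn, and the stop criteria live
# inside one execution envelope.

@dataclass
class GoalState:
    description: str
    status: str = "active"      # active | paused | cleared | done
    iterations: int = 0
    max_iterations: int = 50    # budget guard, analogous to loop limits

def run_step(goal: GoalState) -> bool:
    """One agent pass: plan, edit files, run tools. Returns True when the
    model itself judges the goal met (model judgment, not external grading)."""
    goal.iterations += 1
    ...  # call the model, apply edits, run shell passes
    return False

def run_goal(goal: GoalState) -> GoalState:
    # Old pattern: one human prompt per step.
    # New pattern: keep iterating until a stop criterion fires.
    while goal.status == "active" and goal.iterations < goal.max_iterations:
        if run_step(goal):
            goal.status = "done"
    return goal
```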
The old problem is usually one of loop boundary leakage: the agent halts at each turn boundary and waits for a human to re-prompt it, losing momentum and shedding context on every hand-off.
A persisted goal narrows this by formalizing loop continuation and reducing "human re-entry overhead."
Where /goal likely shines

Judging from the release context and the existing Codex command model: the same release also enabled /statusline and /title editing during active turns, which points to a richer TUI-centered workflow control plane.

I read that as: /goal is primarily a tooling-loop enhancement around coding-agent endurance.
Claude managed agents: outcomes

Claude managed agents expose outcomes as a research preview: you define what "done" looks like, and the system works toward that target with a grader loop.
The managed agents documentation says outcomes let you tell the agent what "done" looks like, then run per-criterion grading in a separate context window until the outcome is satisfied or max iterations are hit (Claude managed agents outcomes).
Key details in that page:
- You define outcome criteria up front; they are the rubric for "done."
- Grading runs in a separate context window from the working agent.
- Every run ends in an explicit terminal status (satisfied, needs_revision, max_iterations_reached, failed).

This is not just "keep looping." It is closed-loop evaluation with explicit quality criteria.
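Here is a minimal Python sketch of that closed loop, following the documented shape: per-criterion grading in a separate context, iteration until satisfied or a max-iteration cap, and an explicit terminal status. All names (Criterion, do_work, grade, run_outcome) are hypothetical, not the managed-agents API.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    name: str
    description: str  # what the grader checks, e.g. "includes a regression test"

def do_work(feedback: list[str]) -> str:
    """The working agent's pass; revises based on failed-criterion feedback."""
    ...
    return ""

def grade(criterion: Criterion, work: str) -> bool:
    """Per-criterion check. Per the docs, grading runs in a separate context
    window, so the grader sees the work product, not the worker's history."""
    ...
    return False

def run_outcome(criteria: list[Criterion], max_iterations: int = 5) -> str:
    feedback: list[str] = []
    for _ in range(max_iterations):
        work = do_work(feedback)
        failed = [c for c in criteria if not grade(c, work)]
        if not failed:
            return "satisfied"
        # unmet criteria drive the next revision (needs_revision territory)
        feedback = [c.description for c in failed]
    return "max_iterations_reached"  # "failed" would cover hard errors
```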
Let's compare from a design perspective.
- /goal (Codex): runtime/command-oriented termination via the agent loop and manual controls (pause, clear, budget limits in the feature context). Loop completion sounds like it is driven by model judgment plus command state.
- outcome (Claude): status is externally graded against rubric criteria in a separate context. That makes termination a function of measured rubric satisfaction.
- /goal quality is implicit, shaped by your prompt and agent context.
- /goal integrates with existing Codex sessions and CLI continuity (especially useful when you already live inside a terminal loop).
- Outcomes emit per-criterion evaluation telemetry (span.outcome_evaluation_*) that is useful for observability and audit.
- /goal is a command feature and likely lighter to adopt if you already standardize around Codex in a repo.

Use /goal when: you need persistent CLI execution with many shell passes and you want manual loop controls (pause, status, final diff).

This is the right shape when your objective is operational execution and tool-orchestration speed.
Use outcomes when: you need objective quality checks.
This is the right shape when acceptance is judgment-heavy and you need repeatability.
Hybrid approach:
- /goal: first pass that extracts stack-trace clusters and prepares candidate fixes.
- outcome: second pass with a rubric requiring reproduction steps, a regression test, and evidence artifact links.

This gives the endurance of /goal plus rubric-level correctness from outcomes, as sketched below.
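A hypothetical glue layer, reusing the GoalState and run_outcome sketches above: an endurance pass shaped like /goal feeds a rubric-graded pass shaped like outcomes. Neither tool exposes this as a single API today.

```python
# Hypothetical orchestration reusing the earlier sketches.

def hybrid_triage(crash_report: str) -> str:
    # Pass 1 (/goal-style): iterate until candidate fixes exist.
    goal = run_goal(GoalState(
        description=f"Cluster stack traces and draft fixes for: {crash_report}",
    ))
    if goal.status != "done":
        return "failed"  # endurance pass never produced candidates

    # Pass 2 (outcome-style): accept only when the rubric is satisfied.
    rubric = [
        Criterion("repro", "includes reproduction steps"),
        Criterion("regression", "adds a regression test"),
        Criterion("evidence", "links evidence artifacts"),
    ]
    return run_outcome(rubric, max_iterations=3)
```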
Caveats

- Don't treat /goal as output quality control. It is excellent for keeping work moving, but without explicit criteria it can optimize for forward motion over quality nuance.
- Outcomes still depend on rubric design. A bad rubric means a bad stopping decision (see the sketch after this list).
- Budgets differ: Codex has token/continuation limits in feature work; outcomes have max iterations and explicit failed/max_iterations_reached result states.
- Outcomes is explicitly a research preview. Plan for fallback runbooks.
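On the rubric point, reusing the Criterion sketch from above: the difference between a rubric that stops on vibes and one that stops on checkable facts looks like this (all criteria invented for illustration).

```python
# Invented examples. A vague rubric forces the grader to guess; a sharp
# rubric grades checkable facts, so the stopping decision is trustworthy.
bad_rubric = [
    Criterion("quality", "the fix looks good"),
]

good_rubric = [
    Criterion("repro", "reproduction steps run and reproduce the crash"),
    Criterion("test", "a regression test fails before the fix and passes after"),
    Criterion("scope", "no files outside the crash path were modified"),
]
```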
If you are currently on Codex-only loops, start with:
- /goal in a staging workspace.

If you then add managed-agent workloads, put the judgment-heavy acceptance steps behind outcomes with a small, checkable rubric.
If you are choosing where to start right now, start with /goal.

This is the sharp distinction:

- /goal = loop state as runtime control.
- outcomes = done-ness as measured rubric satisfaction.

They are converging, but they are not redundant yet.
For teams building production automations, the highest-leverage stack is often both:
/goal for "keep going and recover from interruption."/goal workflows and related items): https://developers.openai.com/codex/changelogTechnical content at the intersection of AI and development. Building with AI agents, Claude Code, and modern dev tools - then showing you exactly how it works.