Both shipped within days of each other.
Same command name, same general idea.
Used both enough this week to have actual opinions. Short version: they don’t work the same way.
What even is /goal?
If you use Claude Code or Codex, you know the loop. Task starts, runs for a bit, stops. You type “continue.” It runs again. You type “continue” again. For anything non-trivial that can mean 30 or so manual prompts to finish one thing. /goal kills that loop.
You write one condition — what done looks like — and the tool keeps working across turns until that condition is met. You don’t sit there prompting. You don’t watch it. You walk away and come back to a finished result.
OpenAI shipped it in Codex CLI in late April 2026. Anthropic followed in Claude Code shortly after. Same name, different approaches.
What each one does differently
Codex attaches the goal to the current thread and tracks it while the task runs. Close your laptop, reopen it, /goal resume — it’s still there. Persisted locally. More like pinning a long-term target and leaving it to run autonomously.
Claude Code is more opinionated about it. After every turn, a small fast model — Haiku by default — reads the transcript and answers one question: is the goal met? If no, Claude starts the next turn automatically. If yes, it stops.
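To make the evaluator loop concrete, here is a minimal sketch of the pattern described above. It is not Claude Code's actual implementation: `run_turn` and `goal_met` are hypothetical stand-ins for the main agent and the small evaluator model, and the `max_turns` cap is my own addition so the sketch can't spin forever.

```python
from typing import Callable, List


def run_until_goal(
    run_turn: Callable[[List[str]], str],        # hypothetical: main agent does one turn
    goal_met: Callable[[str, List[str]], bool],  # hypothetical: small evaluator, yes/no
    condition: str,
    max_turns: int = 20,                         # assumption: safety cap, not from any tool
) -> dict:
    """Keep running turns until the evaluator says the condition holds."""
    transcript: List[str] = []
    for turn in range(1, max_turns + 1):
        transcript.append(run_turn(transcript))
        # After every turn, ask the one question: is the goal met?
        if goal_met(condition, transcript):
            return {"done": True, "turns": turn, "transcript": transcript}
    return {"done": False, "turns": max_turns, "transcript": transcript}


# Toy stand-ins so the loop can be run without any model behind it.
def fake_turn(transcript: List[str]) -> str:
    step = len(transcript) + 1
    return "tests pass" if step == 3 else f"step {step}: still working"


def fake_evaluator(condition: str, transcript: List[str]) -> bool:
    return condition in transcript[-1]


result = run_until_goal(fake_turn, fake_evaluator, "tests pass")
print(result["done"], result["turns"])
```

The point of the shape: the thing doing the work and the thing deciding "done" are separate, and the decision only ever sees the transcript plus the condition you wrote.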
So: Codex /goal — attach a long-term target to the thread, experimental, hands-off. Claude Code /goal — set a verifiable stop condition, evaluated every turn, auto-continues until done.
What that actually feels like
Codex wins on autonomy. Longer reliable runs, fewer interruptions, better at true background execution. If you want to walk away for two hours, Codex is more ready for that right now.
Claude Code is smarter about whether it’s done, but tends to need more steering. More check-ins, more moments where it wants input before continuing. The evaluator loop is genuinely good; it’s just more collaborative than fully autonomous.
Neither is “set it and forget it” in the way people imagine. But Codex gets closer.
The important stuff: the condition is everything
Vague goal = loop that never resolves = token bill that hurts.
Bad: “Clean up the codebase.” Nothing to evaluate. Runs forever.
Good: “Migrate the payment module to the new API, tests pass, git diff only touches payment-related files. Stop after 20 turns if not done.”
Three things every good condition needs:
- a clear completed state
- a way to verify it
- what not to touch
The turn limit matters too. Without it, a goal that can’t resolve just burns tokens until you notice.
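One way to keep yourself honest is to fill in those parts as separate fields before writing the sentence. A rough sketch, assuming nothing about either tool: `GoalCondition` and its field names are made up for illustration, and the rendered text is just a plain string you would paste after /goal yourself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GoalCondition:
    """Hypothetical helper: the three parts of a condition plus a turn limit."""
    done_state: str                                       # the clear completed state
    verification: List[str]                               # how to prove it
    off_limits: List[str] = field(default_factory=list)   # what not to touch
    max_turns: int = 20                                   # turn limit

    def problems(self) -> List[str]:
        """List what is missing; an empty list means all parts are present."""
        issues = []
        if not self.done_state.strip():
            issues.append("no completed state")
        if not self.verification:
            issues.append("no way to verify")
        if not self.off_limits:
            issues.append("nothing marked off-limits")
        if self.max_turns <= 0:
            issues.append("no turn limit")
        return issues

    def render(self) -> str:
        """Join the parts into one plain-text condition."""
        sentences = [self.done_state.strip().rstrip(".") + "."]
        if self.verification:
            sentences.append("Verify: " + "; ".join(self.verification) + ".")
        if self.off_limits:
            sentences.append("Do not touch: " + "; ".join(self.off_limits) + ".")
        sentences.append(f"Stop after {self.max_turns} turns if not done.")
        return " ".join(sentences)


goal = GoalCondition(
    done_state="Migrate the payment module to the new API",
    verification=["tests pass", "git diff only touches payment-related files"],
    off_limits=["anything outside the payment module"],
)
vague = GoalCondition(done_state="Clean up the codebase", verification=[])

print(goal.render())
print(vague.problems())
```

Run against the "clean up the codebase" example, the check flags exactly what makes it a bad goal: no way to verify and nothing marked off-limits.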
Use cases where it actually removes work
The ones where it clicked for me:
- Messy file backlog. Had 60+ images/PDFs sitting in a folder for months. Set a /goal: extract date, amount, category from each, build a clean expenses.csv, generate a spend summary by category and month. Came back to a finished CSV. Zero prompting.
- Scraping and categorising. Give it a list of URLs or a folder of raw data, then set a condition for what the clean output looks like. I used it for a content analysis of 200+ videos, including scripts, descriptions and engagement metrics. It works through the whole thing without you babysitting it.
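For the expenses case, "what the clean output looks like" can be checked with a few lines of code, which is what makes it a good /goal. A sketch of such a check, with assumptions flagged: the column names `date`, `amount`, `category` follow the fields mentioned above, the `YYYY-MM-DD` date format is my guess, and the sample rows are invented for the demo.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

REQUIRED = ("date", "amount", "category")  # assumed column names


def summarize(csv_text: str) -> dict:
    """Validate the CSV and total the spend per (category, month)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    totals = defaultdict(float)
    for row in reader:
        # Assumed date format; strptime raises ValueError on anything else,
        # which is the behaviour you want from a verification step.
        month = datetime.strptime(row["date"], "%Y-%m-%d").strftime("%Y-%m")
        totals[(row["category"], month)] += float(row["amount"])
    return dict(totals)


# Invented sample rows, only here to exercise the function.
sample = """date,amount,category
2026-03-02,12.50,food
2026-03-15,7.50,food
2026-04-01,40.00,travel
"""

summary = summarize(sample)
for (category, month), total in sorted(summary.items()):
    print(category, month, f"{total:.2f}")
```

If a check like this is easy to write, the condition is probably concrete enough to hand over.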
Not a good fit: the requirement is still fuzzy, the task needs judgment calls, it touches risky files, or “done” is subjective. If you can’t write what done looks like, don’t use /goal.
The actual shift happening here
/goal is a small feature that points at something bigger. These tools are moving from interactive assistants to continuously executable work units. Before, you stayed nearby. It got stuck, you prompted. It finished a step, you told it to continue. /goal compresses that into a completion condition and lets the agent figure out the turns.
The catch: writing a good /goal is harder than writing a prompt. You’re not just describing a task; you’re defining what done means, how to prove it, and what to leave alone. The job shifts from “keep telling it to continue” to “define what done means.” Bigger ask. Better use of your time.