I let Codex run for 6 hours. Here’s what happened.

Categories: AI, Product

Summary

Codex's new /goal feature enables autonomous AI tasks by creating self-verifying loops that run for hours without human intervention. The host ran a coding task of more than five hours fully autonomously for the first time, solving the micromanagement problem that plagued previous AI coding agents.

Key Takeaways

  1. Goals differ fundamentally from prompts: instead of turn-based request-response cycles, goals create loops where AI checks its work, decides next steps, and continues until it gathers evidence the goal is met.
  2. The /goal feature solved the 'keep going, try the next thing' problem—if you find yourself repeatedly telling AI agents to continue, you need goals instead of manual prompting.
  3. Lifecycle management matters: use /goal start task, /goal to check status, /goal pause/resume to manage runaway tasks, and /goal remove to cancel—giving you control over multi-hour autonomous runs.
  4. Goal prompting requires product thinking, not engineering thinking: define measurable success criteria and outcomes rather than just tasks—success depends on well-defined, specific goal statements.
  5. Before /goal in Codex, multi-hour autonomous AI coding runs were impractical; the host's first use ran for 5 hours 45 minutes, which suggests the feature makes overnight AI task execution feasible.
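
The loop described in takeaway 1 (do some work, check for evidence, continue until the goal is met) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Codex implements /goal: the names Goal, Status, run_goal, and fix_one are hypothetical, and the pause and remove states only loosely mirror the lifecycle commands in takeaway 3.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Status(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    DONE = "done"
    REMOVED = "removed"


@dataclass
class Goal:
    """Hypothetical goal: a statement plus a measurable success check."""
    statement: str
    is_met: Callable[[dict], bool]  # True once there is evidence the goal is met
    status: Status = Status.ACTIVE
    log: List[str] = field(default_factory=list)


def run_goal(goal: Goal, step: Callable[[dict], str], state: dict,
             max_steps: int = 100) -> Status:
    """Work, then check, and repeat until done, paused, removed, or out of budget."""
    for _ in range(max_steps):
        if goal.status is not Status.ACTIVE:
            return goal.status
        goal.log.append(step(state))   # do one unit of work and record it
        if goal.is_met(state):         # gather evidence against the success criterion
            goal.status = Status.DONE
    return goal.status


def fix_one(state: dict) -> str:
    """Stand-in for an agent step: pretend to fix one failing test."""
    state["failing_tests"] -= 1
    return f"{state['failing_tests']} tests still failing"


# Usage: the success criterion is measurable (zero failing tests), not a task list.
demo_state = {"failing_tests": 3}
demo_goal = Goal("all tests pass", lambda s: s["failing_tests"] == 0)
run_goal(demo_goal, fix_one, demo_state)
```

The success check is a plain function over observable state, which is the point of takeaway 4: the loop can only stop on its own if "done" is something it can measure.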

Transcript Excerpt

Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today, I'm going to walk through my favorite feature in my most recent favorite AI product: goals in Codex. If you've been wondering how all these people on the timeline are getting their AI to run "overnight" or handle very complex, long-running tasks, I'm going to show you that goals is the answer. We're going to walk through what it is, how I might use it, and a technical use case, along with some non-technical examples of how goals can help you even if you're not coding. Let's get to it. This episode is brought to you by Mercury. As an AI founder, I'm constantly tracking run rate, watching revenue growth, paying vendors, and making sure I'm getting…