
Autonomous coding loops need small stories and fast feedback to work

The Ralph pattern ships 13 user stories in 1 hour by decomposing work into context-window-sized tasks with explicit acceptance criteria and test-based feedback

Ryan Carson — Ralph / Autonomous Coding Loop · 18 connections

The Ralph pattern (originally created by Geoff Huntley, popularized by Ryan Carson) — a bash loop that reads a task list, implements, tests, commits, and repeats — shipped 13 user stories in 1 hour. But the loop itself is trivial. The real insight is what makes it work: stories small enough to fit in one context window, explicit acceptance criteria, and test suites that provide fast binary feedback (pass/fail) — a direct application of why Verification is the single highest-leverage practice for agent-assisted coding. But the quality of those test suites matters: Evaluate agent tools with real multi-step tasks, not toy single-call examples shows that toy single-step checks miss the integration flaws that chained, multi-step tasks expose.
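The loop's shape can be sketched in a few lines of bash. This is an illustrative reconstruction, not the original script: `run_agent` and `run_tests` are placeholders for the real coding-agent CLI and test suite, and the plain-text `tasks.txt` stands in for the story list.

```shell
#!/usr/bin/env bash
# Sketch of the Ralph loop: read task list, implement, test, record, repeat.
# run_agent and run_tests are placeholders, not the real implementation.
set -euo pipefail
cd "$(mktemp -d)"

# A toy story list standing in for the real backlog
printf '%s\n' "story-1" "story-2" "story-3" > tasks.txt
: > progress.txt

run_agent() { echo "implementing: $1"; }  # placeholder for the agent CLI call
run_tests() { true; }                     # placeholder for the real test suite

while read -r story; do
  # skip stories already marked done in the progress log
  grep -qx "$story" progress.txt && continue
  run_agent "$story"
  # binary pass/fail feedback gates the commit point
  if run_tests; then
    echo "$story" >> progress.txt   # in the real loop: git commit here
  fi
done < tasks.txt

cat progress.txt
```

The key property is that the loop itself carries no state: everything the next iteration needs lives in the files it re-reads.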

Each iteration starts with fresh context (reading updated prd.json and progress.txt), which directly addresses The context window is the fundamental constraint — everything else follows — the agent never accumulates stale context because it resets every cycle. This same principle applies within a single session: Separate research from implementation to preserve context quality argues that even research and implementation should run in separate sessions, because exploration tokens pollute the implementation context. The learnings from each iteration compound via progress.txt, connecting to Compound engineering makes each unit of work improve all future work.
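The fresh-context reset is easiest to see as prompt assembly: every cycle rebuilds the prompt from the current files rather than appending to a conversation. A minimal sketch, in which the file names follow the article (prd.json, progress.txt) but the file contents and prompt wording are invented for illustration:

```shell
#!/usr/bin/env bash
# Sketch: each iteration assembles a fresh prompt from current state only,
# so no stale exploration tokens carry over between cycles.
# File contents below are illustrative, not from the original setup.
set -euo pipefail
cd "$(mktemp -d)"

echo '{"next": "add login form"}' > prd.json
echo 'learning: API tests need a seeded DB' > progress.txt

build_prompt() {
  # fresh context every cycle: only the task list and accumulated learnings
  printf 'Task list:\n%s\n\nLearnings so far:\n%s\n' \
    "$(cat prd.json)" "$(cat progress.txt)"
}

build_prompt > prompt.txt
cat prompt.txt
```

Because progress.txt grows across iterations while the conversation does not, learnings compound without the context window ever filling with history.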

The advanced version (nightly two-part loop) shows the full vision: a compound review job extracts learnings into AGENTS.md, then an auto-compound job implements the top priority using those fresh learnings. You wake up to draft PRs. This is Persistent agent memory preserves institutional knowledge that walks out the door with employees made autonomous — the agent both learns and applies its learnings without human intervention. Karpathy’s Autoresearch adds a sharper framing: Rollback safety nets enable autonomous iteration — not model intelligence — the key enabler isn’t agent intelligence but automatic rollback that makes failures free. And Time-bounded evaluation forces optimization for real-world usefulness instead of idealized performance shows that fixed time budgets per experiment force agents to optimize for real-world throughput rather than idealized quality.
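The two-part nightly structure might look like the sketch below. The job split and the AGENTS.md file follow the article; `review_agent`, `build_agent`, and `priorities.txt` are hypothetical placeholders for the real agent invocations and priority source.

```shell
#!/usr/bin/env bash
# Sketch of the nightly two-part loop. Agent calls and priorities.txt
# are placeholders; only the job structure follows the article.
set -euo pipefail
cd "$(mktemp -d)"

echo 'fix: flaky auth test' > priorities.txt   # hypothetical priority queue

review_agent() { echo "- prefer integration tests over mocks"; }  # placeholder
build_agent()  { echo "drafted PR for: $1"; }                     # placeholder

# Part 1: compound review -- distill the day's work into AGENTS.md
review_agent >> AGENTS.md

# Part 2: auto-compound -- implement the top priority using fresh learnings
top=$(head -n 1 priorities.txt)
build_agent "$top" > pr.txt
cat pr.txt
```

Part 2 always runs after Part 1, so each night's implementation work starts from the latest distilled learnings rather than raw history.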
