
The context window is the fundamental constraint — everything else follows

Every best practice in AI coding (subagents, /clear, focused tasks, spec files) traces back to managing a single scarce resource: context.

Anthropic Official Best Practices · Jan 15, 2025 · 22 connections

Claude’s 200k-token context window holds significantly fewer than 200k usable tokens before performance degrades. This single constraint explains every effective AI coding practice: subagents isolate research in a separate context, /clear resets between unrelated tasks, spec files persist knowledge across context resets, and one objective per conversation keeps focus sharp.

The concept of “context engineering”, a level beyond prompt engineering, captures this well. Where “Context is the product, not the model” argues that context is the product differentiator, the claim here is more operational: context is the bottleneck you’re always managing. Treat it like RAM: precious, limited, and requiring active garbage collection. This is the basis for “Treat an agent as an operating system, not a stateless function”: the full OS mental model (RAM as context, hard drive as persistent memory, garbage collection as decay) turns context management from ad hoc to systematic. The economics compound the constraint: without prompt caching (see “Prompt caching makes long context economically viable”), every conversation turn resends the full context at full price, making large context windows prohibitively expensive in practice.
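To make the "resends the full context at full price" point concrete, here is a minimal arithmetic sketch. Everything in it is an assumption for illustration: the function name `input_cost_units`, the token counts, and the `cached_read_multiplier` discount are made up, and cache-write surcharges and output tokens are ignored. It only shows how input cost grows when each turn replays the whole history.

```python
def input_cost_units(
    system_tokens: int,
    tokens_per_turn: int,
    turns: int,
    cached_read_multiplier: float = 1.0,
) -> float:
    """Total input cost in 'full-price token' units over a conversation.

    Each turn sends the entire history again. Tokens already seen in an
    earlier turn are billed at cached_read_multiplier (1.0 = no caching);
    tokens that are new this turn are billed at full price.
    All parameters are hypothetical, chosen only for illustration.
    """
    total = 0.0
    seen = 0  # tokens already sent in earlier turns
    for turn in range(turns):
        # The system prompt is new only on the first turn.
        new = tokens_per_turn + (system_tokens if turn == 0 else 0)
        total += seen * cached_read_multiplier + new
        seen += new
    return total


# Illustrative numbers only: 10k-token system prompt, 2k new tokens per turn.
no_cache = input_cost_units(10_000, 2_000, turns=50)
with_cache = input_cost_units(10_000, 2_000, turns=50, cached_read_multiplier=0.1)
print(no_cache, with_cache)
```

Under these made-up numbers, 50 turns cost about 3.05 million full-price token units without caching versus roughly 0.4 million when the replayed prefix is billed at a tenth of the price. The uncached total grows quadratically with the number of turns, which is why long sessions get expensive even when each individual turn is small.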

The practical implication: common failure patterns (kitchen-sink sessions, correction loops, over-specified CLAUDE.md, trust-then-verify gaps, infinite exploration) are all context management failures. Fixing any of them means respecting the fundamental constraint. This is why “Declarative beats imperative when working with agents” matters operationally: verbose step-by-step instructions waste tokens that could carry actual working context. The most direct architectural response is “One session per contract beats long-running agent sessions”: using task contracts as session boundaries means cross-contract context bloat never accumulates in the first place.
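The session-per-contract idea can be sketched as a small bookkeeping model. This is not any tool's actual API: `Session`, `run_contracts`, `estimate_tokens`, and the `USABLE_FRACTION` threshold are hypothetical names and values, and the 4-characters-per-token estimate is a crude heuristic rather than a real tokenizer. The sketch only illustrates that each contract starts from the spec alone, so context from one contract never leaks into the next.

```python
from dataclasses import dataclass, field

CONTEXT_WINDOW = 200_000   # nominal window size mentioned in the text
USABLE_FRACTION = 0.5      # assumption: illustrative threshold, not a measured value


def estimate_tokens(text: str) -> int:
    """Crude heuristic (about 4 characters per token); a real tokenizer differs."""
    return max(1, len(text) // 4)


@dataclass
class Session:
    """A fresh context for one task contract, seeded only with the spec."""
    spec: str
    messages: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # The spec is the only thing that survives across sessions.
        self.messages.append(self.spec)

    def add(self, message: str) -> None:
        self.messages.append(message)

    @property
    def tokens(self) -> int:
        return sum(estimate_tokens(m) for m in self.messages)

    @property
    def over_budget(self) -> bool:
        return self.tokens > CONTEXT_WINDOW * USABLE_FRACTION


def run_contracts(spec: str, contracts: dict) -> dict:
    """Run each contract in its own session; return final token use per contract."""
    usage = {}
    for name, work in contracts.items():
        session = Session(spec)  # fresh context: nothing carries over
        for message in work:
            session.add(message)
        usage[name] = session.tokens
    return usage


usage = run_contracts(
    spec="x" * 400,
    contracts={"add-login": ["y" * 4000], "fix-typo": ["z" * 800]},
)
print(usage)
```

Because every contract gets a new `Session`, the token count for the second contract reflects only the spec plus its own work, not the 4,000 characters accumulated by the first. A long-running session would instead carry both.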

Connected Insights

References (5)

→ Context is the product, not the model
→ Declarative beats imperative when working with agents
→ One session per contract beats long-running agent sessions
→ Prompt caching makes long context economically viable
→ Treat an agent as an operating system, not a stateless function

Referenced by (17)

← Compression should be a forking lifecycle event, not a destructive rewrite
← Separate tool registration from tool exposure — install broadly, reveal narrowly
← Spec files are external memory that survives context resets
← Agentic search beats RAG for live codebases
← Skill graphs enable progressive disclosure for complex domains
← Prompt caching makes long context economically viable
← Autonomous coding loops need small stories and fast feedback to work
← Treat AI like a distributed team, not a single assistant
← Treat an agent as an operating system, not a stateless function
← Tiered retrieval prevents context overload — summaries first, details on demand
← Tools are a new kind of software — contracts between deterministic systems and non-deterministic agents
← Tool design is continuous observation — see like an agent
← Context inefficiency compounds three penalties: cost, latency, and quality degradation
← CLAUDE.md should be a routing table, not a knowledge base
← Time-bounded evaluation forces optimization for real-world usefulness instead of idealized performance
← Uncorrelated context windows are a form of test time compute — fresh perspectives multiply capability
← Reasoning evaporation permanently destroys agent decision chains when the context window closes