A common sentiment holds that stronger models will absorb the scaffolding around them, eventually making harnesses obsolete. Harrison Chase argues the opposite: the scaffolding needed in 2023 (RAG chains, LangGraph flows) has been replaced, not eliminated — by agent harnesses like Claude Code, Deep Agents, Codex, OpenCode, and Letta Code. The concrete evidence is that Claude Code’s leaked source code weighs 512k lines; even the makers of the best model in the world invest heavily in harness engineering. Web search “built into” OpenAI and Anthropic APIs is itself just a lightweight harness doing tool calling behind the curtain.
This reframes harness engineering as durable infrastructure, which connects to Agents learn at three distinct layers — model weights, harness code, and context configuration — harness is a first-class learning surface, not a transitional artifact. It also anchors A mediocre agent inside a strong harness outperforms a stronger agent inside a messy one: if harnesses persist, investing in their design is not premature optimization. For builders, the implication is to pick a harness stance deliberately rather than hoping the question disappears. It also explains why Engineering is no longer the junior partner — at the frontier, research and engineering have fused — if the harness, eval, and data pipeline are durable infrastructure rather than disposable scaffolding, the researcher who can build them is the one whose hypotheses actually get tested.