Bustamante identifies the accuracy cliff that kills AI products: “the gap between 80% and 99% accuracy is often infinite in practice.” The first 80% is achievable with a good model and reasonable prompts, and it is enough for a compelling demo. The last 19% requires edge-case engineering, domain-specific guardrails, human-in-the-loop escalation, robust error handling, and production telemetry: fundamentally different work from what got you to the demo.
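To make the contrast concrete, here is a minimal sketch of what that last-19% scaffolding can look like around a single model call. Everything here is illustrative, not a prescription: `model.generate(query)` is a hypothetical interface returning a `(text, confidence)` pair, and the classifier, threshold, and blocked-topic list are stand-ins for real domain engineering.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("agent")

CONFIDENCE_FLOOR = 0.85              # hypothetical threshold; tuned per domain
BLOCKED_TOPICS = {"medical_dosage"}  # hypothetical guardrail list for a regulated domain

@dataclass
class Result:
    answer: str | None
    escalated: bool  # True => routed to a human review queue

def classify_topic(query: str) -> str:
    """Stub classifier; a real system would use a trained domain model."""
    return "medical_dosage" if "dosage" in query.lower() else "general"

def answer_with_guardrails(query: str, model) -> Result:
    # Guardrail: regulated topics escalate before the model is ever called.
    if classify_topic(query) in BLOCKED_TOPICS:
        log.info("guardrail escalation for query: %s", query)
        return Result(answer=None, escalated=True)
    # Error handling: model failures escalate instead of crashing or guessing.
    try:
        draft, confidence = model.generate(query)  # hypothetical (text, score) interface
    except Exception:
        log.exception("model call failed; escalating")
        return Result(answer=None, escalated=True)
    # Human-in-the-loop: low-confidence answers go to a reviewer, not the user.
    if confidence < CONFIDENCE_FLOOR:
        log.info("low confidence %.2f; escalating", confidence)
        return Result(answer=None, escalated=True)
    # Telemetry: every outcome is logged, so the escalation rate is measurable.
    log.info("answered with confidence %.2f", confidence)
    return Result(answer=draft, escalated=False)
```

The design point is that every branch either answers or escalates; nothing fails silently, which is what makes the system’s real accuracy observable rather than assumed.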
This maps directly to Model-market fit comes before product-market fit — without it, no amount of product excellence drives adoption: the 80% demo creates the illusion of model-market fit, and the climb to 99% reveals whether the model can actually serve the market’s real requirements. Regulated industries (legal, finance, healthcare) sit squarely in this gap, where 80% accuracy is a liability, not a product.

For engineering teams, the implication is that Verification is the single highest-leverage practice for agent-assisted coding is not optional optimization but survival infrastructure: without systematic verification, you never know where you actually sit on the 80-to-99 spectrum. Teams that ship the 80% demo as a product discover the gap through customer churn rather than internal metrics.

The structural reason the gap persists is that Verification is a Red Queen race — optimizing against a fixed eval contaminates it: the eval suites used to measure the gap degrade as the agent optimizes against them, turning the 80/99 gap into a permanently moving target rather than a fixed distance to close.
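One hedged sketch of what systematic measurement plus contamination control could look like, assuming eval cases are simple input/expected-output dicts: iterate freely against a development eval, but keep a sealed holdout that is scored rarely and retired once its results have leaked into prompts or training data. The function names and split policy here are assumptions for illustration, not a standard API.

```python
import random

def accuracy(agent, eval_set) -> float:
    """Fraction of cases the agent answers correctly: the only honest reading
    of where you sit on the 80-to-99 spectrum."""
    return sum(agent(case["input"]) == case["expected"] for case in eval_set) / len(eval_set)

def split_evals(cases, holdout_frac=0.5, seed=0):
    """Split cases into a development eval (iterate against freely) and a
    sealed holdout (score rarely; retire it once its results leak into
    prompts or training data)."""
    rng = random.Random(seed)
    shuffled = cases[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * holdout_frac)
    return shuffled[cut:], shuffled[:cut]  # (dev_set, sealed_holdout)
```

The split exists because of the Red Queen dynamic: any eval the agent is optimized against stops measuring the gap, so the sealed set is the only number worth reporting, and even it has a shelf life.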