Schmidt gives three tests for whether you’re safely off the Yellow Brick Road. The tools-and-steps test: few steps with a forgiving outcome (a Drive search) is lab territory; dozens of steps that must clear partner review (a legal redline against three years of precedent) is not. The system test is the sharpest: are you a system the customer runs their work through, or a tool that sits on top of a system they already have? Its litmus question — “would the customer still need you if a lab shipped something that supposedly directly competes with you? If yes, you’re a system. If no, you’re a tool — even if your ACV is high.” High ACV signals a system but doesn’t guarantee one.
The P&L / hedge-fund test: the customer judges you on their own P&L (did the deal close, did the policy bind), not on SWE-Bench or MMLU — “winning on alpha measured in customer P&L, not in benchmark scores.” These operationalize The system of work is the moat, not the model — the model is fungible underneath and Sell the work, not the tool — model improvements compound for services, against software, and the system test is the practical inverse of Frontier companies absorb every useful agentic pattern into their products — if a lab shipping the feature would kill you, you were on the road.