AutoAgent found that when the meta-agent received only scores, without trajectories, its improvement rate dropped sharply. Understanding why something improved matters as much as knowing that it improved. Traces give the meta-agent interpretability over the task agent’s reasoning, and that interpretability is what makes targeted edits possible rather than blind grid search.
This reinforces “Decision traces are the missing data layer — a trillion-dollar gap” from a new angle: traces aren’t just valuable for humans auditing agent behavior; they’re essential for agents improving other agents. The same principle applies to “Observability is the missing discipline for agent systems — you can't improve what you can't measure”: telemetry that captures only outcomes (success/failure, latency, cost) misses the reasoning layer that enables systematic improvement. Combined with “Agents that store error patterns learn continuously without fine-tuning or retraining”, the pattern is clear: the full trajectory of an agent’s reasoning is the most valuable artifact for continuous improvement, whether the improver is human or machine.
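To make the contrast between outcome-only telemetry and trajectory capture concrete, here is a minimal sketch. Everything in it is hypothetical: the `Step` and `RunRecord` schemas, the field names, and the sample runs are illustrative assumptions, not AutoAgent's actual data model or API.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One step of a task agent's trajectory (hypothetical schema)."""
    thought: str                 # the agent's stated reasoning
    action: str                  # the tool call or action it took
    observation: str             # what came back
    error: Optional[str] = None  # error label if the step failed


@dataclass
class RunRecord:
    """Outcome fields plus the full trajectory (hypothetical schema)."""
    task_id: str
    success: bool
    latency_s: float
    cost_usd: float
    steps: List[Step] = field(default_factory=list)


def outcome_view(runs):
    """What outcome-only telemetry offers an improver: a success rate."""
    return {"success_rate": sum(r.success for r in runs) / len(runs)}


def trace_view(runs):
    """What traces add: which actions fail in failed runs, and how."""
    failures = Counter(
        (s.action, s.error)
        for r in runs if not r.success
        for s in r.steps if s.error is not None
    )
    return failures.most_common()


# Made-up sample data, for illustration only.
runs = [
    RunRecord("t1", True, 2.1, 0.03,
              [Step("look up the file", "read_file", "ok")]),
    RunRecord("t2", False, 5.4, 0.09, [
        Step("search first", "search", "3 hits"),
        Step("open the top hit", "read_file", "", error="path_not_found"),
    ]),
    RunRecord("t3", False, 4.8, 0.08, [
        Step("open config directly", "read_file", "", error="path_not_found"),
    ]),
]

print(outcome_view(runs))  # a single number, with no hint of what to change
print(trace_view(runs))    # [(('read_file', 'path_not_found'), 2)]
```

In this toy data the outcome view reports only that most runs failed, while the trace view points at one action and one error pattern, which is the kind of signal a targeted edit needs.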