Ramp’s self-maintaining system watches every signal — 1,000+ auto-generated monitors covering new code on every PR merge — but applies aggressive filtering before anything reaches a human. An agent triages each alert, assesses scope, and only notifies when the issue is real and impactful. Noise gets silently handled: the monitor is tuned or deleted, with state stored on the monitor itself (PR link appended) to prevent duplicate alerts.
The principle is that detection breadth and notification selectivity must scale independently. Observability is the missing discipline for agent systems — you can't improve what you can't measure establishes that you need telemetry everywhere — this insight adds the architectural constraint that telemetry must NOT flow directly to humans. An agent intermediary that scopes, judges impact, and filters noise is what makes exhaustive monitoring sustainable rather than overwhelming. This connects to Production agents route routine cases through decision trees, reserving humans for complexity — the same routing principle (handle routine cases automatically, reserve humans for complexity) applies to monitoring output, not just request handling. And A mediocre agent inside a strong harness outperforms a stronger agent inside a messy one is validated here: the monitoring harness (triage step, sandbox reproduction, state-on-monitor dedup) matters more than which model does the debugging.