Tags: AI Product · Building AI Agents · Architecture

Safety enforcement belongs in tool design, not system prompts

At scale, embedding safety constraints in the tool's API (e.g., blocking destructive operations by default) beats relying on behavioral compliance with system-prompt instructions.

@nicbstme — Lessons from Reverse Engineering Excel AI Agents

Reverse-engineering three production Excel AI agents revealed a critical architectural divergence in safety. Claude embeds overwrite protection at the API level ("the blocking is in the tool, the consent is in the prompt"), while Microsoft Copilot has no overwrite protection at all (it simply overwrites) and Shortcut AI relies on system-prompt instructions to prevent destructive operations. Behavioral compliance is inherently unreliable across millions of sessions; tool-enforced safety is deterministic.
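A minimal sketch of the "blocking is in the tool" pattern, assuming a hypothetical `write_cells` tool handler (all names and the consent flag are illustrative, not from any of the three products):

```python
# Hypothetical tool handler: safety lives in the tool's code path,
# consent is collected in the prompt layer and passed back explicitly.
def write_cells(sheet: dict, updates: dict, allow_overwrite: bool = False) -> dict:
    """Apply cell updates; refuse to clobber existing values by default."""
    conflicts = [addr for addr in updates if sheet.get(addr) not in (None, "")]
    if conflicts and not allow_overwrite:
        # Deterministic refusal: the model cannot talk its way past this.
        return {
            "status": "blocked",
            "conflicts": conflicts,
            "hint": "ask the user for consent, then retry with allow_overwrite=True",
        }
    sheet.update(updates)
    return {"status": "ok", "written": list(updates)}

sheet = {"A1": "Revenue", "A2": ""}
print(write_cells(sheet, {"A1": "Q3 Revenue"}))        # blocked: A1 is occupied
print(write_cells(sheet, {"A1": "Q3 Revenue"}, True))  # ok once consent is relayed
```

The point of the design is that the refusal is unconditional code, not an instruction the model is asked to follow; the system prompt's only job is to obtain user consent and set the flag.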

This is a concrete example of "Tools are a new kind of software": as contracts between deterministic systems and non-deterministic agents, tools must encode safety invariants that the non-deterministic model cannot violate. It also connects to "Production agents route routine cases through decision trees, reserving humans for complexity": safety decisions should be deterministic code, not LLM judgment calls. The same principle applies to the five universal agent design questions Bustamante identifies (safety, verification, visibility, capability, and memory): each has a tool-enforced and a behavioral option, and the tool-enforced version is the one that scales.