Tags: AI Product · Building AI Agents · Architecture

Safety enforcement belongs in tool design, not system prompts

At scale, embedding safety constraints in the tool's API (e.g., blocking destructive operations by default) beats relying on behavioral compliance with system-prompt instructions.

@nicbstme — Lessons from Reverse Engineering Excel AI Agents

Reverse-engineering three production Excel AI agents revealed a critical architectural divergence in safety. Claude embeds overwrite protection at the API level ("the blocking is in the tool, the consent is in the prompt"), while Microsoft Copilot has no overwrite protection at all (it simply overwrites) and Shortcut AI relies on system-prompt instructions to prevent destructive operations. Behavioral compliance is inherently unreliable across millions of sessions; tool-enforced safety is deterministic.
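A minimal sketch of the "blocking is in the tool" pattern, assuming a hypothetical `write_cells` tool handler (all names and the consent flag are illustrative, not from any of the three products):

```python
# Hypothetical tool handler: safety lives in the tool's code path,
# consent is collected in the prompt layer and passed back explicitly.
def write_cells(sheet: dict, updates: dict, allow_overwrite: bool = False) -> dict:
    """Apply cell updates; refuse to clobber existing values by default."""
    conflicts = [addr for addr in updates if sheet.get(addr) not in (None, "")]
    if conflicts and not allow_overwrite:
        # Deterministic refusal: the model cannot talk its way past this.
        return {
            "status": "blocked",
            "conflicts": conflicts,
            "hint": "ask the user for consent, then retry with allow_overwrite=True",
        }
    sheet.update(updates)
    return {"status": "ok", "written": list(updates)}

sheet = {"A1": "Revenue", "A2": ""}
print(write_cells(sheet, {"A1": "Q3 Revenue"}))        # blocked: A1 is occupied
print(write_cells(sheet, {"A1": "Q3 Revenue"}, True))  # ok once consent is relayed
```

The point of the design is that the refusal is unconditional code, not an instruction the model is asked to follow; the system prompt's only job is to obtain user consent and set the flag.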

This is a concrete example of "Tools are a new kind of software": as contracts between deterministic systems and non-deterministic agents, tools must encode safety invariants that the non-deterministic model cannot violate. It also connects to "Production agents route routine cases through decision trees, reserving humans for complexity": safety decisions should be deterministic code, not LLM judgment calls. The same principle applies to the five universal agent design questions Bustamante identifies (safety, verification, visibility, capability, and memory): each has a tool-enforced and a behavioral option, and the tool-enforced version is the one that scales.