Architecture

AI Product Building

55 insights in this topic

Architecture · Business Models

Context is the product, not the model

Anyone can call the API — differentiation comes from the data you access, skills you build, UX you design, and domain knowledge you encode

@nicbstme (Nicolas Bustamante) — Lessons from Building AI Agents for Financial Services
Architecture · Future of AI

Decision traces are the missing data layer — a trillion-dollar gap

Systems store what happened but not why; capturing the reasoning behind decisions creates searchable precedent and a new system of record

Jaya Gupta & Ashu Garg — Foundation Capital, Context Graphs
AI Agents · Architecture

A mediocre agent inside a strong harness outperforms a stronger agent inside a messy one

The surrounding machinery — metrics, rollback, scoping, observability — determines autonomous system performance more than model capability

Manthan Gupta (@manthanguptaa) — How Karpathy's Autoresearch Works And What You Can Learn From It
Architecture · Knowledge Systems

Files are the universal interface between humans and agents

Markdown and YAML files on disk beat databases because agents already know file operations and humans can inspect everything

@danshipper + @nicbstme — Agent-Native Architectures + Fintool
Architecture · AI Agents

The three-layer AI stack: Memory, Search, Reasoning

The emerging AI product architecture has three layers — Memory (who is this user), Search (find the right information), Reasoning (navigate complex information) — all running on PostgreSQL

Synthesis from Supermemory, QMD, and PageIndex architectures
AI Agents · Architecture

Agents that store error patterns learn continuously without fine-tuning or retraining

Dash's 'GPU-poor continuous learning' separates validated knowledge from error-driven learnings — five lines of code replace expensive retraining

@ashpreetbedi — Dash (OpenAI-inspired data agent)
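
A minimal sketch of the pattern, assuming a JSONL store (the file name, fields, and rendering here are invented, not Dash's actual implementation): failures become stored learnings that are prepended to future prompts, so behavior improves without any retraining.

```python
# Error-driven learning without fine-tuning: store what went wrong and
# how it was corrected, then replay those learnings as prompt context.
import json
from pathlib import Path

LEARNINGS = Path("learnings.jsonl")

def record_learning(task: str, error: str, correction: str) -> None:
    """Append one error-driven learning, kept separate from validated knowledge."""
    entry = {"task": task, "error": error, "correction": correction}
    with LEARNINGS.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def learnings_prompt(max_items: int = 20) -> str:
    """Render the most recent learnings as context for the next agent run."""
    if not LEARNINGS.exists():
        return ""
    items = [json.loads(line) for line in LEARNINGS.read_text().splitlines()]
    return "\n".join(
        f"- On '{i['task']}': avoid {i['error']}; instead {i['correction']}"
        for i in items[-max_items:]
    )
```

The key design choice is the split: validated knowledge stays curated, while learnings accumulate automatically and can be pruned or promoted later.
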
Architecture · Knowledge Systems · AI Agents

Structure plus reasoning beats flat similarity for complex domains

Across documents, code, and skills, the same pattern holds: structured knowledge navigated by reasoning outperforms flat indexes searched by similarity

Recurring pattern across PageIndex, Claude Code agentic search, and @arscontexta skill graphs
AI Agents · Architecture

In agent-native architecture, features are prompts — not code

The shift from coding specific functions to describing outcomes that agents achieve by composing atomic tools

@danshipper — Agent-Native Architectures (co-authored with Claude)
AI Agents · Architecture

Production agents route routine cases through decision trees, reserving humans for complexity

Handle exact matches and known patterns without AI; invoke the model for ambiguity, and route genuinely complex cases to human judgment

@vasuman — AI Agents 101
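
A toy version of that three-tier routing (the intent names, risk field, and handler functions are all hypothetical):

```python
# Tiered routing: deterministic rules first, the model only for
# ambiguity, humans reserved for genuinely complex cases.
KNOWN_PATTERNS = {"password_reset", "invoice_copy"}

def apply_playbook(t: dict) -> str:        # no AI involved
    return f"auto:{t['intent']}"

def call_llm(t: dict) -> str:              # model handles ambiguity
    return f"llm:{t['intent']}"

def escalate_to_human(t: dict) -> str:     # human judgment
    return f"human:{t['intent']}"

def route(ticket: dict) -> str:
    if ticket["intent"] in KNOWN_PATTERNS:   # tier 1: exact match
        return apply_playbook(ticket)
    if ticket.get("risk") == "low":          # tier 2: ambiguous, low stakes
        return call_llm(ticket)
    return escalate_to_human(ticket)         # tier 3: complex or high stakes
```

The ordering matters: the cheapest, most predictable handler gets first refusal, and the model is a fallback rather than the default path.
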
AI Agents · Architecture

Markdown skill files may replace expensive fine-tuning

A SKILL.md file that teaches an agent how to do something specific can match domain-specific fine-tuned models — at zero training cost

Nicolas Bustamante (@nicbstme) — Lessons from Building AI Agents for Financial Services
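
A hypothetical SKILL.md in that spirit (the skill name, frontmatter fields, and steps are invented for illustration, not taken from Fintool):

```markdown
---
name: credit-agreement-review
description: Use when the user asks to analyze covenants, pricing, or
  events of default in a credit agreement.
---

# Credit Agreement Review

1. Locate the definitions section first; defined terms control everything else.
2. Extract financial covenants into a table: metric, threshold, test date.
3. Quote any cross-default or MAC clauses verbatim, with page references.
```

The file encodes domain procedure as instructions the agent loads on demand, which is why it can stand in for a fine-tune at zero training cost.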
AI Agents · Architecture

Observability is the missing discipline for agent systems — you can't improve what you can't measure

Agent systems need telemetry (token usage, latency, error rates, cost per task) as a first-class engineering concern, not an afterthought bolted on after production failures

Geoff Huntley — Latent Patterns Principles (verification over testing)
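
One way to sketch that discipline, with assumed field names rather than any particular telemetry schema: every agent task emits a single structured record covering tokens, cost, latency, and errors.

```python
# First-class telemetry: one structured record per task, emitted even
# when the task fails, rather than ad-hoc log lines added after the fact.
import json
import time

def run_task(task_id: str, agent_fn):
    start = time.monotonic()
    record = {"task_id": task_id, "ok": True, "tokens": 0, "cost_usd": 0.0}
    try:
        # agent_fn fills in token and cost counts as it runs
        return agent_fn(record)
    except Exception as e:
        record["ok"] = False
        record["error"] = type(e).__name__
        raise
    finally:
        record["latency_s"] = round(time.monotonic() - start, 3)
        print(json.dumps(record))   # stand-in for shipping to a telemetry sink
```
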
AI Agents · Architecture

Verification is a Red Queen race — optimizing against a fixed eval contaminates it

Eval suites degrade the moment you use them to improve an agent — the agent adapts to the distribution, and the eval stops measuring what it was designed to measure

@natashamalpani (Natasha Malpani) — The Verification Economy: The Red Queen Problem (Part III)
Architecture · Coding Tools

Agentic search beats RAG for live codebases

Claude Code abandoned RAG and vector DB in favor of letting the agent grep/glob/read — reasoning about where to look outperforms pre-indexed similarity search for code

Boris Cherny (@bcherny, Claude Code team) — Twitter reply to @EthanLipnik
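
A stripped-down illustration of the grep/glob/read style (not Claude Code's actual tools): the agent narrows candidates by reasoning over filenames and matches instead of querying a pre-built vector index.

```python
# Agentic search primitives: glob to find candidate files, grep to
# filter by content, then read only the survivors into context.
from pathlib import Path

def glob(pattern: str) -> list[str]:
    """Find files matching a shell-style pattern, recursively."""
    return [str(p) for p in Path(".").rglob(pattern)]

def grep(needle: str, files: list[str]) -> list[str]:
    """Keep only files whose text contains the needle."""
    hits = []
    for f in files:
        try:
            text = Path(f).read_text(errors="ignore")
        except OSError:
            continue
        if needle in text:
            hits.append(f)
    return hits

# An agent chains these by reasoning: glob("*.py"), then
# grep("def route", ...), then read the few remaining files.
```

Nothing is indexed ahead of time, so the search is always current with the live codebase, which is the property RAG over a stale index loses.
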
Architecture

Similarity is not relevance — relevance requires reasoning

Vector search finds semantically similar content, but what users need is relevant content, and determining relevance requires LLM reasoning, not just pattern matching

PageIndex by VectifyAI — https://github.com/VectifyAI/PageIndex
Business Models · Architecture

Revealed preferences trump stated preferences — track what users do, not what they say

Users' actual behavior (what they click, skip, edit, redo) is the ground truth for product decisions; stated preferences in surveys and interviews systematically mislead

Nikunj Kothari — Revealed Preferences
Architecture · Coding Tools

Boring tech wins for AI-native startups — simpler stack means faster AI-assisted shipping

React + Node + TypeScript + Postgres + Redis scales to $1M ARR with 3 engineers; monorepo is a superpower for AI coding assistants

Kushal Byatnal — Extend ($1M+ ARR, 3 engineers)
AI Agents · Architecture

Traces, not scores, enable agent improvement — without trajectories, the improvement rate drops sharply

When AutoAgent's meta-agent received only pass/fail scores without reasoning traces, the improvement rate dropped significantly; understanding why matters as much as knowing that

@kevingu (Kevin Gu) — AutoAgent: First Open Source Library for Self-Optimizing Agents
Business Models · Architecture

The UI moat collapses — API quality becomes the purchasing criterion

When agents are the primary users of software, beautiful dashboards stop mattering and API design becomes the competitive surface

@chrysb (Chrys Bader) + @nicbstme (Nicolas Bustamante) — Apps Are Dead + Every SaaS Is Now an API
AI Agents · Architecture

Agent edits are automatic decision instrumentation — every human correction is a structured signal

When agents propose and humans edit, the delta between proposal and correction captures tacit judgment as first-class data without requiring manual logging

@JayaGup10 (Jaya Gupta) — The Trillion Dollar Loop B2B Never Had
AI Agents · Architecture

Auto-generated narrow monitors beat handwritten broad checks — a tight mesh over the exact shape of the code

1,000+ AI-generated monitors that each target specific code paths catch more bugs than 10 hand-written checks that cover general categories

@RampLabs — How We Made Ramp Sheets Self-Maintaining
Architecture · Knowledge Systems

Context inefficiency compounds three penalties: cost, latency, and quality degradation

Every wasted token in an LLM context window doesn't just cost money — it slows responses and degrades output quality, creating a triple tax on production agents

@nicbstme — The LLM Context Tax: Best Tips for Tax Avoidance
Architecture · AI Agents

Context layers must be living systems, not static artifacts

Unlike semantic layers that rot when maintainers leave, context layers need self-updating feedback loops where agent errors refine the context corpus

@jasonscui — Your Data Agents Need Context, a16z
Architecture

Hybrid search is the default, not the exception

Neither keyword nor semantic search alone is complete — combining BM25 and vector search with reranking is the baseline for production systems

QMD by Tobi Lütke, pg_textsearch by Timescale, TigerData BM25 article
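
Reciprocal rank fusion is one common way to merge the keyword and semantic rankings before reranking; a sketch (k=60 is the conventional constant):

```python
# Reciprocal rank fusion: each document scores 1/(k + rank + 1) in every
# ranking that contains it, and the fused order sums those scores.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["d3", "d1", "d2"]   # keyword (BM25) ranking
vec = ["d2", "d3", "d4"]    # semantic (vector) ranking
fused = rrf([bm25, vec])    # d3 ranks highly in both lists, so it wins
```

Fusion rewards documents both retrievers agree on without needing to normalize their incomparable raw scores, which is why it is a common default before a heavier reranker runs.
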
Knowledge Systems · Architecture · AI Agents

Intelligence location — code vs prompts — determines system fragility and flexibility

Critical architectural fork: prompt-driven systems (Pal's 400-line routing prompt) are flexible but break when models change; code-driven systems (our validate-graph.js) are rigid but reliable — best systems need both

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed)
Knowledge Systems · Architecture

Knowledge evolution is the biggest unsolved problem across all graph architectures

Almost nobody has solved how knowledge graphs grow without rotting — most are append-only, auto-decay is too aggressive, and even the best systems only add links without pruning, merging, or detecting contradictions

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed)
Knowledge Systems · Architecture

Knowledge systems need dual-layer storage — narrative depth and structured queries can't share a format

Every system beyond 'markdown files in a folder' discovers that narrative depth (rich prose, context, reasoning) and structured querying (filter, aggregate, cross-reference) need different storage layers with a routing mechanism between them

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed)
Architecture · Knowledge Systems

Metadata consumed by LLMs needs trigger specifications, not human summaries

When an LLM scans metadata to decide what to invoke, the description should specify when to activate — not summarize what the thing does — because LLMs are a fundamentally different consumer than humans

@trq212 (Thariq) — Lessons from Building Claude Code: How We Use Skills
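
An illustrative contrast, with invented descriptions (not the actual Claude Code skill metadata):

```yaml
# Weak: summarizes what the tool does, written for a human reader
description: "A PDF parsing utility with table extraction."

# Better: specifies when the model should invoke it
description: >
  Use when the user uploads a PDF and asks about its contents, or when
  a task requires extracting tables or text from a PDF file.
```

The first reads like documentation; the second is a trigger condition the scanning LLM can match against the current request.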
AI Agents · Architecture

AI is the computer — orchestration across 19 models is the product, not any single model

Perplexity launched a unified agent system orchestrating 19 backend models that delegate tasks, manage files, execute code, and browse the web. The differentiation isn't the models — it's the orchestration. 'The computer is the orchestration system.'

@AravSrinivas (Aravind Srinivas, Perplexity CEO) — AI Is the Computer
Architecture · Coding Tools

Prompt caching makes long context economically viable

Prefix-matching cache enables 80%+ cost reduction for multi-turn conversations, making rich context systems affordable at scale

Anthropic documentation — Prompt Caching
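
A back-of-envelope sketch of why prefix caching pays off; the price and discount figures here are illustrative assumptions, so check your provider's current rates:

```python
# Cached prefix tokens are billed at a deep discount, so a stable
# system prompt plus conversation history dominates multi-turn cost.
def turn_cost(prefix_tokens: int, new_tokens: int,
              price_in: float = 3.0, cache_discount: float = 0.1) -> float:
    """Dollar cost of one turn; price is $ per million input tokens."""
    cached = prefix_tokens * price_in * cache_discount / 1e6
    fresh = new_tokens * price_in / 1e6
    return cached + fresh

uncached = turn_cost(0, 50_500)       # whole 50.5k-token context re-billed
cached = turn_cost(50_000, 500)       # stable 50k prefix served from cache
```

With these numbers the cached turn costs roughly a tenth of the uncached one, consistent with the 80%+ reductions the card describes; the prerequisite is an append-only context so the prefix actually stays byte-stable across turns.
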
Architecture · Future of AI

Agents eat your system of record — the rigid app was the constraint, not the schema

When agents can clone your entire CRM in seconds and become the real interface, the SaaS product becomes a dumb write endpoint. Data moats evaporate because agents eliminate the rigid app that demanded rigid schemas.

@zain_hoda (Zain Hoda, Vanna AI) — The Agent Will Eat Your System of Record
Coding Tools · Architecture

Scaffolding is tech debt against the next model — the bitter lesson applied to product building

Code built to extend model capability 10-20% becomes worthless when the next model ships, making most product scaffolding an ephemeral trade-off rather than a lasting investment

Boris Cherny (@bcherny) — Inside Claude Code With Its Creator, Y Combinator Light Cone podcast
AI Agents · Architecture

Trust boundaries must be externalized — not held in engineers' heads

Where an agent's behavior is well-understood vs. unknown should be mapped, made auditable, and connected to deployment gates — not left as implicit tribal knowledge

@natashamalpani (Natasha Malpani) — The Verification Economy: The Red Queen Problem (Part III)
AI Agents · Architecture

WebMCP turns websites into agent-native interfaces

Chrome's MCP integration lets websites expose structured tools to agents instead of agents scraping and guessing at UI elements

Chrome for Developers — WebMCP announcement (https://developer.chrome.com/blog/web-mcp)
Architecture · Knowledge Systems

Context layers supersede semantic layers for agent autonomy

Traditional semantic layers handle metric definitions but agents need a superset: canonical entities, identity resolution, tribal knowledge instructions, and governance guidance

@jasonscui — Your Data Agents Need Context, a16z
AI Agents · Architecture

Data agent failures stem from missing business context, not SQL generation gaps

The industry initially blamed text-to-SQL capability for data agent failures, but the real blockers are undefined business definitions, ambiguous sources of truth, and missing tribal knowledge

@jasonscui — Your Data Agents Need Context, a16z
AI Agents · Architecture

Detect everything, notify selectively — the observability-to-notification ratio determines system trust

Watch every signal but ensure alerts reaching humans always mean something; teams ignore noisy monitors AND noisy agents equally fast

@RampLabs — How We Made Ramp Sheets Self-Maintaining
Knowledge Systems · Architecture

Embeddings measure similarity, not truth — vector databases have a temporal blind spot

Vector search can't resolve contradictions or understand time; 'I love my job' and 'I'm quitting' retrieve with equal confidence

Rohit (@rohit4verse) — How to Build Agents That Never Forget
Knowledge Systems · Architecture

Navigation beats search for knowledge retrieval — let each data source keep its native query interface

Vector similarity search flattens everything into one embedding space, losing native query affordances; better to let SQL be SQL, files be files, and build a routing layer that picks the right source per question type

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed)
Architecture · Future of AI

Permissioned inference is harder than permissioned retrieval — enterprise context graphs need reasoning-level access control

Controlling who sees data is solved; controlling whose history shapes reasoning for others is the unsolved trust layer enterprise context graphs require

@JayaGup10 (Jaya Gupta) — The Trillion Dollar Loop B2B Never Had
Architecture

PostgreSQL scales further than you think

OpenAI runs ChatGPT on one PostgreSQL primary plus ~50 read replicas handling millions of QPS — no sharding of PostgreSQL itself, just excellent operations

OpenAI — https://openai.com/index/scaling-postgresql/
Architecture

Response UX should match retrieval intelligence

If your system uses semantic search to find results, the display should reflect that intelligence — keyword highlighting on semantic results creates a confusing mismatch

@akshay_pachaar — 'Your RAG System Has a Hidden UX Problem' (Daily Dose of Data Science blog), referencing Zilliz semantic highlighting model
AI Agents · Architecture

Safety enforcement belongs in tool design, not system prompts

At scale, embedding safety constraints in the tool's API (blocking destructive operations by default) beats relying on behavioral compliance with system prompt instructions

@nicbstme — Lessons from Reverse Engineering Excel AI Agents
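
A minimal sketch of the idea with an invented SQL tool (the verb list and policy are assumptions): the tool's own API refuses destructive operations unless explicitly enabled, rather than trusting a system-prompt instruction.

```python
# Safety in the tool, not the prompt: destructive statements are
# rejected at the API boundary regardless of what the model was told.
DESTRUCTIVE = {"DROP", "DELETE", "TRUNCATE", "ALTER", "UPDATE"}

def execute(query: str) -> str:
    """Stand-in for the real database backend."""
    return f"ran: {query}"

def run_sql(query: str, allow_writes: bool = False) -> str:
    verb = query.strip().split()[0].upper()
    if verb in DESTRUCTIVE and not allow_writes:
        raise PermissionError(f"{verb} blocked by tool policy")
    return execute(query)
```

A prompt instruction can be ignored or jailbroken; a raised exception cannot, which is why enforcement migrates into tool design at scale.
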
AI Agents · Architecture

Shadow execution enables safe trace learning — replay write operations without touching production data

By replaying actions that would write to external apps in a shadow path, agents can learn from realistic end-to-end flows without impacting customer data

@tonygentilcore (Tony Gentilcore, Glean) — Trace Learning for Self-Improving Agents
Architecture · Knowledge Systems

Tiered retrieval prevents context overload — summaries first, details on demand

Reading category summaries first, then drilling to items, then raw resources only if needed keeps memory retrieval within token budgets

Rohit (@rohit4verse) — How to Build Agents That Never Forget
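
A toy illustration of the tiering (the memory layout and category names are assumptions, not the article's schema):

```python
# Tiered retrieval: cheap category summaries by default, item-level
# detail only when the query actually demands the drill-down.
MEMORY = {
    "work": {
        "summary": "User is migrating a CRM to Postgres.",
        "items": [
            "Chose pgvector over a separate vector DB.",
            "Deadline is end of Q3.",
        ],
    },
}

def retrieve(category: str, depth: int = 0) -> list[str]:
    node = MEMORY[category]
    if depth == 0:                              # tier 1: summary only
        return [node["summary"]]
    return [node["summary"], *node["items"]]    # tier 2: drill to items
```

A third tier would fetch the raw source documents behind each item, but only when the item text itself cannot answer the question, keeping the default path well inside the token budget.
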
AI Agents · Architecture

Time-bounded evaluation forces optimization for real-world usefulness instead of idealized performance

A fixed wall-clock budget per experiment makes results comparable, normalizes across hardware, and forces agents to optimize for improvement per unit time

Manthan Gupta (@manthanguptaa) — How Karpathy's Autoresearch Works And What You Can Learn From It
Architecture · AI Agents

Virtual filesystems replace sandboxes for agent navigation — intercept commands instead of provisioning infrastructure

Mintlify's ChromaFs intercepts Unix commands and translates them into database queries, cutting boot time from 46 seconds to 100ms and cost from $70k/year to near-zero

Mintlify — How We Built a Virtual Filesystem for Our Assistant
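
A toy version of the intercept idea (not Mintlify's actual ChromaFs): the agent's "shell" commands map onto an in-memory store, so there is no sandbox to provision or boot.

```python
# Virtual filesystem: intercept Unix-style commands and answer them
# from a data store instead of running them in real infrastructure.
DOCS = {
    "guides/intro.md": "# Intro\nWelcome.",
    "guides/auth.md": "# Auth\nUse API keys.",
}

def vfs(command: str) -> str:
    verb, *args = command.split()
    if verb == "ls":                    # directory listing -> key prefix scan
        prefix = args[0] if args else ""
        return "\n".join(k for k in DOCS if k.startswith(prefix))
    if verb == "cat":                   # file read -> dict lookup
        return DOCS[args[0]]
    raise ValueError(f"unsupported command: {verb}")
```

Because the agent already speaks `ls` and `cat`, no new tool interface has to be taught; in production the dict lookup would be a database query, which is where the boot-time and cost savings come from.
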
Architecture

Evaluations must augment trace data in place — divergent copies drift by design

The moment you export traces to a separate eval system, the copy diverges from where annotations run; evals, annotations, and traces should share a single source of truth

@aparnadhinak (Aparna Dhinakaran) — Data Architectures For Tracing Harnesses & Agents
Architecture · Coding Tools

Inference capability lowers input fidelity requirements — smart listeners make imprecise input work

When the consumer of input has strong inference ability, the quality bar for that input drops — voice works not because transcription improved, but because the listener got smarter

@mvanhorn (Matt Van Horn) — Every Claude Code Hack I Know (March 2026)
Architecture · AI Agents

KV cache hit rate is the most critical metric for production agents

Maintaining stable prompt prefixes and append-only context architecture maximizes cache reuse, dramatically reducing both cost and latency for agentic workflows

@nicbstme — The LLM Context Tax: Best Tips for Tax Avoidance
Architecture

Lakebases decouple compute from storage — databases become elastic infrastructure

Third-generation databases separate compute and storage entirely, putting data in open formats on cloud object stores; the database becomes a serverless layer that scales to zero

Databricks — What Is a Lakebase
Business Models · Architecture

Latent demand is the strongest product signal — make the thing people already do easier

People will only do things they already do; you can't get them to do a new thing, but you can make their existing behavior frictionless

Boris Cherny (@bcherny) — Inside Claude Code With Its Creator, Y Combinator Light Cone podcast
AI Agents · Architecture

Reasoning evaporation permanently destroys agent decision chains when the context window closes

An agent's multi-step reasoning exists only in the context window; when the session ends, the output survives but the decision chain — why each step was taken — is gone forever

@rohit4verse (Rohit) — The Missing Layer in Your Agentic Stack
Future of AI · Architecture

Stronger models expand the verification gap, not close it

More capable models increase the deployment surface and raise the stakes of failures, making verification infrastructure more valuable rather than less

@natashamalpani (Natasha Malpani) — The Verification Economy: The Red Queen Problem (Part III)
AI Agents · Architecture

Teacher-student trace distillation with consensus validation beats single-oracle learning

A single high-reasoning teacher trace isn't reliable enough for enterprise learning; comparing multiple student traces under production constraints with consensus validation produces trustworthy strategies

@tonygentilcore (Tony Gentilcore, Glean) — Trace Learning for Self-Improving Agents
Architecture · AI Agents

AI trace data has an indefinite useful lifespan — SaaS observability's 30-day retention model destroys institutional knowledge

Infrastructure metrics expire quickly but AI conversations and reasoning traces gain value over time; 30-day retention windows erase the very data that reveals failure patterns and training signals

@aparnadhinak (Aparna Dhinakaran) — Data Architectures For Tracing Harnesses & Agents