Knowledge Systems
AI Product Building: 29 insights in this topic
Compound engineering makes each unit of work improve all future work
The 80/20 ratio (80% plan+review, 20% work+compound) ensures learning compounds across iterations, not just code
Files are the universal interface between humans and agents
Markdown and YAML files on disk beat databases because agents already know file operations and humans can inspect everything
Persistent agent memory preserves institutional knowledge that walks out the door with employees
When agents maintain daily changelogs, decision logs, and work preferences, organizational knowledge survives personnel changes
Skill graphs enable progressive disclosure for complex domains
Single skill files hit a ceiling — complex domains need interconnected knowledge that agents navigate progressively from index to description to links to sections to full content
Structure plus reasoning beats flat similarity for complex domains
Across documents, code, and skills, the same pattern holds: structured knowledge navigated by reasoning outperforms flat indexes searched by similarity
Harness engineering — humans steer, agents execute, documentation is the system of record
OpenAI built a million-line production codebase with zero manually-written code in 5 months. The discipline shifted from writing code to designing the harness: architecture constraints, documentation, tooling, and feedback loops that make agents reliable at scale.
Spec files are external memory that survives context resets
A structured specs/ folder (design.md, implementation.md, decisions.md) bridges human intent and agent execution across sessions
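One plausible layout for such a folder (the per-file roles are an interpretation of the filenames above, not a prescribed standard):

```
specs/
├── design.md          # human intent: goals, constraints, non-goals
├── implementation.md  # the agent's current plan and progress
└── decisions.md       # why choices were made, so a fresh session doesn't relitigate them
```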
Evolving summaries beat append-only memory — rewrite profiles, don't accumulate facts
An evolve_summary() function that rewrites category profiles with new information handles contradictions naturally, unlike append-only logs
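A minimal sketch of that function. The prompt wording and the `llm` callable are assumptions; the key contract is that the profile is rewritten wholesale rather than appended to.

```python
def evolve_summary(llm, profile: str, new_facts: list[str]) -> str:
    """Rewrite a category profile in light of new information.

    Unlike an append-only log, the model is asked to reconcile
    contradictions: newer facts supersede stale ones.
    """
    prompt = (
        "Rewrite this profile so it stays internally consistent.\n"
        "Drop anything the new facts contradict; keep what still holds.\n\n"
        f"Current profile:\n{profile}\n\n"
        "New facts:\n" + "\n".join(f"- {f}" for f in new_facts)
    )
    return llm(prompt)

# Toy stand-in for a model call, just to show the contract:
fake_llm = lambda _prompt: "Prefers async reviews; now works from Berlin."
print(evolve_summary(fake_llm,
                     "Works from NYC; prefers async reviews.",
                     ["Relocated to Berlin in March."]))
```

Because the output replaces the profile, "Works from NYC" simply disappears instead of coexisting with its contradiction.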
CLAUDE.md should be a routing table, not a knowledge base
Treat CLAUDE.md as a minimal IF-ELSE directory pointing to context files — not a 26,000-line monolith that bloats every session
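A routing-table CLAUDE.md might look like this (the file paths are hypothetical; the shape is what matters):

```markdown
# CLAUDE.md (routing table, not knowledge base)

- Working on the API? Read `docs/api-conventions.md` first.
- Touching the database? Read `docs/schema-rules.md` first.
- Writing tests? Follow `docs/testing.md`.
- Anything else: ask before assuming; do not load every doc.
```

Each branch costs a few tokens per session; the referenced files are only pulled in when their condition fires.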
Session capture turns ephemeral AI conversations into a compounding knowledge base
shadcn's /done pattern — dumping key decisions, questions, and follow-ups to markdown after each Claude session — applies file-based memory architecture to development workflow
Context inefficiency compounds three penalties: cost, latency, and quality degradation
Every wasted token in an LLM context window doesn't just cost money — it slows responses and degrades output quality, creating a triple tax on production agents
Cross-user knowledge transfer works without fine-tuning — just a database and prompt engineering
When one person teaches an agent something, another person benefits automatically — no RLHF, no training infrastructure, just structured storage and retrieval
Intelligence location — code vs prompts — determines system fragility and flexibility
Critical architectural fork: prompt-driven systems (Pal's 400-line routing prompt) are flexible but break when models change; code-driven systems (our validate-graph.js) are rigid but reliable — the best systems need both
Knowledge evolution is the biggest unsolved problem across all graph architectures
Almost nobody has solved how knowledge graphs grow without rotting — most are append-only, auto-decay is too aggressive, and even the best systems only add links without pruning, merging, or detecting contradictions
Knowledge systems need dual-layer storage — narrative depth and structured queries can't share a format
Every system beyond 'markdown files in a folder' discovers that narrative depth (rich prose, context, reasoning) and structured querying (filter, aggregate, cross-reference) need different storage layers with a routing mechanism between them
Metadata consumed by LLMs needs trigger specifications, not human summaries
When an LLM scans metadata to decide what to invoke, the description should specify when to activate — not summarize what the thing does — because LLMs are a fundamentally different consumer than humans
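A before/after illustration, using skill-style YAML frontmatter (the field name and wording are illustrative, not any specific schema):

```yaml
# Weak: summarizes what the thing does (a human-oriented description)
description: "Utilities for working with invoices."
---
# Better: specifies when to activate (an LLM-oriented trigger spec)
description: >
  Use when the user mentions invoices, billing disputes, refunds,
  or asks to generate or reconcile a payment document.
```

The second form answers the only question the scanning model is actually asking: "should I invoke this now?"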
Treat an agent as an operating system, not a stateless function
Agents need RAM (conversation context), a hard drive (persistent memory), garbage collection (decay/pruning), and I/O management (tools) — the OS mental model unlocks architectural clarity
Tribal knowledge is the irreducible human input that enables agent automation
Automated context construction handles most of the corpus, but the most critical context is implicit, conditional, and historically contingent — only humans can provide it
Accumulated agent traces produce emergent world models — discovered, not designed
When agent decision trajectories accumulate over time, they form a context graph that reveals entities, relationships, and constraints nobody explicitly modeled
Building real projects teaches AI skills faster than following structured curricula
A non-technical user who built a production WhatsApp bot reached the 'Operator' level that a 30-day AI mastery roadmap targets — through building, not studying
Compilation scales but curation compounds — two camps for knowledge graph construction
LLM-compiled systems (Karpathy, Pal) grow fast by feeding raw content through model judgment; human-curated systems (our graph, brainctl) grow slowly but every node is validated — compilation scales linearly, curation compounds through connections
Context layers supersede semantic layers for agent autonomy
Traditional semantic layers handle metric definitions but agents need a superset: canonical entities, identity resolution, tribal knowledge instructions, and governance guidance
Don't be the discriminator — be the patron, not the judge
Taste (selecting from AI output) is the function that gets automated first; participating in creation through friction and will is what endures
Embeddings measure similarity, not truth — vector databases have a temporal blind spot
Vector search can't resolve contradictions or understand time; 'I love my job' and 'I'm quitting' retrieve with equal confidence
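The failure mode, and one common mitigation, can be shown with toy numbers. The similarity scores are pretend values standing in for cosine similarity against a query like "How does she feel about her job?"; the half-life decay is an assumption, not a universal fix.

```python
# Both memories are topically similar to the query, so pure similarity
# scores them nearly identically even though one supersedes the other.
memories = [
    {"text": "I love my job", "sim": 0.91, "age_days": 200},
    {"text": "I'm quitting",  "sim": 0.90, "age_days": 2},
]

def pure_similarity(m):
    return m["sim"]

def time_decayed(m, half_life_days=30):
    # Discount older memories so fresher facts win near-ties.
    return m["sim"] * 0.5 ** (m["age_days"] / half_life_days)

print(max(memories, key=pure_similarity)["text"])  # → I love my job
print(max(memories, key=time_decayed)["text"])     # → I'm quitting
```

Note that decay only papers over the blind spot: it privileges recency, but it still doesn't detect that the two statements contradict each other.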
Navigation beats search for knowledge retrieval — let each data source keep its native query interface
Vector similarity search flattens everything into one embedding space, losing native query affordances; better to let SQL be SQL, files be files, and build a routing layer that picks the right source per question type
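A routing layer in this spirit can be as simple as a classifier over question shapes. This keyword version is a deliberately naive sketch (real routers would use an LLM or a trained classifier); the source names are illustrative.

```python
# Hypothetical routing layer: classify the question, then dispatch to the
# source whose native query interface fits, instead of flattening everything
# into one embedding space.
def route(question: str) -> str:
    q = question.lower()
    if any(w in q for w in ("how many", "average", "total", "count")):
        return "sql"        # aggregation: let SQL be SQL
    if any(w in q for w in ("which file", "where is", "path")):
        return "files"      # navigation: let files be files
    return "vector"         # open-ended recall: fall back to similarity

print(route("How many orders shipped last week?"))   # → sql
print(route("Which file defines the retry policy?")) # → files
print(route("What did we decide about caching?"))    # → vector
```

The design choice is that each backend keeps its full affordances: the SQL branch can aggregate, the files branch can glob, and similarity search is reserved for questions neither handles well.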
A skill's folder structure is its context architecture — the file system is a form of context engineering
Skills are not just markdown files but folders where scripts, references, and assets enable progressive disclosure — the agent reads deeper files only when it reaches the relevant step
Tiered retrieval prevents context overload — summaries first, details on demand
Reading category summaries first, then drilling to items, then raw resources only if needed keeps memory retrieval within token budgets
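The three tiers map naturally onto a budget check at each level. A minimal sketch, with illustrative field names and token counts:

```python
# Tiered retrieval under a token budget: summaries always, items if they fit,
# raw resources only if the budget still allows.
def retrieve(store, topic, budget_tokens=500):
    used, context = 0, []
    # Tier 1: cheap category summaries, loaded unconditionally.
    for cat in store["summaries"]:
        context.append(cat["text"]); used += cat["tokens"]
    # Tier 2: drill into matching items while the budget holds.
    for item in store["items"]:
        if item["topic"] == topic and used + item["tokens"] <= budget_tokens:
            context.append(item["text"]); used += item["tokens"]
    # Tier 3: raw resources are the most expensive and the first to be cut.
    for raw in store["raw"]:
        if raw["topic"] == topic and used + raw["tokens"] <= budget_tokens:
            context.append(raw["text"]); used += raw["tokens"]
    return context

store = {
    "summaries": [{"text": "billing: invoices, refunds, disputes", "tokens": 10}],
    "items": [{"topic": "billing", "text": "Refund SOP v2", "tokens": 200}],
    "raw": [{"topic": "billing", "text": "(full refund transcript)", "tokens": 5000}],
}
print(retrieve(store, "billing"))  # the 5000-token transcript is excluded
```

The summary tier acts as a table of contents: it is always cheap enough to load, and it is what tells the agent whether drilling deeper is worth the remaining budget.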
Two-tier agent memory separates organizational workflow knowledge from individual user preferences
Deployment-level memory captures shared tool strategies and sequencing patterns; user-level memory captures personal templates and communication styles — skipping the user level at first caused a significant performance hit
Agents need workflow-level tool strategies, not individual tool instructions — the hard part is how tools combine
In enterprise environments, the challenge isn't finding the right tool but understanding how tools work together; intentionally narrow strategies that capture workflow patterns generalize better than broad abstractions