Knowledge Systems

Dan Shipper & Kieran Klaassen (Every) — Compound Engineering (33)

Compound engineering makes each unit of work improve all future work

The 80/20 ratio (80% plan+review, 20% work+compound) ensures that learning, not just code, compounds across iterations

@nicbstme (Nicolas Bustamante) + @rohit4verse (Rohit) — agent memory patterns (18)

Persistent agent memory preserves institutional knowledge that would otherwise walk out the door with employees

When agents maintain daily changelogs, decision logs, and work preferences, organizational knowledge survives personnel changes

@danshipper + @nicbstme — Agent-Native Architectures + Fintool (17)

Files are the universal interface between humans and agents

Markdown and YAML files on disk beat databases because agents already know file operations and humans can inspect everything

@arscontexta (Heinrich) — Twitter thread on skill graphs (13)

Skill graphs enable progressive disclosure for complex domains

Single skill files hit a ceiling — complex domains need interconnected knowledge that agents navigate progressively from index to description to links to sections to full content

Rohit (@rohit4verse) — How to Build Agents That Never Forget (12)

Evolving summaries beat append-only memory — rewrite profiles, don't accumulate facts

An evolve_summary() function that rewrites category profiles with new information handles contradictions naturally, unlike append-only logs
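A minimal sketch of the idea, assuming a profile can be modeled as a dictionary of attributes per category. The CategoryProfile shape and the dictionary merge are illustrative assumptions, not the author's implementation; in practice the rewrite step would likely be an LLM call that produces a new profile from the old one plus the new information.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryProfile:
    """Current best-known state for one memory category (hypothetical shape)."""
    category: str
    facts: dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        # Deterministic one-line summary, sorted by attribute name.
        parts = [f"{key}: {value}" for key, value in sorted(self.facts.items())]
        return f"[{self.category}] " + "; ".join(parts)


def evolve_summary(profile: CategoryProfile, new_facts: dict[str, str]) -> CategoryProfile:
    """Return a rewritten profile in which newer facts replace older ones.

    The dict merge is a stand-in (assumption) for a model-driven rewrite.
    """
    merged = {**profile.facts, **new_facts}
    return CategoryProfile(profile.category, merged)


# Append-only memory keeps both statements and leaves the conflict unresolved.
append_only_log = ["sentiment: loves the job", "sentiment: planning to quit"]

# The evolving profile holds a single current value per attribute.
profile = CategoryProfile("work", {"employer": "Acme", "sentiment": "loves the job"})
profile = evolve_summary(profile, {"sentiment": "planning to quit"})
print(profile.render())
```

The append-only log ends up with two conflicting entries, while the rewritten profile holds only the latest sentiment alongside the unchanged employer.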

Architecture, Knowledge Systems, AI Agents

Structure plus reasoning beats flat similarity for complex domains

Across documents, code, and skills, the same pattern holds: structured knowledge navigated by reasoning outperforms flat indexes searched by similarity

Recurring pattern across PageIndex, Claude Code agentic search, and @arscontexta skill graphs (12)

Knowledge Systems, Coding Tools

Spec files are external memory that survives context resets

A structured specs/ folder (design.md, implementation.md, decisions.md) bridges human intent and agent execution across sessions

Community pattern — spec-first development (implementations by AWS Kiro, GitHub spec-kit, and multiple Claude Code workflows) (11)

OpenAI Codex Team — Harness Engineering: Leveraging Codex in an Agent-First World (10)

Harness engineering — humans steer, agents execute, documentation is the system of record

OpenAI built a million-line production codebase with zero manually-written code in 5 months. The discipline shifted from writing code to designing the harness: architecture constraints, documentation, tooling, and feedback loops that make agents reliable at scale.

@hwchase17 (Harrison Chase) — Continual Learning for AI Agents (9)

Agents learn at three distinct layers — model weights, harness code, and context configuration

Most people jump to model fine-tuning when discussing agent learning, but learning also happens at the harness layer (code, tools, instructions baked into all instances) and the context layer (per-user or per-tenant configuration like CLAUDE.md and skills)

Analysis of Machina (@EXM7777) — 30-Day AI Mastery Roadmap (9)

Building real projects teaches AI skills faster than following structured curricula

A non-technical user who built a production WhatsApp bot reached the 'Operator' level that a 30-day AI mastery roadmap targets, and got there by building rather than studying

shadcn (via X/Twitter) — /done skill pattern (9)

Session capture turns ephemeral AI conversations into a compounding knowledge base

shadcn's /done pattern — dumping key decisions, questions, and follow-ups to markdown after each Claude session — applies file-based memory architecture to development workflow

@tonygentilcore (Tony Gentilcore, Glean) — Trace Learning for Self-Improving Agents (9)

Two-tier agent memory separates organizational workflow knowledge from individual user preferences

Deployment-level memory captures shared tool strategies and sequencing patterns; user-level memory captures personal templates and communication styles — initially skipping user-level had a significant performance impact
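One way to picture the two tiers, as a sketch under stated assumptions: the TwoTierMemory class, its method names, and the sample entries are hypothetical and are not taken from Glean's system. Shared strategies live in one list, personal preferences are keyed by user, and the context for a request is the shared tier followed by that user's tier.

```python
from collections import defaultdict


class TwoTierMemory:
    """Hypothetical store with a shared tier and a per-user tier."""

    def __init__(self) -> None:
        # Deployment level: tool strategies and sequencing shared by everyone.
        self.deployment: list[str] = []
        # User level: personal templates and communication styles.
        self.per_user: dict[str, list[str]] = defaultdict(list)

    def learn_deployment(self, strategy: str) -> None:
        self.deployment.append(strategy)

    def learn_user(self, user_id: str, preference: str) -> None:
        self.per_user[user_id].append(preference)

    def context_for(self, user_id: str) -> list[str]:
        # Shared knowledge first, then whatever is known about this user.
        return self.deployment + self.per_user.get(user_id, [])


memory = TwoTierMemory()
memory.learn_deployment("search the ticket tracker before the wiki")
memory.learn_user("ana", "reply in short bullet points")

print(memory.context_for("ana"))
print(memory.context_for("ben"))
```

A user with no stored preferences still receives the deployment-level strategies, so the shared tier is never lost when the user tier is empty.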

@systematicls — How To Be A World-Class Agentic Engineer (8)

CLAUDE.md should be a routing table, not a knowledge base

Treat CLAUDE.md as a minimal IF-ELSE directory pointing to context files — not a 26,000-line monolith that bloats every session

Rohit (@rohit4verse) — How to Build Agents That Never Forget (7)

Treat an agent as an operating system, not a stateless function

Agents need RAM (conversation context), a hard drive (persistent memory), garbage collection (decay/pruning), and I/O management (tools) — the OS mental model unlocks architectural clarity

@rohit4verse (Rohit) — The Missing Layer in Your Agentic Stack (6)

Accumulated agent traces produce emergent world models — discovered, not designed

When agent decision trajectories accumulate over time, they form a context graph that reveals entities, relationships, and constraints nobody explicitly modeled

@nicbstme — The LLM Context Tax: Best Tips for Tax Avoidance (6)

Context inefficiency compounds three penalties: cost, latency, and quality degradation

Every wasted token in an LLM context window doesn't just cost money — it slows responses and degrades output quality, creating a triple tax on production agents

Ashpreet Bedi — Memory: How Agents Learn (Agno Framework) (6)

Cross-user knowledge transfer works without fine-tuning — just a database and prompt engineering

When one person teaches an agent something, another person benefits automatically — no RLHF, no training infrastructure, just structured storage and retrieval

Future of AI, Knowledge Systems

Don't be the discriminator — be the patron, not the judge

Taste (selecting from AI output) is the function that gets automated first; participating in creation through friction and will is what endures

@WillManidis (Will Manidis) — Against Taste (6)

Knowledge Systems, Architecture, AI Agents

Intelligence location — code vs prompts — determines system fragility and flexibility

Critical architectural fork: prompt-driven systems (Pal's 400-line routing prompt) are flexible but break when models change; code-driven systems (our validate-graph.js) are rigid but reliable — best systems need both

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed) (6)

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed) (6)

Knowledge evolution is the biggest unsolved problem across all graph architectures

Almost nobody has solved how knowledge graphs grow without rotting — most are append-only, auto-decay is too aggressive, and even the best systems only add links without pruning, merging, or detecting contradictions

@itsreallyvivek (vivek) — how to be good at research (6)

A loss curve is reassurance, not analysis — pull a hundred failures and read every one

Experiments throw off far more information than you consume — transcripts, failure cases, the strange tail — and most of it dies unread. Most ML bugs live in the data and fail silently; Ng's move is to pull 100 failures, sort them into piles, and attack the biggest pile

@jasonscui — Your Data Agents Need Context, a16z (6)

Tribal knowledge is the irreducible human input that enables agent automation

Automated context construction handles most of the corpus, but the most critical context is implicit, conditional, and historically contingent — only humans can provide it

@businessbarista (Alex Lieberman) quoting @da_fant (David Fant) (5)

Context centralization is why coding AI works — git is a solved context repository, knowledge work has no equivalent

Engineering AI leads because git centralizes all context in one versioned repository; knowledge work fails on three axes: distributed, unstructured, unverifiable

Decision Making, Knowledge Systems

Shared inputs produce shared conclusions worth nothing — old and cross-disciplinary material is criminally underpriced

If your information diet is trending arxiv plus the group chat, you reach the same conclusions as everyone else at the same time, which makes them worthless. Old material (MoE 1991, LSTMs 1997, the bitter lesson) and cross-disciplinary range are underpriced sources of differentiated ideas

@itsreallyvivek (vivek) — how to be good at research (5)

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed) (5)

Knowledge systems need dual-layer storage — narrative depth and structured queries can't share a format

Every system beyond 'markdown files in a folder' discovers that narrative depth (rich prose, context, reasoning) and structured querying (filter, aggregate, cross-reference) need different storage layers with a routing mechanism between them

@hwchase17 (Harrison Chase) — Your harness, your memory (citing Sarah Wooders, Letta) (5)

Memory is a harness responsibility, not a pluggable component

Managing context — what enters, what survives compaction, what's queryable — is a core capability of the harness itself, not an add-on service

@trq212 (Thariq) — Lessons from Building Claude Code: How We Use Skills (5)

Metadata consumed by LLMs needs trigger specifications, not human summaries

When an LLM scans metadata to decide what to invoke, the description should specify when to activate — not summarize what the thing does — because LLMs are a fundamentally different consumer than humans

Future of AI, Knowledge Systems

You can offload a task, or even a job, but you can never offload your learning

The real opportunity isn't picking the best model — it's building a learning loop on top of models where the firm's accumulated learning, the one thing it can't outsource, compounds across people and AI

@satyanadella (Satya Nadella) — A frontier without an ecosystem is not stable (5)

Decision Making, Knowledge Systems

Taste is a muscle, not a gift — train it by forecasting every result before you see it

Predict the outcome of every experiment before running it, guess a paper's numbers from the method alone, call which releases will matter in two years and check your hit rate; a forecast plus a correction, repeated a few hundred times, trains the model in your head the way it trains any other model

@itsreallyvivek (vivek) — how to be good at research (5)

Knowledge Systems, Future of AI

A clear public explanation is a genuine contribution and an unfakeable credential

Fields choke on undigested ideas, so distilling something hard into a clear explanation is real work, not a service job — and a body of public writing doubles as the strongest credential you can hold, because it's an unfakeable sample of how you think

@itsreallyvivek (vivek) — how to be good at research (4)

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed) (4)

Compilation scales but curation compounds — two camps for knowledge graph construction

LLM-compiled systems (Karpathy, Pal) grow fast by feeding raw content through model judgment; human-curated systems (our graph, brainctl) grow slowly but every node is validated — compilation scales linearly, curation compounds through connections

@jasonscui — Your Data Agents Need Context, a16z (4)

Context layers supersede semantic layers for agent autonomy

Traditional semantic layers handle metric definitions but agents need a superset: canonical entities, identity resolution, tribal knowledge instructions, and governance guidance

Rohit (@rohit4verse) — How to Build Agents That Never Forget (4)

Embeddings measure similarity, not truth — vector databases have a temporal blind spot

Vector search can't resolve contradictions or understand time; 'I love my job' and 'I'm quitting' retrieve with equal confidence

@hwchase17 (Harrison Chase) — Everything Gets Rebuilt: Agents, Harnesses, and the New Compute Layer (4)

Memory defines the agent — a zip of markdown files IS the agent, and portable memory between harnesses is the frontier

An agent IS its memory — a zip of markdown (system prompt + skills + tools) defines its identity; making that portable between harnesses is the current frontier

Ayush Jhunjhunwala — KG Architecture Comparative Research (10+ systems analyzed) (4)

Navigation beats search for knowledge retrieval — let each data source keep its native query interface

Vector similarity search flattens everything into one embedding space, losing native query affordances; better to let SQL be SQL, files be files, and build a routing layer that picks the right source per question type
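The routing idea can be sketched as follows, assuming two sources: an in-memory SQLite table queried with plain SQL and a dictionary standing in for files on disk. The source names, the question types, and the sample data are all hypothetical. In a real system a model or classifier would likely choose the question type; here the caller passes it in.

```python
import sqlite3
from typing import Any, Callable


def make_sql_source() -> Callable[[str], Any]:
    """SQL stays SQL: the source accepts a native SQL query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("ada", 40.0), ("ada", 60.0), ("bob", 15.0)],
    )
    return lambda query: conn.execute(query).fetchall()


def make_file_source(files: dict[str, str]) -> Callable[[str], Any]:
    """Files stay files: the source accepts a path and returns the text."""
    return lambda path: files[path]


SOURCES: dict[str, Callable[[str], Any]] = {
    "sql": make_sql_source(),
    "files": make_file_source({"decisions.md": "Chose SQLite for local storage."}),
}

# Hypothetical mapping from question type to the source best suited to it.
ROUTES = {"aggregate": "sql", "narrative": "files"}


def route(question_type: str, native_query: str) -> Any:
    """Pick a source by question type and run the query in its own language."""
    return SOURCES[ROUTES[question_type]](native_query)


totals = route(
    "aggregate",
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer",
)
note = route("narrative", "decisions.md")
print(totals)
print(note)
```

Neither source is flattened into embeddings: the aggregate question keeps GROUP BY, and the narrative question returns the prose file unchanged.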

@trq212 (Thariq) — Lessons from Building Claude Code: How We Use Skills (4)

A skill's folder structure is its context architecture — the file system is a form of context engineering

Skills are not just markdown files but folders where scripts, references, and assets enable progressive disclosure — the agent reads deeper files only when it reaches the relevant step

Rohit (@rohit4verse) — How to Build Agents That Never Forget (4)

Tiered retrieval prevents context overload — summaries first, details on demand

Reading category summaries first, then drilling to items, then raw resources only if needed keeps memory retrieval within token budgets
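A small sketch of that retrieval order, under several assumptions: tiers are plain lists of strings ordered from summaries to raw resources, tokens are approximated by whitespace-separated words, and a caller-supplied is_enough check decides whether deeper tiers are needed. None of these names come from the source.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    # Crude whitespace count standing in for a real tokenizer (assumption).
    return len(text.split())


def tiered_retrieve(
    tiers: list[list[str]],
    budget: int,
    is_enough: Callable[[list[str]], bool],
) -> list[str]:
    """Read tiers in order, stopping when the budget is hit or the need is met."""
    context: list[str] = []
    used = 0
    for tier in tiers:
        for text in tier:
            cost = count_tokens(text)
            if used + cost > budget:
                return context  # the next piece would exceed the token budget
            context.append(text)
            used += cost
        if is_enough(context):
            break  # no reason to drill into deeper tiers
    return context


tiers = [
    ["Billing: customer on annual plan, one open dispute."],                # summaries
    ["Dispute opened over a duplicate charge.", "Plan renewed in January."],  # items
    ["(full email thread would go here)"],                                  # raw resources
]


def mentions_cause(context: list[str]) -> bool:
    # Hypothetical sufficiency check: stop once the dispute's cause is known.
    return any("duplicate" in text for text in context)


result = tiered_retrieve(tiers, budget=50, is_enough=mentions_cause)
tight = tiered_retrieve(tiers, budget=8, is_enough=mentions_cause)
print(result)
print(tight)
```

With a generous budget the call stops after the item tier because the question is already answered; with a budget of eight tokens only the summary fits, so nothing deeper is read.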

Knowledge Systems, Decision Making

Writing is the cheapest defense against fooling yourself — the page finds the gaps your head papers over

An idea feels fully formed until you try to word it; writing exposes the untested assumption, the step that doesn't follow, the two claims that contradict. Darwin made it procedural — log disconfirming evidence on the spot, because memory deletes inconvenient results faster than convenient ones

@itsreallyvivek (vivek) — how to be good at research (4)

@hwchase17 (Harrison Chase) — Continual Learning for AI Agents (3)

Context learning spans agent, tenant, and org levels — and you can mix all three

Agent-level context updates the agent's own configuration over time; tenant-level (user/org/team) gives each tenant their own evolving context; production systems mix multiple levels simultaneously

@hwchase17 (Harrison Chase) — Continual Learning for AI Agents (3)

Hot-path and offline learning are two temporal modes for agent context updates — each with different tradeoffs

Agents can update their context in the hot path (during task execution, like saving to memory while working) or offline (batch processing recent traces after the fact, like OpenClaw's 'dreaming'), with an additional dimension of explicit vs implicit memory updates
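The two modes can be contrasted in a toy sketch. The AgentContext class, its method names, and the repeat threshold are hypothetical; a real offline pass would likely summarize traces with a model rather than count repeats.

```python
from collections import Counter
from typing import Optional


class AgentContext:
    """Toy context store with a hot-path write and an offline batch pass."""

    def __init__(self) -> None:
        self.memory: list[str] = []  # context the agent carries forward
        self.traces: list[str] = []  # raw record of finished tasks

    def run_task(self, task: str, remember: Optional[str] = None) -> None:
        self.traces.append(task)
        if remember is not None:
            # Hot path, explicit: the note is saved while the task is running.
            self.memory.append(remember)

    def consolidate_offline(self, min_repeats: int = 2) -> None:
        # Offline, implicit: scan recent traces after the fact and keep only
        # patterns that recur, then discard the processed traces.
        for task, count in Counter(self.traces).items():
            note = f"recurring task: {task}"
            if count >= min_repeats and note not in self.memory:
                self.memory.append(note)
        self.traces.clear()


agent = AgentContext()
agent.run_task("export weekly report", remember="user wants CSV, not PDF")
agent.run_task("export weekly report")
agent.run_task("rename branch")
agent.consolidate_offline()
print(agent.memory)
```

The hot-path note is available immediately but costs attention during the task, while the offline pass adds nothing until it runs and can look across several traces at once.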

@hwchase17 (Harrison Chase) — Everything Gets Rebuilt: Agents, Harnesses, and the New Compute Layer (3)

Procedural memory is the highest-impact type of agent memory — it determines what the agent actually does

Of three memory types (semantic/episodic/procedural), procedural — instructions, skills, and tools — has the highest impact because it changes what the agent actually does

@tonygentilcore (Tony Gentilcore, Glean) — Trace Learning for Self-Improving Agents (3)

Agents need workflow-level tool strategies, not individual tool instructions — the hard part is how tools combine

In enterprise environments, the challenge isn't finding the right tool but understanding how tools work together; intentionally narrow strategies that capture workflow patterns generalize better than broad abstractions