Andrej Karpathy quietly published an 800-word gist proposing a structured layer of LLM-written markdown pages that compound, cross-link, and lint themselves. Here is what the pattern looks like, why it beats raw RAG, and how Agent Vault already runs about 75% of it.

What Karpathy Actually Proposed

Andrej Karpathy published the pattern on April 4, 2026 as a public GitHub gist (the LLM Wiki gist). It's framed not as a finished product but as an "idea file" — a sketch of how an LLM can incrementally compile curated source material into a persistent, interlinked Markdown wiki.

The architecture has three layers. Raw sources are immutable. The wiki sits on top -- a directory of markdown entity pages linked with [[wiki-link]] syntax. The schema is a config doc (Karpathy uses the CLAUDE.md convention) defining how the wiki behaves. Karpathy's own wiki reportedly reached ~100 articles and 400,000 words and was still faster and more accurate than his RAG pipeline.
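To make the [[wiki-link]] convention concrete, here is a minimal sketch of pulling link targets out of an entity page. The function name and the optional `[[Target|label]]` alias form are assumptions borrowed from common wiki tools; the gist itself does not prescribe a parser.

```python
import re

# Matches [[Target]] and [[Target|label]]. The alias form is an assumption,
# not something the gist specifies.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_links(page_text: str) -> list[str]:
    """Return link targets in first-seen order, without duplicates."""
    seen: dict[str, None] = {}
    for match in WIKI_LINK.finditer(page_text):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


page = (
    "RAG retrieves fragments; see [[Ingest-Loop]] and "
    "[[RAG-vs-LLM-Wiki|the comparison]]. Also [[Ingest-Loop]]."
)
print(extract_links(page))
```

Everything else in the pattern (following links, spotting orphans, checking for broken references) can be built on a helper of roughly this shape.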

The LLM Wiki

Karpathy's April 2026 gist

A structured synthesis layer that compounds, ages well, and beats raw RAG on the trade-offs that actually matter.

Three-Layer Architecture

Schema: CLAUDE.md (the operating contract)
Wiki: entity pages + index (the maintained knowledge)
Sources: PDFs, articles, transcripts (the immutable truth)

Three layers, one knowledge base. Sources stay immutable. Wiki is the maintained synthesis. Schema is the operating contract.

How a Source Becomes a Wiki

Raw sources (immutable): articles, PDFs, transcripts
Wiki entity pages: [[Concept A]], [[Concept B]], [[Concept C]]
Synthesised answer: on a query, the LLM reads 3-4 entity pages, follows [[wiki-link]] references, and returns a cross-referenced answer.

Sources stay immutable. The wiki pages are the synthesis layer. Queries read 3-4 pages, not 100 fragments.

The Pattern in Numbers (Karpathy + Agent Vault)

3 architecture layers: sources / wiki / schema
3 operations: ingest / query / lint
10-15 pages per ingest, touched by one source
~200 entity pages live in Agent Vault, across 12 projects

The Three Operations

Karpathy describes exactly three operations against the wiki, and they are remarkably symmetrical. There is no chunking strategy, no embedding model selection, no hybrid retrieval, no re-ranker. The "retrieval" is the LLM reading index.md and following links.
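As a rough illustration of "reading index.md and following links", the sketch below walks outward from an index page and collects a handful of entity pages. In the real pattern the LLM decides which links are worth following; the breadth-first walk, the in-memory `pages` dict, and the `gather_context` name are all simplifying assumptions made here.

```python
import re
from collections import deque

WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def gather_context(pages: dict[str, str], start: str = "index", max_pages: int = 4) -> list[str]:
    """Breadth-first walk from `start`, returning up to `max_pages` entity page names."""
    queue = deque([start])
    visited = {start}
    picked: list[str] = []
    while queue and len(picked) < max_pages:
        name = queue.popleft()
        if name != start:  # the index itself is the map, not part of the answer context
            picked.append(name)
        for target in WIKI_LINK.findall(pages.get(name, "")):
            if target in pages and target not in visited:
                visited.add(target)
                queue.append(target)
    return picked


# Hypothetical four-page wiki held in memory for the demo.
pages = {
    "index": "- [[Concept A]]\n- [[Concept B]]",
    "Concept A": "Relates to [[Concept C]].",
    "Concept B": "Standalone.",
    "Concept C": "See [[Concept A]].",
}
print(gather_context(pages, max_pages=3))
```

The point of the sketch is the absence of machinery: there is no embedding step and no ranking model, only text and links.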

One source touches 10-15 existing wiki pages on ingest, not one isolated chunk in a vector store.

Ingest pulls in. Query reads out. Lint feeds back. A self-maintaining loop.

The Three Operations Loop

Every interaction with the wiki is one of three operations. The arrows show how they feed each other.

Ingest

Read a new source, update 10-15 existing pages, log it.

Query

Search the wiki, synthesise with citations, promote good answers.

Lint

Flag contradictions, stale claims, orphan pages, missing links.

The dotted return arrow is lint feeding back into ingest -- the maintenance loop that prevents wiki rot.

Lint is the maintenance loop

Vector stores have no equivalent to lint. They cannot flag contradictions, surface orphan fragments, or notice that two embeddings should obviously be cross-referenced. The wiki can, because the LLM is the index.
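The structural half of lint is simple enough to sketch without an LLM. The example below flags orphan pages (nothing links to them) and broken links (a target that does not exist). Contradictions and stale claims need a model to read the text, so they are out of scope here. The `lint_links` name and the in-memory `pages` dict are illustrative assumptions.

```python
import re

WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def lint_links(pages: dict[str, str], index_name: str = "index") -> dict[str, list]:
    """Report orphan pages and broken [[wiki-link]] targets."""
    inbound = {name: 0 for name in pages}
    broken: list[tuple[str, str]] = []
    for name, text in pages.items():
        for target in WIKI_LINK.findall(text):
            if target not in pages:
                broken.append((name, target))
            elif target != name:  # a self-link does not rescue a page from orphan status
                inbound[target] += 1
    orphans = sorted(n for n, count in inbound.items() if count == 0 and n != index_name)
    return {"orphans": orphans, "broken": sorted(broken)}


# Hypothetical wiki: "Concept C" is never linked, and "Missing" does not exist.
pages = {
    "index": "- [[Concept A]]",
    "Concept A": "See [[Concept B]] and [[Missing]].",
    "Concept B": "Back to [[Concept A]].",
    "Concept C": "Nobody links here.",
}
report = lint_links(pages)
print(report)
```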

Why This Beats RAG

Vector-store RAG retrieves raw fragments per query -- the fragments are dumb, do not know about each other, and cannot flag contradictions. The wiki inverts the cost model: knowledge is compiled once on ingest, then read cheaply forever. When the LLM answers a question it is reading three or four pre-synthesised entity pages, not a hundred raw fragments.

RAG vs LLM Wiki -- the cost model inversion

Vector-store RAG

Retrieves raw fragments per query
Embedding-store round trip on every question
Fragments do not know about each other
Contradictions are invisible until you read them all
Fresh ingest does not retroactively help old queries

LLM Wiki

Reads pre-synthesised entity pages
Compile-once, read-cheaply-forever cost model
Cross-references baked in at ingest time
Contradictions flagged inline next to the claim
Every ingest improves the whole knowledge base

RAG vs LLM Wiki

Before: re-retrieves raw fragments per query
After: reads pre-synthesised, cross-linked pages

Running in production

Want this running on your knowledge base?

We run LLM-wiki style knowledge systems across our client portfolio — plumbing, automotive inspection, pest control, breeders. From zero to a compounding wiki in about two weeks.

How Agent Vault Implements About 75% of This

Agent Vault is the production system we run for Jordan James Media's client portfolio. About 200 entity pages across 12 projects, mirrored to a Supabase table, with an ingest-on-correction hook that auto-promotes user pushback into a new entity page. The pattern maps almost perfectly to Karpathy's three layers -- memory files are entity pages, MEMORY.md is the index, and CLAUDE.md is the schema.

Karpathy says → what we built

wiki/*.md (Markdown entity pages, one concept per file) → ~/.claude/projects/<project>/memory/*.md (200+ files across 12 client projects)
index.md (catalog the LLM reads first) → MEMORY.md (content-organised, one-line entries with links)
CLAUDE.md (schema: how the wiki behaves) → CLAUDE.md (same file name, same convention)
log.md (append-only change log) → agents/*/sessions/**.md (per-session logs: more useful for us, but no global grep view yet)

Four pieces, four counterparts, though the log is only a partial match. We did not set out to copy Karpathy; we built this independently for a client portfolio and discovered the convergence later.

Each ingest and each good answer can strengthen the corpus rather than vanish into chat history.

— Paraphrased from Karpathy's LLM Wiki gist

What We Do Not Have Yet

Three pieces still missing: formal [[wiki-link]] syntax across all files (we use raw paths and grep), a multi-page ingest workflow that updates 10-15 pages per source (our ingest is mostly single-page right now), and one consolidated log.md (each session writes its own log, which is more useful for our case but loses the global grep view).

What we did just ship after writing the first draft of this post: the memory-lint skill, in three flavors. Structural lint (regex-driven) runs at every session start and on demand — it caught four real drift bugs in our 442-memory corpus on its first run, all fixed the same day. Semantic lint (Claude Haiku 4.5) runs weekly with a --since 7d filter so the cost stays around ten cents a week — it reads each memory plus the files it cites in backticks, flags specific claims that look stale or contradicted. A write-time hook (PostToolUse) lints every memory edit in real time, so new bad [[wiki-link]] references can't silently enter the corpus.
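The cost control in the weekly semantic pass comes down to only looking at recently edited files. Here is a minimal sketch of a `--since 7d` style filter based on file modification time. It is not the memory-lint implementation: the `parse_since` and `modified_since` names, the supported units, and the use of mtime are all assumptions made for illustration.

```python
import os
import re
import tempfile
import time
from datetime import timedelta
from pathlib import Path
from typing import Optional

# Supported suffixes are an assumption: hours, days, weeks.
UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_since(spec: str) -> timedelta:
    """Turn a string such as '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", spec.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {spec!r}")
    return timedelta(**{UNITS[match.group(2)]: int(match.group(1))})


def modified_since(root: Path, spec: str, now: Optional[float] = None) -> list[Path]:
    """Return markdown files under `root` modified within the given window."""
    now = time.time() if now is None else now
    cutoff = now - parse_since(spec).total_seconds()
    return sorted(p for p in root.rglob("*.md") if p.stat().st_mtime >= cutoff)


# Demo in a throwaway directory: one fresh file, one backdated by 30 days.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fresh, stale = root / "fresh.md", root / "stale.md"
    fresh.write_text("edited today\n", encoding="utf-8")
    stale.write_text("edited last month\n", encoding="utf-8")
    month_ago = time.time() - 30 * 24 * 3600
    os.utime(stale, (month_ago, month_ago))
    recent_names = [p.name for p in modified_since(root, "7d")]
    print(recent_names)
```

Only the files that survive the filter would be handed to the model, which is what keeps a weekly pass cheap as the corpus grows.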

Today (about 75%)

200+ entity pages live in memory/
MEMORY.md is the catalog
CLAUDE.md is the schema
Ingest-on-correction hook
Mirror to Supabase agent_memories table
Memory-lint: structural at SessionStart, semantic weekly, write-time on every edit
Orphan-page detector (in the structural lint)

The missing 25%

Formal [[wiki-link]] syntax across files
Multi-page ingest (10-15 pages per source)
Consolidated log.md with grep-friendly prefixes

The Comparison Mental Model

Here is the cleanest way to frame the difference. RAG is a search engine for your documents -- it answers "what did source X say about Y." The LLM Wiki is an encyclopedia your AI writes for itself -- it answers "what do I currently believe about Y, given everything I have read." Those are different questions, and the second one is almost always what you actually want.

What This Means For Your Stack

Start small. One markdown directory. One CLAUDE.md. One index.md. One log.md. Let your LLM read it, write to it, and lint it. After six months you will have something a new agent can pick up in five minutes -- which is the actual test of whether your knowledge base works.

The Minimum Viable Wiki

A handful of files in one directory: one schema doc, one catalog, one log, and a few entity pages. Everything else is the LLM doing the work.

~/my-wiki (tree)

my-wiki/
├── CLAUDE.md    # Schema layer: how the wiki behaves
├── index.md     # Catalog of every entity page
├── log.md       # Append-only change log
└── wiki/        # Entity pages live here
    ├── [[Compounding-Knowledge]].md
    ├── [[RAG-vs-LLM-Wiki]].md
    └── [[Ingest-Loop]].md

That's it. No vector store. No embedding model. No re-ranker. The LLM reads index.md and follows links.
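If you would rather script the skeleton than create it by hand, a sketch like the following would do it. The file names follow the tree above; the seed contents and the `scaffold_wiki` name are placeholders chosen for this example.

```python
import tempfile
from pathlib import Path

# File names follow the tree above; the seed contents are placeholders.
SEED_FILES = {
    "CLAUDE.md": "# Schema\n\nConventions for this wiki go here.\n",
    "index.md": "# Index\n",
    "log.md": "",
}


def scaffold_wiki(root: Path) -> list[str]:
    """Create the skeleton and return the top-level entries that now exist."""
    (root / "wiki").mkdir(parents=True, exist_ok=True)
    for name, content in SEED_FILES.items():
        path = root / name
        if not path.exists():  # never overwrite a file that is already there
            path.write_text(content, encoding="utf-8")
    return sorted(p.relative_to(root).as_posix() for p in root.iterdir())


# Demo in a throwaway directory so nothing is written to your home folder.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_wiki(Path(tmp) / "my-wiki")
    print(created)
```

Running it twice is safe: existing files are left alone, so the script cannot clobber a schema doc you have already edited.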

Start Your Own Wiki -- five steps for the weekend

Create one markdown directory and a CLAUDE.md schema doc describing your conventions
Seed index.md with categories and an empty log.md with a date-prefix format
Write your first 5-10 entity pages by hand to teach the LLM the shape
Wire an ingest hook: when a source arrives, the LLM updates affected pages and appends one log line
Schedule a weekly lint pass that flags contradictions, stale claims, and orphan pages
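The log half of steps two and four can be sketched in a few lines. The `YYYY-MM-DD [operation] summary` format below is one possible date-prefix convention, not one taken from the gist, and the function names and sample entries are hypothetical.

```python
import tempfile
from datetime import date
from pathlib import Path


def format_log_line(day: date, operation: str, summary: str) -> str:
    """Build one grep-friendly line: ISO date first, then the operation."""
    return f"{day.isoformat()} [{operation}] {summary}"


def append_log(log_path: Path, line: str) -> None:
    """Append-only write: existing entries are never touched."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")


def entries_for(log_path: Path, prefix: str) -> list[str]:
    """Rough equivalent of `grep '^prefix' log.md`."""
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return [line for line in lines if line.startswith(prefix)]


# Demo with made-up entries in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "log.md"
    append_log(log, format_log_line(date(2026, 4, 4), "ingest", "example-source.pdf -> 3 pages updated"))
    append_log(log, format_log_line(date(2026, 4, 11), "lint", "1 orphan page flagged"))
    april_4 = entries_for(log, "2026-04-04")
    print(april_4)
```

Putting the date first means a plain prefix search answers "what changed on this day" or, with a shorter prefix such as `2026-04`, "what changed this month".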

Apply This Pattern to Your AI Tooling

We have been deploying LLM-wiki style knowledge systems for clients across plumbing, automotive inspection, and legal services. If you want help mapping your existing knowledge into this pattern, the entry point is our content marketing service.

Start Your Own Wiki

One directory, one schema doc, one index file, one log. Let your LLM read it, write it, and lint it.

Frequently Asked Questions

Is the LLM Wiki pattern a replacement for RAG?

Not strictly. The wiki sits on top of -- or in place of -- vector-store retrieval, depending on the use case. For knowledge that needs to compound and reconcile over time, the wiki wins. For one-shot search across a static document set, RAG can still be the right tool. Many teams will end up running both.

How big can the wiki get before it stops working?

Karpathy does not give a hard number, but the practical ceiling is whatever fits in the LLM's context window when reading index.md plus a small handful of entity pages. With current frontier models that is comfortably tens of thousands of pages.
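A back-of-envelope calculation shows how that ceiling moves with the context window. Every number below is an assumption chosen for illustration (tokens per index entry, tokens per entity page, space reserved for the prompt and the answer); none of them come from the gist, so plug in your own measurements.

```python
def max_index_entries(
    context_tokens: int,
    tokens_per_entry: int,
    pages_read: int,
    tokens_per_page: int,
    reserve_tokens: int = 0,
) -> int:
    """Index entries that fit once the entity pages and a reserve are accounted for."""
    available = context_tokens - reserve_tokens - pages_read * tokens_per_page
    return max(available // tokens_per_entry, 0)


# Hypothetical figures: one-line index entries of ~20 tokens, four entity pages
# of ~6,000 tokens each, and 8,000 tokens held back for the prompt and answer.
ASSUMED = dict(tokens_per_entry=20, pages_read=4, tokens_per_page=6_000, reserve_tokens=8_000)

for window in (200_000, 1_000_000):
    print(window, max_index_entries(window, **ASSUMED))
# 200000 8400
# 1000000 48400
```

Under these assumptions the index is what consumes the window, so shorter index entries or a two-level index would raise the ceiling more than a marginally larger model would.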

What is the minimum viable version of this pattern?

One folder. One CLAUDE.md schema doc explaining your conventions. One index.md listing every entity page. One log.md with grep-friendly date prefixes. The LLM does the rest.

Karpathy's LLM Wiki: The Knowledge Base That Beats RAG
