Karpathy's LLM Wiki: The Knowledge Base That Beats RAG

Andrej Karpathy's LLM Wiki proposes a structured synthesis layer that beats raw RAG. Here's the pattern, why it works, and how to run it yourself.

May 11, 2026
7 min

Andrej Karpathy quietly published an 800-word gist proposing a structured layer of LLM-written markdown pages that compound, cross-link, and lint themselves. Here is what the pattern looks like, why it beats raw RAG, and how Agent Vault already runs about 75% of it.

What Karpathy Actually Proposed

Andrej Karpathy published the pattern on April 4, 2026 as a public GitHub gist (the LLM Wiki gist). It's framed not as a finished product but as an "idea file" — a sketch of how an LLM can incrementally compile curated source material into a persistent, interlinked Markdown wiki.

The architecture has three layers. Raw sources are immutable. The wiki sits on top -- a directory of markdown entity pages linked with [[wiki-link]] syntax. The schema is a config doc (Karpathy uses the CLAUDE.md convention) defining how the wiki behaves. Karpathy's own wiki reportedly reached ~100 articles and 400,000 words and was still faster and more accurate than his RAG pipeline.
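To make the wiki layer concrete, here is a minimal Python sketch of how [[wiki-link]] references could be pulled out of entity pages and turned into a link graph. The function names, the one-file-per-page layout, and the [[Target|alias]] form are assumptions for illustration; the gist does not prescribe a parser.

```python
import re
from pathlib import Path

# Matches [[Target]] and [[Target|display text]]. The alias form is an
# assumption borrowed from common wiki tools, not something the gist specifies.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_links(markdown: str) -> list[str]:
    """Return link targets in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in WIKI_LINK.finditer(markdown):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


def build_link_graph(wiki_dir: Path) -> dict[str, list[str]]:
    """Map each page name (file stem) to the pages it links to."""
    return {
        page.stem: extract_links(page.read_text(encoding="utf-8"))
        for page in sorted(wiki_dir.glob("*.md"))
    }
```

A graph like this is only bookkeeping. In the pattern as described, the LLM still does the reading and writing; the graph just makes the cross-links inspectable.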

The LLM Wiki

Karpathy's April 2026 gist

A structured synthesis layer that compounds, ages well, and beats raw RAG on the trade-offs that actually matter.

Three-Layer Architecture

  • Schema: CLAUDE.md (operating contract)
  • Wiki: entity pages + index (maintained knowledge)
  • Sources: PDFs, articles, transcripts (immutable truth)

Three layers, one knowledge base

Sources stay immutable. Wiki is the maintained synthesis. Schema is the operating contract.

How a Source Becomes a Wiki

  • Raw sources (immutable): articles, PDFs, transcripts
  • Wiki entity pages: [[Concept A]], [[Concept B]], [[Concept C]]
  • Query → synthesised answer: the LLM reads 3-4 entity pages, follows [[wiki-link]] references, and returns a cross-referenced answer.

Sources stay immutable. The wiki pages are the synthesis layer. Queries read 3-4 pages, not 100 fragments.

The Pattern in Numbers
  • 3 architecture layers: sources / wiki / schema
  • 3 operations: ingest / query / lint
  • 10-15 pages per ingest: touched by one source
  • ~200 entity pages live: Agent Vault, 12 projects

The Three Operations

Karpathy describes exactly three operations against the wiki, and they are remarkably symmetrical. There is no chunking strategy, no embedding model selection, no hybrid retrieval, no re-ranker. The "retrieval" is the LLM reading index.md and following links.
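As a rough illustration of "retrieval by following links", the sketch below walks [[wiki-link]] references breadth-first from a few starting pages and stops at a small page budget. It assumes the starting pages have already been chosen (in the pattern, by the LLM reading index.md); `pages_for_query`, the `budget` parameter, and the one-file-per-page layout are hypothetical names and choices, not part of the gist.

```python
import re
from collections import deque
from pathlib import Path

WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def pages_for_query(wiki_dir: Path, start_pages: list[str], budget: int = 4) -> list[str]:
    """Breadth-first walk over [[wiki-links]], capped at `budget` pages."""
    queue = deque(start_pages)
    visited: list[str] = []
    while queue and len(visited) < budget:
        name = queue.popleft()
        path = wiki_dir / f"{name}.md"
        # Skip pages already read and links that point at missing files.
        if name in visited or not path.exists():
            continue
        visited.append(name)
        queue.extend(WIKI_LINK.findall(path.read_text(encoding="utf-8")))
    return visited
```

The budget is what keeps a query at three or four pages: the walk ends once enough context has been gathered, however large the wiki is.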

One source touches 10-15 existing wiki pages on ingest, not one isolated chunk in a vector store.

  • Ingest: read + update 10-15 pages
  • Query: synthesise + cite
  • Lint: flag drift + orphans

Ingest pulls in. Query reads out. Lint feeds back. A self-maintaining loop.

The Three Operations Loop

Every interaction with the wiki is one of three operations. The arrows show how they feed each other.

Ingest

Read a new source, update 10-15 existing pages, log it.

Query

Search the wiki, synthesise with citations, promote good answers.

Lint

Flag contradictions, stale claims, orphan pages, missing links.

The dotted return arrow is lint feeding back into ingest -- the maintenance loop that prevents wiki rot.

Lint is the maintenance loop

Vector stores have no equivalent to lint. They cannot flag contradictions, surface orphan fragments, or notice that two embeddings should obviously be cross-referenced. The wiki can, because the LLM is the index.
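The structural half of lint (orphan pages and links that point nowhere) does not need a model at all. Here is a minimal sketch, assuming entity pages live in one directory, index.md sits outside it, and links use plain [[Page-Name]] syntax. `lint_links` is a hypothetical helper; contradictions and stale claims would still need an LLM pass, which is not shown.

```python
import re
from pathlib import Path

WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def lint_links(wiki_dir: Path, index_file: Path) -> dict[str, list[str]]:
    """Report pages nothing links to, and links whose target page is missing."""
    pages = {p.stem: p.read_text(encoding="utf-8") for p in wiki_dir.glob("*.md")}
    sources = [("index", index_file.read_text(encoding="utf-8")), *pages.items()]

    linked: set[str] = set()
    broken: set[str] = set()
    for source, text in sources:
        for target in WIKI_LINK.findall(text):
            if target not in pages:
                broken.add(f"{source} -> {target}")
            elif target != source:  # a self-link does not rescue an orphan
                linked.add(target)

    return {"orphans": sorted(set(pages) - linked), "broken": sorted(broken)}
```

Running this after every ingest keeps the cheap checks cheap, so the model-backed pass can concentrate on claims rather than plumbing.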

Why This Beats RAG

Vector-store RAG retrieves raw fragments per query -- the fragments are dumb, do not know about each other, and cannot flag contradictions. The wiki inverts the cost model: knowledge is compiled once on ingest, then read cheaply forever. When the LLM answers a question it is reading three or four pre-synthesised entity pages, not a hundred raw fragments.

RAG vs LLM Wiki -- the cost model inversion

Vector-store RAG

  • Retrieves raw fragments per query
  • Embedding-store round trip on every question
  • Fragments do not know about each other
  • Contradictions are invisible until you read them all
  • Fresh ingest does not retroactively help old queries

LLM Wiki

  • Reads pre-synthesised entity pages
  • Compile-once, read-cheaply-forever cost model
  • Cross-references baked in at ingest time
  • Contradictions flagged inline next to the claim
  • Every ingest improves the whole knowledge base

RAG vs LLM Wiki

  • Before: re-retrieves raw fragments per query
  • After: reads pre-synthesised, cross-linked pages
How Agent Vault Implements About 75% of This

Agent Vault is the production system we run for Jordan James Media's client portfolio. About 200 entity pages across 12 projects, mirrored to a Supabase table, with an ingest-on-correction hook that auto-promotes user pushback into a new entity page. The pattern maps almost perfectly to Karpathy's three layers -- memory files are entity pages, MEMORY.md is the index, and CLAUDE.md is the schema.

Karpathy says → what we built

  • Karpathy: wiki/*.md. Markdown entity pages, one concept per file.
    Agent Vault: ~/.claude/projects/<project>/memory/*.md. 200+ files across 12 client projects.

  • Karpathy: index.md. Catalog the LLM reads first.
    Agent Vault: MEMORY.md. Content-organised, one-line entries with links.

  • Karpathy: CLAUDE.md. Schema: how the wiki behaves.
    Agent Vault: CLAUDE.md. Same file name. Same convention.

  • Karpathy: log.md. Append-only change log.
    Agent Vault: agents/*/sessions/**.md. Per-session logs: more useful for us, but no global grep view yet.

Four pieces, four matches. We did not set out to copy Karpathy — we built this independently for a client portfolio and discovered the convergence later.

"Each ingest and each good answer can strengthen the corpus rather than vanish into chat history."

(Paraphrased from Karpathy's LLM Wiki gist)

What We Do Not Have Yet

Three pieces still missing: formal [[wiki-link]] syntax across all files (we use raw paths and grep), a multi-page ingest workflow that updates 10-15 pages per source (our ingest is mostly single-page right now), and one consolidated log.md (each session writes its own log, which is more useful for our case but loses the global grep view).

What we did just ship after writing the first draft of this post: the memory-lint skill, in three flavors. Structural lint (regex-driven) runs at every session start and on demand. It caught four real drift bugs in our 442-memory corpus on its first run, all fixed the same day. Semantic lint (Claude Haiku 4.5) runs weekly with a --since 7d filter so the cost stays around ten cents a week. It reads each memory plus the files it cites in backticks, then flags specific claims that look stale or contradicted. A write-time hook (PostToolUse) lints every memory edit in real time, so new bad [[wiki-link]] references can't silently enter the corpus.
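For readers who want to try the weekly semantic pass, the part that keeps it cheap is candidate selection: only recently edited memories, plus the files they cite, get sent to the model. The sketch below shows one way that selection could work. It is not the Agent Vault skill itself; `parse_since`, `recent_memories`, `cited_paths`, and the "looks like a path" heuristic are all assumptions, and the model call is left out.

```python
import re
import time
from pathlib import Path
from typing import Optional

INLINE_CODE = re.compile(r"`([^`\n]+)`")
UNIT_SECONDS = {"h": 3600, "d": 86400, "w": 604800}


def parse_since(value: str) -> int:
    """Convert a window such as '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([hdw])", value)
    if match is None:
        raise ValueError(f"unsupported --since value: {value!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def recent_memories(memory_dir: Path, since: str, now: Optional[float] = None) -> list[Path]:
    """Memory files modified inside the window, judged by file mtime."""
    cutoff = (time.time() if now is None else now) - parse_since(since)
    return sorted(p for p in memory_dir.glob("*.md") if p.stat().st_mtime >= cutoff)


def cited_paths(markdown: str) -> list[str]:
    """Backtick spans that look like file paths (heuristic: a slash or a file extension)."""
    spans = INLINE_CODE.findall(markdown)
    return [s for s in spans if "/" in s or re.search(r"\.\w+$", s)]
```

Filtering by modification time means an untouched memory is never re-checked, which is the trade-off that keeps the weekly bill small. A periodic full pass would be needed to catch memories that went stale because the code they describe changed.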

Today (about 75%)

  • 200+ entity pages live in memory/
  • MEMORY.md is the catalog
  • CLAUDE.md is the schema
  • Ingest-on-correction hook
  • Mirror to Supabase agent_memories table
  • Memory-lint: structural at SessionStart, semantic weekly, write-time on every edit
  • Orphan-page detector (in the structural lint)

The missing 25%

  • Formal [[wiki-link]] syntax across files
  • Multi-page ingest (10-15 pages per source)
  • Consolidated log.md with grep-friendly prefixes

The Comparison Mental Model

Here is the cleanest way to frame the difference. RAG is a search engine for your documents -- it answers "what did source X say about Y." The LLM Wiki is an encyclopedia your AI writes for itself -- it answers "what do I currently believe about Y, given everything I have read." Those are different questions, and the second one is almost always what you actually want.

What This Means For Your Stack

Start small. One markdown directory. One CLAUDE.md. One index.md. One log.md. Let your LLM read it, write to it, and lint it. After six months you will have something a new agent can pick up in five minutes -- which is the actual test of whether your knowledge base works.

The Minimum Viable Wiki

Six files in one directory tree. One schema doc, one catalog, one log, and a handful of entity pages. Everything else is the LLM doing the work.

~/my-wiki (tree)
my-wiki/
├── CLAUDE.md
├── index.md
├── log.md
└── wiki/
    ├── [[Compounding-Knowledge]].md
    ├── [[RAG-vs-LLM-Wiki]].md
    └── [[Ingest-Loop]].md

That's it. No vector store. No embedding model. No re-ranker. The LLM reads index.md and follows links.
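If you would rather script the starting layout than create it by hand, a small scaffold like the one below is enough. It is a sketch under assumptions: `scaffold_wiki` and the seed text are made up for illustration, and page files are named with plain stems (Compounding-Knowledge.md) while the [[...]] brackets appear only inside index.md.

```python
from pathlib import Path

# Placeholder seed content; replace with your own conventions.
SEED_FILES = {
    "CLAUDE.md": "# Schema\n\nConventions the LLM follows when editing this wiki.\n",
    "index.md": "# Index\n\n",
    "log.md": "",
}


def scaffold_wiki(root: Path, entity_pages: list[str]) -> list[Path]:
    """Create the layout, never overwriting files that already exist."""
    wiki_dir = root / "wiki"
    wiki_dir.mkdir(parents=True, exist_ok=True)
    created: list[Path] = []

    for name, body in SEED_FILES.items():
        path = root / name
        if not path.exists():
            path.write_text(body, encoding="utf-8")
            created.append(path)

    index = root / "index.md"
    for page in entity_pages:
        path = wiki_dir / f"{page}.md"
        if not path.exists():
            path.write_text(f"# {page}\n", encoding="utf-8")
            created.append(path)
            # Register the new page in the catalog the LLM reads first.
            with index.open("a", encoding="utf-8") as fh:
                fh.write(f"- [[{page}]]\n")
    return created
```

Calling `scaffold_wiki(Path("my-wiki"), ["Compounding-Knowledge", "RAG-vs-LLM-Wiki", "Ingest-Loop"])` produces the six files in the tree above, and running it a second time creates nothing.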

Start Your Own Wiki -- five steps for the weekend

  • Create one markdown directory and a CLAUDE.md schema doc describing your conventions
  • Seed index.md with categories and an empty log.md with a date-prefix format
  • Write your first 5-10 entity pages by hand to teach the LLM the shape
  • Wire an ingest hook: when a source arrives, the LLM updates affected pages and appends one log line
  • Schedule a weekly lint pass that flags contradictions, stale claims, and orphan pages
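The "appends one log line" step in the list above can be as small as the sketch below. The line format (ISO date, upper-case operation, one-line summary) is an assumption chosen so that a prefix search works like grep; `append_log` and `entries_for` are hypothetical helpers, not part of the gist.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_log(log_file: Path, operation: str, summary: str,
               today: Optional[date] = None) -> str:
    """Append one date-prefixed line to the log and return it."""
    stamp = (today or date.today()).isoformat()
    # Collapse whitespace so each entry stays on a single greppable line.
    line = f"{stamp} {operation.upper()} {' '.join(summary.split())}"
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line


def entries_for(log_file: Path, prefix: str) -> list[str]:
    """Return log lines starting with `prefix`, e.g. '2026-05' for one month."""
    lines = log_file.read_text(encoding="utf-8").splitlines()
    return [line for line in lines if line.startswith(prefix)]
```

Because the file is only ever appended to, the log doubles as a cheap audit trail: one line per ingest, query promotion, or lint fix.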

Start Your Own Wiki

One directory, one schema doc, one index file, one log. Let your LLM read it, write it, and lint it.

Frequently Asked Questions

Is the LLM Wiki pattern a replacement for RAG?

Not strictly. The wiki sits on top of -- or in place of -- vector-store retrieval, depending on the use case. For knowledge that needs to compound and reconcile over time, the wiki wins. For one-shot search across a static document set, RAG can still be the right tool. Many teams will end up running both.

How big can the wiki get before it stops working?

Karpathy does not give a hard number, but the practical ceiling is whatever fits in the LLM's context window when reading index.md plus a small handful of entity pages. With current frontier models that is comfortably hundreds of pages.

What is the minimum viable version of this pattern?

One folder. One CLAUDE.md schema doc explaining your conventions. One index.md listing every entity page. One log.md with grep-friendly date prefixes. The LLM does the rest.
