Karpathy's LLM Wiki and What It Means for Meeting Knowledge
Andrej Karpathy showed that LLMs can compile raw data into self-maintaining knowledge bases. For meetings, this is exactly what the next generation of tools should look like.
In April 2026, Andrej Karpathy published a post on X describing something he called an "LLM Knowledge Base" — a workflow where large language models compile raw information into a structured, self-maintaining markdown wiki. It was not a product announcement. It was not a framework launch. It was a former Tesla AI director and founding member of OpenAI sharing his personal workflow for managing knowledge, and it sent a quiet shockwave through the AI community.
The reason it mattered was not the workflow itself, which is elegant but straightforward. It mattered because of what it demonstrated about where the real value of LLMs lives — and where most of the industry is looking in the wrong direction.
What Karpathy Built
The workflow is deceptively simple. Raw data — articles, research, notes, whatever — gets fed into an LLM that "compiles" it into linked markdown files. These are not summaries in the traditional sense. They are structured, cross-referenced knowledge articles with consistent formatting, explicit links between concepts, and maintained indexes. Obsidian serves as the IDE — a place to browse, query, and extend the wiki.
From there, the system layers on additional capabilities: question-and-answer against the full knowledge base, automated "linting" that checks for contradictions and gaps, and output generation that produces new documents from the accumulated knowledge.
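To make the "linting" idea concrete, here is a minimal sketch of one such check: scanning a folder of markdown wiki pages for cross-references that point at pages that do not exist. The directory layout, the `[[Page Name]]` link syntax, and the function name are assumptions for illustration, not Karpathy's actual tooling.

```python
from pathlib import Path
import re

def lint_wiki(wiki_dir: str) -> list[str]:
    """Report broken [[cross-references]] in a folder of markdown wiki pages.

    A toy version of the 'linting' pass described above: real linting
    would also check for contradictions and stale content, which needs
    an LLM rather than a regex.
    """
    problems = []
    pages = {p.stem for p in Path(wiki_dir).glob("*.md")}
    for page in Path(wiki_dir).glob("*.md"):
        # Wiki-style links are assumed to look like [[Target Page]]
        for target in re.findall(r"\[\[([^\]]+)\]\]", page.read_text()):
            if target not in pages:
                problems.append(f"{page.name}: broken link to [[{target}]]")
    return problems
```

The structural checks (broken links, missing index entries) are cheap and deterministic; the semantic checks (contradictions, staleness) are where the LLM earns its keep.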
The key finding was striking: at practical scale — roughly 100 articles and 400,000 words — this structured markdown approach outperformed most retrieval-augmented generation (RAG) pipelines. Not in theory. In actual daily use, for a person who understands both approaches deeply.
In Karpathy's words, "a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge." This is a significant statement from someone who has spent the last decade at the frontier of AI engineering. The most valuable application of LLMs, in his experience, is not writing code. It is structuring knowledge.
The Anti-RAG Insight
This finding deserves careful attention because it cuts against the grain of enormous industry investment.
The dominant approach to AI-powered knowledge management in 2025 and 2026 has been RAG: take your documents, chunk them, embed them in a vector database, and retrieve relevant chunks at query time. Billions of dollars have gone into vector databases, embedding models, chunking strategies, and retrieval optimization. Every enterprise AI startup has a RAG pipeline at its core.
Karpathy's finding is not that RAG is useless. At massive scale — millions of documents, enterprise-wide repositories — retrieval is necessary because no context window can hold everything. But at the scale where most individuals and teams actually operate, a different approach works better: have the LLM read everything, structure it, maintain the structure, and work from the structured version.
Why does this work? Several reasons:
Structure preserves reasoning. When an LLM creates a structured summary, it makes editorial decisions about what matters, how concepts relate, and what level of detail to preserve. These decisions encode a form of reasoning that raw retrieval discards. A chunk of text retrieved by cosine similarity carries no context about why it is important or how it connects to other knowledge.
Links beat embeddings at human scale. Explicit cross-references between documents ("this decision relates to that constraint") are more useful than implicit similarity scores when you have hundreds, not millions, of documents. You can follow a link. You cannot follow a vector.
Maintenance enables trust. Because the LLM continuously "lints" the knowledge base — checking for contradictions, updating stale information, flagging gaps — the output is something you can actually rely on. A RAG pipeline retrieves whatever matches the query, regardless of whether the source information is outdated, contradicted by newer data, or missing critical context.
Context windows keep growing. When Karpathy's approach would have been impractical — say, 2023-era 4K context windows — RAG was the only option. With modern context windows stretching to hundreds of thousands or millions of tokens, the calculus has changed. You can fit a lot of structured knowledge into a single prompt.
The implication is clear: for personal and team-scale knowledge, the winning architecture is not "retrieve and generate." It is "compile, structure, maintain, and reason."
The Emerging Landscape
Karpathy is not the only one exploring this territory. The space of AI-powered knowledge management is evolving rapidly, and the landscape tells an interesting story.
Google's NotebookLM has quietly become one of the most compelling products in this space. It ingests your documents — up to 500 sources — and creates what amounts to an AI expert on your specific corpus. No RAG pipeline, no vector database visible to the user. Just structured understanding of your material. The podcast generation feature gets the attention, but the real value is the knowledge synthesis.
Mem.ai raised $29 million on the promise of AI-native note-taking with automatic organization. It struggled to find product-market fit despite strong technology, illustrating a recurring pattern: building a standalone knowledge tool is hard when the knowledge lives elsewhere.
Rewind became Limitless, pivoting from full system capture to meeting-focused AI, before being acquired by Meta. The trajectory is telling — broad capture is technically impressive but practically overwhelming. Meeting-focused knowledge turned out to be more actionable.
Daniel Miessler's Fabric took a different approach entirely: open-source AI patterns that you run locally, treating AI as a Unix-style tool for processing information. No cloud, no subscription, no vendor lock-in. It resonated with a technical audience that wanted control.
Obsidian has deliberately stayed non-AI, maintaining its position as a tool for human-driven knowledge management. This is a principled stance, and Karpathy's choice of Obsidian as his wiki's frontend is notable — it suggests that the best knowledge interface might be separate from the AI that maintains the knowledge.
Notion has moved aggressively into AI agents that can reason across your workspace, blurring the line between document tool and knowledge system.
And threading through all of this is the Model Context Protocol (MCP), which is emerging as a standard way for AI systems to connect to external tools and data sources. MCP does not solve the knowledge problem, but it provides the plumbing that lets different approaches interoperate.
The pattern across all of these is convergence toward the same insight Karpathy articulated: raw data is not knowledge. Something has to compile it.
What This Means for Meetings
Here is where this gets concrete. Karpathy built his knowledge base from articles and research. But meetings are arguably a better fit for this approach — and a more urgent one.
Consider the properties of meeting knowledge:
- It is generated continuously, in every meeting, without effort
- It is inherently relational — involving people, decisions, projects, and commitments
- It compounds over time — the hundredth meeting in a project is far more valuable in context than in isolation
- It is perishable — the details fade within hours if not captured
- It is currently wasted at staggering scale
The "when does this become a product?" tension that Karpathy's post gestures at is answered most clearly in the meeting domain. His personal wiki requires curation and intent. A meeting knowledge base can be automatic — because meetings have a natural structure (participants, agenda, outcomes) and a natural trigger (the meeting itself).
This is exactly the thesis behind Proudfrog's approach to meeting intelligence. Raw audio becomes structured transcript. Structured transcript becomes decisions, entities, action items, and relationship context. And that structured output compounds across every meeting, building a knowledge graph that grows more valuable with each conversation.
The compound effect is what matters most. Your tenth meeting in a project is more useful than your first not because the transcript is better, but because the system now has context: who said what before, what was decided, what commitments were made and whether they were fulfilled. By your hundredth meeting, you have something that no amount of note-taking could replicate — a comprehensive, queryable history of how your work actually happened.
Meeting Applications of Karpathy's Concepts
Karpathy's framework maps remarkably well to the meeting domain. Each concept in his workflow has a direct analog:
Compilation
In Karpathy's system, compilation means taking raw data and structuring it into linked, formatted knowledge articles. For meetings, this means AI processing every recording into structured outputs: decisions extracted and tagged, entities identified and linked to previous appearances, action items captured with owners and due dates, topic summaries that connect to earlier discussions.
This is not transcription. Transcription gives you a record of what was said. Compilation gives you knowledge — structured, linked, queryable. The difference is the same as the difference between a pile of receipts and a financial statement.
Linting
Karpathy's "linting" step checks the knowledge base for contradictions, gaps, and stale information. In the meeting context, this becomes: flagging when a decision in Tuesday's meeting contradicts one from last month, surfacing overdue commitments that were never followed up on, identifying forgotten follow-ups, and highlighting when the same question keeps coming up without resolution.
This is where meeting knowledge starts to become genuinely active — not just a record, but an advisor. "You committed to delivering this by March 15. It is now March 28 and it has not been mentioned since." That is not a search result. That is intelligence.
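The overdue-commitment check in particular needs no AI at all once the knowledge is structured. A minimal sketch, assuming action items carry an owner, an ISO due date, and a completion flag (all hypothetical field names):

```python
from datetime import date

def overdue_commitments(action_items: list[dict], today: date) -> list[str]:
    """Flag commitments past their due date with no completion recorded.

    A toy version of one 'linting' rule; contradiction detection between
    meetings would require semantic comparison, not a date check.
    """
    flagged = []
    for item in action_items:
        if not item["done"] and date.fromisoformat(item["due"]) < today:
            flagged.append(
                f"{item['owner']} committed to '{item['text']}' "
                f"by {item['due']}; still open."
            )
    return flagged
```

The point is that structure makes this class of intelligence trivial: the hard part is the upstream compilation that produced `owner`, `due`, and `done` in the first place.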
The Self-Maintaining Wiki
Karpathy's wiki updates itself as new information arrives. For meetings, this means a knowledge graph that grows with every conversation — automatically updating project timelines, relationship maps, decision logs, and topic threads. No manual maintenance. No curation required. Each meeting feeds the system, and the system maintains itself.
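The self-maintenance step can be sketched as a fold: each newly compiled meeting is merged into a running knowledge graph, with superseded decisions marked rather than deleted so the history survives. The dict shapes here are hypothetical, for illustration only.

```python
def update_knowledge(graph: dict, compiled_meeting: dict) -> dict:
    """Fold a newly compiled meeting into a running knowledge graph.

    Sketch of the 'self-maintaining' idea: decisions accumulate,
    superseded ones are flagged (not erased), participants are tracked.
    """
    for d in compiled_meeting["decisions"]:
        graph.setdefault("decisions", []).append(d)
        if d.get("supersedes"):
            # Mark the earlier decision as superseded rather than deleting
            # it, preserving how the thinking evolved over time.
            for old in graph["decisions"]:
                if old.get("id") == d["supersedes"]:
                    old["status"] = "superseded"
    for person in compiled_meeting["participants"]:
        graph.setdefault("people", set()).add(person)
    return graph
```

Keeping superseded decisions around is a deliberate choice: it is what later lets the system answer "how did our thinking on this change?" instead of only "what is the current answer?".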
The Anti-RAG Finding at Meeting Scale
Karpathy's discovery that structured summaries outperform vector search at practical scale maps perfectly to the meeting domain. A person's meeting corpus, even an active professional's, typically runs to hundreds of documents, not millions. At this scale, structured knowledge with explicit links between meetings, people, and decisions is far more useful than similarity-based retrieval.
When you ask "What did we decide about the pricing model?", you do not want the three transcript chunks that are closest to your query in embedding space. You want a structured answer that draws on every meeting where pricing was discussed, in chronological order, with the evolution of the thinking preserved. That requires compilation, not retrieval.
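The compiled version of that query is a filter and a sort, not a nearest-neighbor lookup. A minimal sketch, assuming compiled meetings carry tagged decisions (the shapes are illustrative):

```python
def decision_history(meetings: list[dict], topic: str) -> list[str]:
    """Answer 'what did we decide about X?' from compiled meetings:
    every decision tagged with the topic, in chronological order.

    Contrast with RAG, which would return the k transcript chunks
    nearest the query in embedding space, in no particular order.
    """
    hits = [
        (m["date"], d["text"])
        for m in meetings
        for d in m["decisions"]
        if topic in d["tags"]
    ]
    # ISO dates sort correctly as strings, so the evolution is preserved
    return [text for _, text in sorted(hits)]
```

Because the tags were assigned at compilation time, recall is complete by construction: every pricing decision is in the answer, not just the ones that happen to sit near the query vector.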
The Competitive Picture
Most meeting tools today stop at transcription, or at best, basic summarization. The market segments roughly into tiers:
Commodity transcription — tools like Otter.ai and Fireflies.ai that record, transcribe, and provide searchable text. Useful, but flat. Your hundredth transcript is no more valuable than your first because there is no connection between them. These tools solve the capture problem but not the knowledge problem.
Note enhancement — tools like Granola that augment your existing note-taking with AI-generated context. Clever and practical, but still anchored to the individual meeting. The knowledge does not compound.
The Karpathy ideal — a system that compiles every meeting into structured knowledge, maintains the links between meetings, lints for contradictions and gaps, and enables reasoning across the full corpus. This is where the real value lives, and it is where surprisingly few tools operate.
The gap between "searchable transcripts" and "self-maintaining meeting knowledge base" is enormous. It is the difference between a filing cabinet and a colleague who has attended every meeting and can reason about the full history. Most of the market is still building better filing cabinets.
Proudfrog's approach is to build toward the Karpathy ideal specifically for meetings. The Explore feature already enables AI-powered querying across your entire meeting history — not keyword search, but genuine reasoning about decisions, commitments, people, and projects. The pipeline extracts structured knowledge from every meeting — key terms, entities, meeting notes with topic summaries, action items. And the knowledge compounds: each new meeting enriches the context available for every query.
Is this fully realized today? No. The vision Karpathy articulated — with automated linting, self-maintaining indexes, and seamless compilation — sets a bar that the entire industry is still reaching for. But the architecture is right, and the direction is clear.
What Comes Next
Karpathy's post matters because it raises the expectation ceiling for everyone. Once you see what LLM-compiled knowledge looks like — structured, linked, maintained, queryable — you cannot unsee it. You cannot go back to flat notes and keyword search and feel satisfied.
This has consequences for every tool that touches knowledge work:
Meeting tools will need to move beyond transcription. The table stakes are rising. If your meeting tool cannot reason across meetings, connect decisions to their outcomes, and surface forgotten commitments, it is a commodity.
Note-taking apps will need to incorporate compilation. Manual organization is not sustainable at the rate knowledge workers generate information. The structure has to be automated.
Enterprise knowledge platforms will need to reconcile RAG with compilation. At enterprise scale, retrieval is necessary. But Karpathy's finding suggests that the retrieval should operate over compiled, structured knowledge — not raw document chunks.
The broader shift is from tools that store information to tools that understand it. Karpathy demonstrated this with articles and research. For meetings, the opportunity is even larger because the information is richer, more relational, and generated continuously.
The question is not whether meeting tools will evolve in this direction. It is how quickly the market will move and who will get there first. Karpathy showed the destination. The race to build it as a product — particularly for the domain where it matters most, the daily meetings where work actually happens — is well underway.
See how Proudfrog approaches meeting knowledge or explore the pricing to get started.
Frequently Asked Questions
What is Karpathy's LLM Knowledge Base approach?
Andrej Karpathy described a workflow where LLMs "compile" raw information into structured, interlinked markdown articles — essentially a self-maintaining wiki. The LLM reads raw data, structures it with consistent formatting and cross-references, and then maintains the knowledge base over time by checking for contradictions, updating stale information, and generating new outputs from the accumulated knowledge. His key finding was that this approach outperforms traditional RAG pipelines at practical, personal-use scale.
Why does structured knowledge outperform RAG at personal scale?
RAG (retrieval-augmented generation) works by finding document chunks similar to your query using vector embeddings. This works well at massive scale but loses context at smaller scale. When you have hundreds of documents rather than millions, an LLM can read and structure all of them, preserving reasoning about how concepts connect, what supersedes what, and what the overall narrative is. Structured summaries with explicit links carry more meaning than similarity scores.
How does this apply to meetings specifically?
Meetings generate relational knowledge — decisions connected to people, projects linked to timelines, commitments tied to outcomes. This type of knowledge benefits enormously from compilation rather than simple storage. Instead of searching transcript text, a compiled meeting knowledge base lets you trace how decisions evolved, who was involved, what commitments were made, and whether they were fulfilled. Each new meeting enriches the full history.
What does Proudfrog do today versus what this vision implies?
Proudfrog currently records meetings without bots, transcribes with speaker identification, and processes each meeting through an AI pipeline that extracts decisions, key terms, entities, action items, and topic summaries. The Explore feature enables AI-powered querying across your full meeting history. The vision Karpathy describes — automated linting for contradictions, self-maintaining indexes, fully compiled knowledge graphs — represents the direction the product is heading, with some elements already in place and others on the roadmap.
Do I need to understand AI or knowledge graphs to use this?
No. The entire point of Karpathy's approach — and Proudfrog's implementation of it for meetings — is that the AI handles the compilation and maintenance. You attend your meetings as you normally would, and the system builds the knowledge base automatically. No manual tagging, no folder organization, no curation required. The workflow is designed to be invisible until you need to query the knowledge.
How is this different from just searching my meeting transcripts?
Searching transcripts finds keywords. A compiled knowledge base understands context. When you ask "What did we decide about the Q3 timeline?", transcript search returns every mention of "Q3" and "timeline." A knowledge base returns the specific decisions, in order, with the reasoning and who was involved — drawing on multiple meetings to give you the full picture. The difference is retrieval versus understanding.