MemPalace Stores Everything. Your Meetings Store Nothing. Here's the Gap.

MemPalace went from 0 to 23,000 GitHub stars in 72 hours by arguing that AI memory should never throw away anything. Yet it can't ingest a single meeting transcript. The largest source of organizational knowledge isn't even in the conversation.

Tags: mempalace, meeting-ai, ai-memory, knowledge-graph

The core argument of MemPalace fits in one sentence, and the README states it almost as a manifesto: AI memory systems should store everything verbatim and never let an LLM decide what is worth keeping. The README's framing of the competition is sharp: "Other memory systems try to fix this by letting AI decide what's worth remembering. It extracts 'user prefers Postgres' and throws away the conversation where you explained why." That sentence is why the project went viral in 72 hours. It is also the sentence that makes the meeting-transcript gap so glaring. A system whose entire reason for existing is to never throw context away has no path at all for ingesting the single largest source of organizational context most teams have.

This article maps that gap honestly — what MemPalace gets right, what it does not address, and what a meeting-native version of the same idea would have to look like.

23,000 stars in 72 hours

The context first, briefly, because the speed matters. MemPalace launched on April 6, 2026. Within 24 hours it had passed 5,000 GitHub stars. By April 8 it sat at roughly 23,000 stars and 3,000 forks, with 159 PRs already filed. The co-creators are actress Milla Jovovich (yes, that one) and engineer Ben Sigman, and the launch was carried by a tweet with more than 1.5 million impressions.

The benchmark claims that drove the launch — "first perfect score on LongMemEval, beating every product in the space" — turned out to be more complicated than the tweet suggested, and the community pulled them apart in real time. We covered the details of that controversy honestly in the AI memory week round-up, and we are not going to relitigate it here. The short version: the headline numbers compare retrieval recall against competitors' end-to-end QA accuracy, which is not a like-for-like comparison, and the "100% on LongMemEval" run involved patches written against the exact failing questions. That is real, and it is worth knowing.

But the architectural ideas underneath the marketing are real too, and that is what this article is actually about. The 19 read-write MCP tools. The local-first SQLite knowledge graph. The wings-rooms-halls-drawers spatial metaphor. The 4-layer loading system that boots an agent in roughly 170 tokens at the lowest setting. The insistence on storing original verbatim files in "drawers" that never get summarized away. These are real engineering choices with real consequences. Strip away the launch hype and there is a contribution here worth taking seriously.

So let us take it seriously, in the one direction that matters most for the people building meeting tools: what happens when you point MemPalace at a meeting?

What MemPalace actually accepts as input

The conversation miner is in mempalace/normalize.py. Five formats are supported, and the list is exact:

  1. Claude Code JSONL
  2. Claude.ai JSON
  3. ChatGPT conversations.json
  4. Slack workspace exports
  5. General text files

The mining modes are equally specific. --mode convos parses chat-style exchanges with turn-taking. --mode projects walks codebases across about 20 file extensions. --extract general runs a regex-based pass over plain text looking for surface patterns: decisions, preferences, milestones, problems, emotional moments. That is the surface area.

All five inputs share a property worth naming: they are written conversations between one human user and one AI, or asynchronous text channels with one author per message. The miner expects that shape. The > user marker, the assumption of short alternating turns, the absence of any speaker-identity tracking — every design choice in the normalizer reflects an input world that looks like chat.
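
The chat-shape assumption is easy to see in a toy version. This is not MemPalace's actual normalizer code, just an illustrative Python sketch of the pattern described above: a `> user` marker, two roles, and nowhere to put a speaker name or a timestamp.

```python
# Illustrative sketch of a chat-shaped miner (NOT MemPalace's normalize.py).
# It knows exactly two roles and one marker.

def parse_chat(text: str) -> list[dict]:
    """Split a chat-style log into alternating user/assistant turns."""
    turns = []
    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("> user"):
            turns.append({"role": "user", "text": block[len("> user"):].strip()})
        else:
            turns.append({"role": "assistant", "text": block})
    return turns

# A meeting-transcript line like "00:14:02 Sarah Chen: I disagree" falls
# through to the "assistant" branch: speaker and timestamp survive only as
# undifferentiated text inside the turn.
```

Feed it a real transcript and nothing crashes; the structure is simply flattened away.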

What MemPalace does not accept

The list of things the conversation miner does not handle is, in context, the entire shape of the meeting world:

  • No VTT format. WebVTT is the standard subtitle and transcript format used by Zoom and a long list of video tools. It is not in the normalizer.
  • No SRT format. The other dominant subtitle format used by transcription pipelines. Also absent.
  • No native Zoom transcript export. Not by structured field, not by file recognition.
  • No Microsoft Teams transcript export. Same.
  • No Google Meet transcript export. Same.
  • No speaker diarization awareness. Chat exports have one speaker per message by definition. Meeting transcripts have multiple speakers per session, with overlapping context, interruptions, and identity that has to be resolved across files.
  • No timestamp preservation. Meeting transcripts are anchored to the second. That anchoring is what lets you trace a decision back to the moment it was made. Nothing in the miner preserves it.
  • No agenda awareness. Meetings are organized around topics. Chats are reactive. The miner has no concept of topic boundaries inside a session.
  • No monologue support. A long uninterrupted stretch of one speaker — common in voice memos, lectures, and a lot of real meetings — does not look like a chat exchange and does not pass through the miner cleanly.
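
For contrast, here is roughly what the missing VTT support would have to preserve. This is an illustrative minimal parser written for this article (cue shape per the WebVTT convention of `<v Speaker>` voice tags), not anything from the MemPalace codebase. It keeps exactly the two fields the chat miner has no slot for: the speaker tag and the timestamps.

```python
# Minimal illustrative WebVTT cue parser. Real VTT has more features
# (settings, multi-line cues, cue identifiers); this keeps only the fields
# a meeting-memory system cannot afford to drop.
import re

CUE = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\n"
    r"(?:<v ([^>]+)>)?(.*)"
)

def parse_vtt(text: str) -> list[dict]:
    cues = []
    for m in CUE.finditer(text):
        start, end, speaker, line = m.groups()
        cues.append({"start": start, "end": end,
                     "speaker": speaker or "unknown", "text": line.strip()})
    return cues
```

Twenty lines of parsing is not the hard part; the hard part is giving the downstream memory model somewhere to put `speaker` and `start` once you have them.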

Two more facts make the gap impossible to read as accidental. The word "meeting" appears zero times in the MemPalace documentation and codebase. The word "transcript" appears zero times. This is not an oversight to be patched in a 1.0.1 release. It is a design space the project never entered.

That is fine, by the way. Projects are allowed to scope themselves. The point is not that MemPalace is wrong to focus on chat artifacts. The point is that the verbatim-first philosophy is exactly the philosophy meetings need, and the project that articulated it most loudly is not yet anywhere near the inputs where it would matter most.

Why meetings are structurally different from chats

It is worth being precise about why a chat-shaped memory system cannot just absorb a meeting by treating it as one more text file. The structural differences are not cosmetic.

Multi-speaker conversations within a single session. A chat export is a sequence of one-author messages. A meeting transcript is a single session with multiple speakers whose turns overlap, whose context depends on each other, and whose identities have to be tracked across the entire file rather than per-message.

Long monologues followed by reactive discussion. Real meetings alternate between someone talking for ten minutes and a flurry of two-line responses. Chat miners assume short alternating turns; the monologue breaks that assumption outright, and the rapid-fire replies only make sense as reactions to it, not as self-contained turns.

Agenda-driven flow. Meetings are structured around topics that span multiple speakers. The same topic can run for fifteen minutes, then return briefly thirty minutes later. A unit-of-meaning here is not "a turn" — it is "a discussion of X across multiple turns by multiple people." Chat miners have no concept of this.

Cross-meeting continuity. Sarah from the customer team in this week's call is the same Sarah next week, even if she shows up in a meeting that MemPalace would file under a different room. Speaker identity has to resolve across files, not within them. Chat exports do not have this problem because the user is always one person.

Decision moments anchored to timestamps. When a meeting decides something, the decision exists at a specific point in the recording. That timestamp is how you audit the decision later. A memory system that drops timestamps at ingest cannot answer "when exactly was this decided," which is the question that matters most when decisions go sideways.

Action items that span meetings. Something assigned in meeting A and resolved in meeting B is not two unrelated facts. It is one thread. A chat-shaped extractor sees them as independent observations. A meeting-shaped extractor has to model the thread itself.

A memory system designed for chat treats each turn as an independent unit and lets the corpus shape itself from the bottom up. A memory system for meetings has to treat the full conversation as a unit, while still extracting structured signals — decisions, commitments, entity mentions — that get tracked across sessions. These are not the same job.
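
The structural gap can be made concrete with two data shapes. These are assumed for illustration, not taken from any codebase: the unit a chat miner works with versus the unit a meeting-native ingest has to model.

```python
# Illustrative data shapes only. The point is what fields exist at all.
from dataclasses import dataclass, field

@dataclass
class ChatMessage:
    """The unit a chat-shaped miner models: one author, one blob of text."""
    author: str   # always one of two roles
    text: str

@dataclass
class MeetingTurn:
    """One speaker turn inside a session."""
    speaker_id: str   # resolved identity, stable across meetings
    start_s: float    # timestamp anchor inside the recording
    end_s: float
    text: str

@dataclass
class MeetingSession:
    """The real unit of meaning for a meeting: the whole session."""
    meeting_id: str
    turns: list[MeetingTurn] = field(default_factory=list)
    # agenda topics as (topic, start_s, end_s) spans over the session
    topics: list[tuple[str, float, float]] = field(default_factory=list)
```

Every field in `MeetingTurn` and `MeetingSession` that is absent from `ChatMessage` corresponds to one of the structural differences listed above.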

What MemPalace gets right

This is the part of the article where it would be cheap to be a hater. The honest read is the opposite: most of MemPalace's load-bearing architectural choices are exactly what a meeting-native memory system would also need. The project is closer to "right pattern, wrong inputs" than "wrong pattern."

Verbatim storage as the primitive. The "drawer" pattern — original files preserved exactly, never summarized over — is the right base layer for meeting transcripts. Meetings are precisely the kind of source where extraction loses critical context. The phrasing matters. The hesitation matters. The follow-up question that nobody answered matters. Storing the verbatim transcript and treating any compiled artifact as a derived view on top is the architecture meeting tools should already be using and mostly are not. (For the broader version of this argument, see the meeting-to-wiki gap.)

Local-first by default. Meeting transcripts are some of the most sensitive data an organization owns. They contain client conversations, internal disagreements, named individuals discussing things they would not write down. The Nordic data sovereignty concerns are real and they are not going away. MemPalace's default of running on local SQLite and ChromaDB, with no cloud round-trip required, is the right starting position. Most meeting tools currently default the other way.

Read-write MCP tools. This is the breakthrough nobody is naming clearly enough. Almost every meeting MCP server shipped to date is read-only — agents can search transcripts and pull summaries, but nothing writes back. MemPalace ships 19 tools and the list explicitly includes add_drawer, kg_add, kg_invalidate, and diary_write. Write operations are first-class. That is the architectural shape meeting tools need to evolve toward, and MemPalace is currently the clearest example of it in the open-source memory space.

The 4-layer loading system. The split into L0 identity, L1 critical facts, L2 room recall, L3 deep search is a real engineering choice with a measurable cost. Independent reproduction puts actual L0+L1 closer to 600–900 tokens than the README's 170, but the principle holds: an agent that needs context every time it spins up benefits enormously from a tiered loading model where most of the information is fetched only when asked for. Meeting agents that have to brief themselves on a person or a project before answering need exactly this shape.
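
The tiered-loading principle fits in a few lines. The layer names follow the description above; the loader itself and its contents are illustrative assumptions, not MemPalace code, and the token costs are whatever each layer's content happens to be.

```python
# Illustrative tiered loader. Only the layers up to the requested depth are
# materialized at boot; deeper layers stay lazy until a query asks for them.
LAYERS = {
    "L0_identity": lambda: "agent identity card",         # always loaded
    "L1_critical": lambda: "pinned critical facts",       # always loaded
    "L2_room":     lambda: "recall for the active room",  # on demand
    "L3_deep":     lambda: "full-text / vector search",   # per query
}

def boot_context(depth: int = 1) -> list[str]:
    """Materialize layers L0..L{depth}; everything deeper stays unloaded."""
    names = list(LAYERS)
    return [LAYERS[name]() for name in names[: depth + 1]]
```

A meeting agent booting at depth 1 pays for an identity card and a handful of pinned facts, and fetches transcript-level detail only when a question actually requires it.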

The schema-as-file pattern. MemPalace's structure is defined in plain files the LLM reads at boot. That is the same pattern Karpathy described for the LLM-maintained wiki — a CLAUDE.md or AGENTS.md that tells the model how to compile content into the knowledge base. It is portable, version-controllable, and forces the human to think clearly about what the knowledge base actually is. Meeting tools have not adopted this pattern at all. They should.

What MemPalace gets wrong (or does not get at all) for meetings

Now the other side. Take the ideas above as given, and the gaps for the meeting case sharpen quickly.

The wings-rooms-halls metaphor does not map onto meetings naturally. Meetings are episodic, not topical. A single one-hour meeting touches five "rooms" worth of subject matter. Forcing it into a single room loses the cross-topic context. Splitting it across rooms loses the session unity. Neither is right. Chat memory is naturally room-shaped because each conversation is mostly about one thing. Meetings are not.

The conversation miner expects chat patterns, not meeting structure. Documented above. The > user marker, the assumed turn cadence, the lack of any speaker-continuity tracking — none of it survives contact with a real transcript file.

The knowledge graph is real but limited, and contradiction detection does not actually exist. This one is important and surprising. The README repeatedly demonstrates contradiction detection ("AUTH-MIGRATION: attribution conflict — Maya was assigned, not Soren"). Leonard Lin's independent code review found zero occurrences of "contradict" in knowledge_graph.py. A separate fact_checker.py exists but is not wired into the KG operations. The project has acknowledged this. For meetings, contradiction detection is one of the most valuable possible features — did this week's decision contradict last month's? — and right now MemPalace does not implement it, even for the inputs it does support.
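
For readers wondering what the missing feature would even look like: a minimal sketch of contradiction detection over subject-predicate-object triples with temporal validity windows. This is hypothetical code written for this article, explicitly not something MemPalace implements today.

```python
# Illustrative contradiction check: two triples conflict if they share
# (subject, predicate), disagree on object, and their validity windows overlap.
def find_contradictions(triples: list[dict]) -> list[tuple[dict, dict]]:
    conflicts = []
    for i, a in enumerate(triples):
        for b in triples[i + 1:]:
            same_key = (a["s"], a["p"]) == (b["s"], b["p"])
            overlap = (a["valid_from"] < b["valid_to"]
                       and b["valid_from"] < a["valid_to"])
            if same_key and overlap and a["o"] != b["o"]:
                conflicts.append((a, b))
    return conflicts

# The README's AUTH-MIGRATION attribution conflict, rendered as triples
# (validity windows are illustrative):
t1 = {"s": "AUTH-MIGRATION", "p": "assigned_to", "o": "Maya",
      "valid_from": 1, "valid_to": 10}
t2 = {"s": "AUTH-MIGRATION", "p": "assigned_to", "o": "Soren",
      "valid_from": 3, "valid_to": 10}
```

The naive pairwise scan is quadratic and a real graph would index on (subject, predicate), but the semantics are the interesting part: for meetings, "did this week's decision contradict last month's" is exactly this query with meeting-anchored validity windows.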

No speaker identity resolution across meetings. "Sarah" in meeting one and "Sarah Chen" in meeting five are different entities to MemPalace. There is no voice fingerprinting, no name-merging pass, no entity resolution layer designed for conversational audio. For meeting knowledge to compound — the entire reason you would build this in the first place — speaker identity has to be solved at the source, not retrofitted at query time.
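
The textual half of that resolution step can be sketched as follows. This is a hypothetical, deliberately naive name-merging pass; a real system would also need voice fingerprints and disambiguation for two people who share a first name.

```python
# Naive illustrative merge: map each observed speaker name to the longest
# name in the corpus that shares its first token.
def merge_speakers(names: list[str]) -> dict[str, str]:
    canonical = {}
    full_names = sorted(names, key=len, reverse=True)
    for name in names:
        first = name.split()[0]
        match = next((f for f in full_names if f.split()[0] == first), name)
        canonical[name] = match
    return canonical
```

Even this crude pass is enough to unify "Sarah" in meeting one with "Sarah Chen" in meeting five; the point is that no pass of any kind exists in the current pipeline.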

No decision extraction. The --extract general mode runs regex against text looking for decisions and preferences. That catches surface patterns. It does not understand meeting structure, which is where most decisions actually live ("okay, so we are going with the second option" is structurally a decision precisely because of the discussion that came before it). Meeting transcripts need extraction that is aware of the conversational context, not pattern-matching against keywords.

The verbatim-vs-extraction debate, applied to meetings

Step back. MemPalace's "store everything verbatim" position is one pole of an old debate. The other pole is meeting tools like Granola, which store enhanced notes brilliantly and largely throw the raw transcript away. Both poles have a real argument behind them. Verbatim defenders are right that extraction loses context. Extraction defenders are right that nobody re-reads a four-hour transcript.

The honest answer is that this is a false binary. The right architecture stores both:

  • The verbatim transcript, immutable, so any moment can be replayed and any extraction can be re-derived.
  • Structured signals — decisions, entities, action items, speaker identities — extracted at ingest and tracked across sessions as first-class objects.
  • A single query interface that lets you reach into either layer depending on what you need.
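
The three bullets above can be sketched as one store with two layers behind a single query surface. All names here are illustrative assumptions, not any shipping API.

```python
# Illustrative dual-layer store: an immutable verbatim layer plus a derived
# signal layer, reachable through one query method.
class MeetingStore:
    def __init__(self):
        self.transcripts = {}  # meeting_id -> verbatim text, never mutated
        self.signals = []      # derived: decisions, action items, entities

    def ingest(self, meeting_id, transcript, extract):
        self.transcripts[meeting_id] = transcript       # layer 1: verbatim
        for sig in extract(transcript):                 # layer 2: derived
            self.signals.append({"meeting_id": meeting_id, **sig})

    def query(self, text=None, kind=None):
        if kind:  # structured query hits the signal layer
            return [s for s in self.signals if s["kind"] == kind]
        # free-text query hits the verbatim layer
        return [mid for mid, t in self.transcripts.items() if text in t]
```

Because the verbatim layer is never mutated, any extraction pass can be thrown away and re-run, which is the property that makes the two layers a complement rather than a trade-off.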

This is also the architecture Karpathy described for the LLM-maintained wiki, with raw sources in one folder and a compiled, regeneratable wiki on top. If you have not read the workflow guide or the meeting-knowledge framing, they make the same point at length. The skeptical version of the same conversation, including the token-cost and contamination concerns, is in the LLM wiki skeptic's guide.

MemPalace has taken the verbatim half seriously and the extraction half lightly. Most meeting tools have done the opposite. Neither is finished.

What a meeting-native MemPalace would look like

If you took the architectural philosophy of MemPalace and rebuilt it for meetings from scratch, the components are not mysterious. The list reads like a product spec.

  • Native ingestion of meeting formats. VTT, SRT, Zoom, Teams, Google Meet, Otter exports, raw audio piped through transcription. Anything that produces words spoken by humans in a room.
  • Speaker resolution as a first-class operation. Voice fingerprints, name extraction from context, cross-meeting identity merging. "Sarah Chen" in week five gets unified with "Sarah" in week one without the user having to ask.
  • Decision and action item extraction in the ingest pipeline. Not as a downstream chat feature. As part of how a meeting becomes a memory. Extracted items live as first-class entities in the graph, tagged with the source meeting and the timestamp inside it.
  • Cross-meeting synthesis as a persistent artifact. "Everything Sarah committed to across the last six meetings" should not be a chat answer that disappears when you close the tab. It should be a markdown file that exists tomorrow and updates itself incrementally after the seventh meeting.
  • An MCP tool surface for both read and write. Read tools for searching transcripts and pulling entities. Write tools for ingesting new meetings, updating decision logs, merging speaker identities, flagging contradictions. The write half is the part most meeting tools are missing today.
  • Local-first architecture with EU data residency as the default. Not as an enterprise add-on. As the starting position. This is table stakes for European customers and increasingly for everyone else.

This is not a hypothetical product. We have been building it at Proudfrog for months — the Proudfrog MCP server, currently in beta with five read-only tools and write tools landing next, is the bridge we are putting between meeting transcription and the Karpathy/MemPalace memory pattern. That is the only passage in this article about our own product, and it is here because the gap this article describes is the gap we have been building toward.

The open question

The verbatim-storage-of-meetings problem is now visible in a way it was not two weeks ago. MemPalace put one half of the answer on the table, very loudly. The meeting tools have most of the other half but do not yet realize they need MemPalace's half. The question for the next twelve months is who closes the distance. Does MemPalace add meeting support? Does a meeting tool adopt MemPalace's verbatim-first, write-back, local-first architecture? Does a new entrant arrive and do both at once?

Whatever the answer, the gap is now legible, and that visibility is the part that was missing. You can argue about who is best positioned to close it. You can no longer argue that it is not there.


Frequently Asked Questions

Why doesn't MemPalace support meeting transcripts?

It was not built for them. MemPalace's conversation miner is designed for chat-shaped artifacts — Claude Code logs, ChatGPT exports, Slack — where each message has one author and the turn cadence is short. Meeting transcripts have a fundamentally different structure: multiple speakers per session, long monologues, agenda-driven topic flow, and timestamps that anchor decisions. Supporting them would require adding native parsers for VTT, SRT, and the major meeting platforms' export formats, plus a speaker-resolution layer that does not currently exist anywhere in the codebase. It is a different design space, and the words "meeting" and "transcript" both appear zero times in the project's documentation. That is the cleanest signal that this was a deliberate scoping choice rather than an oversight.

Could I just save my meeting transcript as a text file and feed it to MemPalace?

You can, and it will not crash. The general text mode will accept it, the regex-based extractor will pull surface patterns out of it, and the result will land in a drawer. What you will lose is everything that makes a meeting transcript useful: speaker identity, timestamps, agenda boundaries, the distinction between a monologue and a discussion, and any cross-meeting continuity for the people involved. You will end up with a verbatim blob that is searchable but not structured. That is better than nothing, and much worse than a meeting-shaped pipeline would give you.

What is the difference between MemPalace's verbatim storage and Otter's full transcription?

Otter stores the transcript and exposes it through search and a chatbot. MemPalace stores raw artifacts in "drawers" and treats them as immutable source material that everything else is derived from, with explicit write tools for updating a knowledge graph on top. The philosophical difference is what each system thinks the storage is for. Otter treats the transcript as the product. MemPalace treats the verbatim file as the trusted source layer underneath a compiled, queryable graph. The Otter approach is simpler to use today; the MemPalace approach is closer to the architecture you want if you care about knowledge that compounds across sessions.

Is the MemPalace knowledge graph good enough for meeting decisions?

In its current form, no. The graph is real — SQLite-backed, subject-predicate-object triples with temporal validity windows — and the temporal model is genuinely useful. But contradiction detection, which is one of the most valuable possible features for a meeting decision log, is not implemented in the code despite the README implying otherwise. Independent code review by Leonard Lin confirmed this and the project acknowledged it. For tracking decisions across meetings — "did this week's call contradict last month's?" — the graph is a starting point, not a finished tool.

How is Proudfrog different from a meeting-native version of MemPalace?

The shortest way to put it: Proudfrog starts where MemPalace ends. Proudfrog handles the parts MemPalace does not — capture, transcription, speaker diarization, identity resolution across meetings, decision and action-item extraction, and a knowledge layer that updates itself as new meetings arrive — and is built local-first with EU data residency from the beginning. The Proudfrog MCP server (read tools today, write tools next) is intended to play the same role for meetings that MemPalace's MCP plays for chats. We are not the only people who could close this gap, but we have been building toward exactly this shape for months, and a lot of the design choices in MemPalace look familiar to us for that reason.

Should I wait for MemPalace to add meeting support?

If you need it today, no. The project is three days old at the time of writing, the contributors are responsive but the roadmap is busy with much more immediate issues (the benchmark methodology, the AAAK compression claims, the contradiction-detection gap), and meeting ingestion would require a substantial new subsystem rather than a small patch. If you are a developer who wants to experiment with the ideas, fork it and try writing a VTT normalizer — it is exactly the kind of contribution the project would probably welcome. If you need a working pipeline for your team's meetings now, look at tools that were built for meetings from the start and that take the verbatim-and-compiled architecture seriously. The knowledge workers use case and the features page are the closest descriptions of how we think about it.