Karpathy's LLM Knowledge Base Workflow — Let AI Build Your Personal Wikipedia
You know that feeling — you read a bunch of amazing articles, every single one feels life-changing in the moment, and three days later you can’t remember any of them?
Andrej Karpathy apparently got fed up with this too. He recently shared a workflow on X that isn’t a new model or a new paper, but something that sounds deceptively simple yet fundamentally changes how humans interact with knowledge: using LLMs to build a personal knowledge base.
He says a significant chunk of his recent token usage has shifted from manipulating code to “manipulating knowledge (stored as markdown and images).” Not reading and forgetting — reading, then having LLMs organize, categorize, link, and query that knowledge for you.
This is a completely different game from opening Notion and asking ChatGPT to polish your notes.
Data Ingestion: A Knowledge Compiler
Karpathy’s first step is to throw everything — papers, articles, repos, datasets, images — into a raw/ folder. Then he lets LLMs “compile” that pile of stuff.
Why “compile”? When you write code, a compiler turns human-readable source into machine-readable output. Karpathy flips it — the LLM takes a mess of raw materials and “compiles” them into a structured wiki that humans can easily digest. A directory of .md files with auto-generated summaries, backlinks, categories, one article per concept, everything cross-linked.
He uses Obsidian’s Web Clipper to convert web articles into .md files, and has set up a keyboard shortcut that downloads all related images locally so the LLM can read visual content too.
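A minimal sketch of what this “compile” step could look like. The `llm` callable, the prompt wording, and the one-page-per-source layout are illustrative assumptions, not Karpathy’s actual setup (and for brevity it only handles markdown captures, not PDFs or images):

```python
from pathlib import Path

def compile_raw_to_wiki(raw_dir: Path, wiki_dir: Path, llm) -> list[Path]:
    """'Compile' every raw capture into one wiki page per source file.

    `llm` is any callable mapping a prompt string to markdown text --
    a stand-in for a real model call (hypothetical interface).
    """
    wiki_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(raw_dir.glob("*.md")):
        prompt = (
            "Rewrite this capture as a wiki article: add a one-paragraph "
            "summary, a category line, and [[backlinks]] to related "
            "concepts.\n\n" + src.read_text()
        )
        page = wiki_dir / src.name
        page.write_text(llm(prompt))
        written.append(page)
    return written
```

The point of the sketch is the direction of the arrow: the human only fills `raw_dir`; everything under `wiki_dir` is machine-written output.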
Clawd would add:
The “knowledge compiler” concept is worth pausing on. The traditional note-taking flow is: you read → you take notes → you organize notes → you search notes. Four steps, all done by you. Karpathy outsourced the middle two to LLMs. Humans now only handle “feed raw material” and “ask questions.” That’s not laziness — it’s redefining what a knowledge worker actually does. From librarian to curator. (◕‿◕)
ShroomDog’s inner monologue:
When I first saw Karpathy’s flow, I did a double take — because gu-log is basically doing the same thing.
Our SP/CP pipeline is literally a “knowledge compiler”: raw materials (Twitter threads, blog posts, papers) go in, LLMs compile them into structured Chinese blog articles. The difference is output format — Karpathy produces a private wiki for personal lookup, gu-log produces a public blog for readers. But the core is identical: raw data → LLM compile → structured knowledge.
Except gu-log actually does one extra layer Karpathy doesn’t: translation + editorial voice. ClawdNote is AI’s opinion injection, ShroomDogNote (yes, the thing you’re reading right now) is the human author’s opinion injection. It’s not just organize — it’s transform, with curatorial judgment and attitude baked in. In a way, it’s more “compiled” than a private wiki.
And the fact that you’re reading this article right now is peak meta — an LLM pipeline wrote an article about “using LLMs to build knowledge bases.” Inception vibes.
Obsidian as Frontend, LLM as Backend
His choice of tooling reveals a lot about the design philosophy. He uses Obsidian as the “frontend” — browsing raw data, the compiled wiki, and various visualizations.
But here’s the key: he almost never manually edits the wiki content. All writing and maintenance is done by the LLM. He just uses Obsidian to “view” the results.
In web terms: Obsidian is a read-only viewer, the LLM is the backend API + database writer. You don’t modify the database directly — you go through the API (i.e., talking to the LLM) to make changes.
Wait — do you realize what this means? The user got demoted from author to reader. The old note-taking model was “you write, you organize, you search.” Now it’s “you dump data in, AI organizes, you ask questions.” Your role shifted from being the librarian who manually shelves books to being the person sitting on the couch asking the librarian questions.
Clawd’s musings:
“Second brain” has been used to death as a metaphor, but Karpathy might be the first person to make it literal. Previously, “second brain” meant “your note system is well-organized so you can find things fast.” Now it means you actually have an external brain processing information for you — and it organizes its own shelves, writes its own summaries, builds its own indexes. You just feed it and ask questions. ┐( ̄ヘ ̄)┌
ShroomDog’s hands-on take:
gu-log’s architecture is almost identical to Karpathy’s — just swap Obsidian for a Vercel-hosted Astro website. Why not Obsidian? Honestly, it comes down to mobile experience. gu-log produces a public blog, and readers (including myself) mostly read on their phones. With Obsidian, you either install the app or pay for vault sync — and even then, sharing an article with someone means… what, exporting a PDF? With Vercel, it’s just a URL. Any device, zero setup.
The backend is also LLM-managed: Claude Code runs the pipeline, writes articles, Ralph scorer grades them, auto-commit + push. I basically just feed in source material and review the output — which is exactly Karpathy’s “only use Obsidian to view, never manually edit the wiki” philosophy.
When Your Wiki Gets Big Enough to “Research”
Up to this point, you might think: “So it’s fancy note organization?” No. When the quantity gets big enough, a qualitative shift happens.
Karpathy says one of his research topic wikis has grown to about 100 articles and 400,000 words. Four hundred thousand words. That’s not a notebook — that’s an encyclopedia. At this scale, you can start asking your LLM agent complex questions that require cross-referencing dozens of articles, and it’ll dig through the wiki, compare sources, and assemble answers on its own.
And — this is the part that makes RAG companies nervous — he originally expected to need fancy RAG (Retrieval-Augmented Generation), but found that at this scale, the LLM manages “pretty good” with just auto-maintained index files and document summaries. He hedged with “~small scale,” meaning it might not hold at much larger sizes. But at the 400K-word level, no vector database needed.
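To make the “no fancy RAG” point concrete, here is a hedged sketch of index-based retrieval: the model never gets the full corpus up front, only an auto-maintained index of titles and one-line summaries, and it names the files worth opening. The index format and the `llm` callable are assumptions for illustration:

```python
def parse_index(index_md: str) -> dict[str, str]:
    """Parse an auto-maintained index file whose lines look like
    '- [[Title]]: one-line summary' into {title: summary}."""
    entries = {}
    for line in index_md.splitlines():
        line = line.strip()
        if line.startswith("- [[") and "]]:" in line:
            title, _, summary = line[4:].partition("]]:")
            entries[title.strip()] = summary.strip()
    return entries

def pick_relevant(entries: dict[str, str], question: str, llm) -> list[str]:
    """Let the model choose which articles to open, given only the index.
    No embeddings, no vector store -- the summaries ARE the retrieval layer."""
    listing = "\n".join(f"{t}: {s}" for t, s in entries.items())
    reply = llm(
        f"Question: {question}\nIndex:\n{listing}\n"
        "Answer with the titles to open, one per line."
    )
    return [t for t in reply.splitlines() if t in entries]
```

At 100 articles, the whole index fits comfortably in a frontier model’s context window, which is presumably why this works at all; the “~small scale” hedge is where that assumption breaks.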
Clawd’s honest take:
Speaking of scale — gu-log itself is actually much larger than Karpathy’s wiki. As of April 2026, gu-log has 422 articles totaling over 1.15 million words (Chinese characters + English terms combined). Karpathy’s wiki is 100 articles / 400K words; gu-log is nearly 4x the article count and 3x the word count. Granted, gu-log is a public blog rather than a private wiki, so the use case differs — but in terms of sheer “volume of structured knowledge compiled by LLMs,” it’s a substantial corpus.
As for RAG, Karpathy says he doesn’t need fancy RAG at this scale — index files and summaries do the job. How many Series A RAG startups felt a chill reading that? Of course he hedged with “pretty good” and “~small scale.” But when the former Tesla AI Director says “I don’t need it for my use case,” that at least suggests RAG’s product-market fit might be narrower than many founders are pitching. ( ̄▽ ̄)/
Every Curiosity Is a Deposit for Your Future Self
Most people interact with LLMs like this: type in a terminal or chat window, read the reply, move on. That conversation disappears into history. Karpathy doesn’t play that game.
He has LLMs render answers in various formats — Markdown for Obsidian, Marp slides for presentations, matplotlib charts for visualization. But the format variety isn’t the real point. The real point is that he “files back” query outputs into the wiki.
Picture this cycle: you ask a question → the LLM digs through the wiki and assembles an answer → that answer itself becomes part of the wiki → next time anyone (including future-you) asks a related question, that answer is already there. He wrote: “my own explorations and queries always add up in the knowledge base.” Every time you’re curious about something, you’re making a deposit into a knowledge savings account for your future self.
This isn’t a throwaway chat log. It’s a self-growing knowledge organism, and its nutrients come from your daily explorations. A question you casually asked today might become a critical puzzle piece in some complex reasoning chain six months from now.
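The file-back step might look something like this sketch. The `llm` callable, the `answers/` folder, and the index format are all illustrative assumptions, not the actual pipeline:

```python
from datetime import date
from pathlib import Path

def answer_and_file(question: str, wiki: Path, llm) -> Path:
    """Answer a question, then file the answer back into the wiki so the
    next query can cite it. `llm` stands in for a real model call."""
    answers = wiki / "answers"
    answers.mkdir(parents=True, exist_ok=True)
    text = llm(f"Using the wiki at {wiki}, answer: {question}")
    # slugify the question into a filename
    slug = "".join(c if c.isalnum() else "-" for c in question.lower())[:60]
    page = answers / f"{date.today()}-{slug}.md"
    page.write_text(f"# {question}\n\n{text}\n")
    # keep the index current so future queries can find this answer
    with (wiki / "index.md").open("a") as idx:
        idx.write(f"- [[{page.stem}]]: {question}\n")
    return page
```

The last three lines are the whole trick: without the index append, the answer is just another orphaned file; with it, the answer becomes retrievable raw material for the next question.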
Clawd’s inner theater:
This “file back query results” pattern reminds me of compound interest. Most people’s ChatGPT conversations are like day wages — you get a reply, then it evaporates. Karpathy’s approach deposits every conversation’s output into an account where it keeps earning interest. Three months later, your wiki has three hundred answers to questions you’ve already forgotten asking, and the LLM can cite those old answers when tackling new questions. Knowledge compound interest. (๑•̀ㅂ•́)و✧
Lint Your Knowledge: The Most Underrated Killer Feature
Everything so far has been in the “oh, that’s clever” territory. But this next part is the idea that made me stop reading and stare at the ceiling for a minute.
Karpathy runs LLM “health checks” on his wiki. If you’ve written code, you know linting — ESLint catches syntax issues, type checkers catch type errors. Karpathy brought the same concept to knowledge.
The LLM scans the entire wiki and finds places where the same concept is described contradictorily across different articles. It uses web search to fill information gaps. And the wildest part — it finds potential connections between two seemingly unrelated concepts and tells you: “Hey, have you noticed A and B might be related? Want to dig deeper?”
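A toy version of such a lint pass, with `llm` again a hypothetical callable. Note the pairwise comparison is O(n²) in article count; a real pass over 100 articles would presumably batch or shard rather than compare every pair:

```python
from itertools import combinations

def lint_wiki(articles: dict[str, str], llm) -> list[str]:
    """One lint pass over a wiki held as {name: text}: compare every pair
    of articles and collect the contradictions or missed connections the
    model reports. An empty reply means the pair is clean."""
    findings = []
    for (name_a, text_a), (name_b, text_b) in combinations(articles.items(), 2):
        reply = llm(
            "Do these two articles contradict each other, or hint at an "
            "unstated connection? Reply with one finding per line, or "
            "nothing if clean.\n\n"
            f"## {name_a}\n{text_a}\n\n## {name_b}\n{text_b}"
        )
        findings.extend(
            f"{name_a} <-> {name_b}: {line}"
            for line in reply.splitlines() if line.strip()
        )
    return findings
```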
The most valuable breakthroughs in academic research often don’t come from “I deliberately searched for X.” They come from “I was looking for X and accidentally discovered a connection between Y and Z.” That phenomenon has a name: serendipity. What Karpathy is doing is essentially systematically manufacturing serendipity.
Clawd murmur:
“Linting knowledge” — those two words alone are worth the price of admission for this article. Think about it: your brain definitely holds contradictory beliefs, you just don’t realize it. You might simultaneously believe “the early bird catches the worm” and “good things come to those who wait” — those beliefs contradict each other in specific contexts. A system that automatically surfaces these contradictions isn’t organizing knowledge. It’s upgrading the quality of your thinking. Knowledge linter: my nomination for the most underrated AI use case of 2026. (╯°□°)╯
Tools Building Tools: The Beauty of Recursion
Karpathy also mentioned building additional tools for his wiki data. He vibe-coded a small search engine with a web UI, but the more common usage is wrapping it as a CLI tool that the LLM can call when handling larger queries.
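The search tool doesn’t need to be sophisticated to be useful to an agent. Here is a sketch of the kind of thing the LLM could shell out to (plain term-frequency scoring, no embeddings; everything here is illustrative, not Karpathy’s actual tool):

```python
import re
from pathlib import Path

def search(wiki: Path, query: str, top: int = 5) -> list[tuple[int, Path]]:
    """Rank wiki pages by how often the query terms appear.
    Crude, but an agent calling this from the shell only needs
    'good enough' recall -- it reads the top hits itself afterwards."""
    terms = [t.lower() for t in re.findall(r"\w+", query)]
    scored = []
    for page in wiki.rglob("*.md"):
        text = page.read_text().lower()
        score = sum(text.count(t) for t in terms)
        if score:
            scored.append((score, page))
    return sorted(scored, reverse=True)[:top]

# Wrapped as a CLI (e.g. `python search.py "attention variants"` printing
# score + path per line), this becomes a tool the LLM can invoke while
# working through a larger query.
```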
Notice the recursive structure: LLM builds a wiki → human builds a search engine on top of the wiki → LLM uses that search engine to better operate the wiki → wiki gets better → which enables building even better tools. Tools building tools, each layer reinforcing the next. This isn’t linear improvement — it’s exponential.
Clawd’s friendly reminder:
This pattern points the same direction as current agent frameworks — making LLMs not just talk but “do things.” But Karpathy’s version is more interesting because his tool chain orbits a specific knowledge base, not a generic “I can do anything” setup. Constrained tools are often more powerful because they understand their domain more deeply. A Swiss Army knife cuts everything; a sushi chef’s specialized blade cuts art. (⌐■_■)
ShroomDog’s hands-on take:
This “tools growing organs” pattern has a living example at ShroomDog’s day job.
The team maintains a development document — essentially a Git-tracked Markdown repo with a remote. Sounds basic, right? But with Obsidian as the frontend and Claude Code as the backend manager, the repo grew two surprisingly useful limbs: Claude Skills and Claude Commands.
Here’s a concrete example: I wrote a read-only SQL skill that lets Claude do Text-to-SQL queries against our internal database. Because the whole system is a Git repo, when teammates pull, they automatically get this skill — no extra setup, no permission grants, no onboarding docs. The tool travels with the repo. Version-controlled, auto-synced, grows with the team.
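The “read-only” property of a skill like that ultimately rests on a guardrail. One way such a check might be sketched (illustrative only, and no substitute for also giving the skill genuinely read-only database credentials):

```python
import re

def is_read_only(sql: str) -> bool:
    """Guardrail for a read-only Text-to-SQL skill: allow a single
    SELECT (or WITH ... SELECT) statement and nothing else."""
    # strip -- line comments and /* block comments */, then trailing ';'
    stripped = re.sub(r"--[^\n]*|/\*.*?\*/", "", sql, flags=re.S).strip().rstrip(";")
    if ";" in stripped:  # a second statement is hiding in there
        return False
    return bool(re.match(r"(?is)^(with\b.*?\bselect|select)\b", stripped))
```

A regex check like this is a first line of defense, not a complete one; defense in depth means the database user the skill connects as should itself lack write permissions.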
Same recursive pattern as Karpathy’s: knowledge base grows tools → tools make the knowledge base more useful → team members benefit automatically via Git sync. The only difference is scale — Karpathy’s is a one-person wiki, ShroomDog’s is a team’s dev docs.
The Future: When Your Wiki Grows Into the Model’s Brain
Near the end, Karpathy floated a longer-term idea: once a knowledge base grows large enough, you’d naturally want to do synthetic data generation + finetuning — baking the knowledge into the model’s weights instead of loading it through the context window every time.
Here’s an analogy: the context window is RAM, finetuning is ROM. You run your program in RAM first — fast iteration, easy to modify, quick to verify. Once the knowledge stabilizes, you “burn” it into ROM. This isn’t science fiction; it’s the natural next step in Karpathy’s vision of how knowledge bases evolve.
Clawd murmur:
The RAM vs ROM analogy lands well, but practically speaking, finetuning is still way more expensive and complex than stuffing data into the context window. The real point here isn’t the technical details — it’s the evolutionary arc: your knowledge base goes from “a bunch of files” to “a living system” to potentially “part of the model itself.” External memory → internalized memory. Which, if you think about it, is exactly how human learning works. ヽ(°〇°)ノ
The Ephemeral Wiki: Pushing to the Logical Extreme
In his second tweet of the thread, Karpathy pushed this concept to its most mind-bending boundary.
Imagine: every time you ask a frontier-class LLM a question, instead of just .decode()-ing a response, it dispatches an entire team of LLMs that automatically build a temporary wiki, ingest data, lint it a few rounds, iterate, and finally produce a comprehensive report. He called it an “ephemeral wiki” — not the kind you maintain long-term, but one built on the fly to answer a single question, then discarded.
Sounds wasteful? But think about how humans do research. To understand one topic, you read a pile of papers, take a pile of notes, draw relationship diagrams, and only then write your conclusion. Most of those intermediate artifacts never get looked at again either. Karpathy is just having AI compress a human researcher’s weeks-long workflow into minutes.
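The whole loop can be sketched in a few lines, with `llm` once more standing in for a real model call and every prompt an illustrative assumption:

```python
def ephemeral_wiki_answer(question: str, sources: list[str], llm,
                          lint_rounds: int = 2) -> str:
    """Sketch of the 'ephemeral wiki' idea: compile sources into a
    throwaway wiki, lint it a few rounds, write the report, discard."""
    # 1. compile: one wiki note per source
    wiki = {f"note-{i}": llm(f"Summarize for the wiki: {s}")
            for i, s in enumerate(sources)}
    # 2. lint: iterate until the notes stop contradicting each other
    for _ in range(lint_rounds):
        for name in list(wiki):
            wiki[name] = llm(
                f"Fix contradictions with the rest of the wiki:\n{wiki[name]}"
            )
    # 3. report: answer from inside the structure, then throw it away
    corpus = "\n\n".join(f"## {k}\n{v}" for k, v in wiki.items())
    return llm(f"Using only this wiki, write a report answering: "
               f"{question}\n\n{corpus}")
```

Even this toy version makes the cost structure visible: one question fans out into many model calls, which is exactly the “wasteful but thorough” trade Karpathy is describing.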
Clawd’s parting jab:
“Way beyond a .decode()” — this line is the soul of the entire thread. The way most people use LLMs right now — one prompt in, one reply out — is like using a supercomputer to calculate 1+1. What Karpathy envisions is: before answering a question, first construct an entire knowledge structure, then reason from within that structure. Not faster text generation — deeper understanding. The difference between a calculator and a thinking machine. (⌐■_■)
Closing Thoughts
Karpathy ended with this line:
“I think there is room here for an incredible new product instead of a hacky collection of scripts.”
Everything he demonstrated — compiling knowledge, auto-linting, filing queries back, tools building tools — was cobbled together from existing tools. But assembled together, what emerges is a complete knowledge lifecycle: input → compile → query → grow → lint → grow more.
If “linting knowledge” as a concept didn’t make you pause and think, go back and re-read that section. That’s not note organization. That’s systematically upgrading the quality of human thought. And this entire workflow is currently just a bunch of scripts. Imagine what it looks like as a product.