The LLM Wiki Pattern: Building a Personal AI Knowledge Base
A pattern for building personal knowledge bases using LLMs.
This is an idea file: it is designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.
Most people’s experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There’s no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.
The idea here is different. Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn’t just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.
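The compile-once loop described above can be sketched in a few lines. Everything here is an illustrative assumption, not a prescribed implementation: the `sources/` and `wiki/` directory names, the `.ingested.json` manifest used to detect new files, and the `integrate` callback standing in for a real LLM/agent call are all hypothetical.

```python
import json
from pathlib import Path

def ingest(sources_dir, wiki_dir, integrate):
    """One maintenance pass over the raw sources.

    `integrate(source_text, wiki_pages)` stands in for the LLM call: it
    receives one new source plus the current wiki ({filename: text}) and
    returns the pages to (re)write. A manifest records what has already
    been compiled, so each source is integrated exactly once rather than
    re-derived on every query.
    """
    sources, wiki = Path(sources_dir), Path(wiki_dir)
    wiki.mkdir(parents=True, exist_ok=True)
    manifest = wiki / ".ingested.json"
    seen = set(json.loads(manifest.read_text())) if manifest.exists() else set()

    for src in sorted(sources.glob("*")):
        if not src.is_file() or src.name in seen:
            continue
        pages = {p.name: p.read_text() for p in wiki.glob("*.md")}
        for name, body in integrate(src.read_text(), pages).items():
            (wiki / name).write_text(body)  # only the wiki is rewritten; sources stay read-only
        seen.add(src.name)

    manifest.write_text(json.dumps(sorted(seen)))
```

In practice `integrate` would prompt your agent with the schema file and let it decide which entity and topic pages to touch; the loop only enforces the shape of the pattern: sources in, wiki updated, nothing recomputed twice.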
Why this works
The tedious part of maintaining a knowledge base is not the reading or the thinking — it’s the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages. Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don’t get bored, don’t forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.
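The bookkeeping is also the part that is easy to verify mechanically. As a minimal sketch (the `[[wikilink]]` syntax and the flat folder of `.md` files are assumptions about how your wiki is laid out), here is a checker for the kind of drift humans let accumulate: links pointing at pages that no longer exist.

```python
import re
from pathlib import Path

# Capture the target of [[Page]], [[Page|label]], or [[Page#Section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def broken_links(wiki_dir):
    """Return {page: [missing targets]} for wikilinks whose target
    page has no corresponding .md file in the wiki folder."""
    wiki = Path(wiki_dir)
    pages = {p.stem for p in wiki.glob("*.md")}
    report = {}
    for p in sorted(wiki.glob("*.md")):
        missing = [t.strip() for t in WIKILINK.findall(p.read_text())
                   if t.strip() not in pages]
        if missing:
            report[p.name] = missing
    return report
```

Running a check like this after every maintenance pass is one way to keep the LLM honest about the cross-references it claims to have updated.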
The human’s job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM’s job is everything else.
The three-layer architecture
- Raw sources: your source files (PDFs, web scrapes). This is the read-only source of truth.
- The wiki: a folder of Markdown files that the LLM generates and maintains. You only read it; the LLM rewrites it, keeps the format consistent, and adds links.
- The schema: a rules document that tells the LLM how to maintain the wiki (e.g. a CLAUDE.md or system prompt).
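To make the three layers concrete, here is one possible layout plus a schema excerpt. Every name and rule below is an illustrative assumption, not a prescribed format — the point of the pattern is that you and your agent evolve this file together:

```
notes/
├── sources/     # raw sources: read-only PDFs, web scrapes, exports
├── wiki/        # LLM-generated and LLM-maintained markdown pages
│   └── index.md
└── CLAUDE.md    # the schema: rules the agent follows

# CLAUDE.md (excerpt)
- Treat sources/ as read-only ground truth; never edit it.
- Every wiki page cites the source files its claims come from.
- When a new source contradicts an existing claim, keep both and flag the conflict.
- After any edit, update index.md and fix every cross-reference you touched.
```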
“Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.”
If you are looking for the purest, most hardcore first-principles take on working with LLMs, this is the current gold standard.
🎓 What you can learn from it:
- Ground-up LLM intuition: Andrej Karpathy excels at turning complex token prediction into mental imagery anyone can grasp, helping you build the right mental model.
- Automated knowledge architecture: how to use an LLM as your "digital librarian" to automatically classify, index, and retrieve the messy literature you have accumulated over the years.
- Minimalist writing and thinking: how a top engineer filters out noise and keeps only the high-value core of his technical notes.
- A learning path for what comes next: the article lays out the shortest route to understanding modern AI from scratch, without getting lost in useless jargon.
Whether you want to build AI applications or simply improve your personal knowledge base, Karpathy's experience is a solid foundation.