Building Scalable AI Agent Knowledge Bases: Addressing Accumulation and Structure with akm's Multi-Wiki Support

As AI coding agents become more sophisticated, the volume of skills, scripts, and contextual information they rely on continues to grow. When exploring a new research area, such as LLM inference optimization, developers typically read papers, take notes, and save PDFs. Their agents then summarize, discover connections, and ask follow-up questions that generate even more notes. Weeks later, one might find a disorganized collection of markdown files, partially digested papers in a downloads folder, and a vague memory of "KV cache quantization being covered somewhere." A search yields multiple files with incomplete takes on the same topic, none providing the precise answer needed.

This isn't a storage problem; all the information exists. Instead, it's a structural issue—and akm's multi-wiki support is specifically designed to address this.

Karpathy's LLM Wiki Pattern

In a GitHub gist, Andrej Karpathy outlined a pattern for maintaining a markdown knowledge base collaboratively built and managed by humans and LLMs. The core idea is deceptively simple: a structured directory of markdown pages, with an agent responsible for synthesizing incoming information into these pages over time.

The key insight is that agents excel at tasks humans find tedious. This includes summarizing a 40-page paper into a two-paragraph page entry, identifying connections between a new paper and existing knowledge, logging additions and their rationale, and updating existing pages when new information contradicts or extends them.

However, agents are not proficient at maintaining invariants. An agent won't reliably enforce slug uniqueness, consistently regenerate an index, or avoid overwriting immutable source files. It also tends to lose track of ingested data when sessions end.

Therefore, a clear division of labor is essential: the agent handles information synthesis, while tooling enforces the necessary invariants. This is precisely the gap that akm's wiki support fills.
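To make the division of labor concrete, here is a minimal sketch of what "tooling enforces the invariants" can look like: a small script that enforces slug uniqueness across a markdown wiki and regenerates the index deterministically. This is an illustrative sketch only; the directory layout, the `check_and_index` helper, and the specific checks are assumptions, not akm's actual implementation.

```python
from pathlib import Path


def check_and_index(wiki_root: str) -> list[str]:
    """Enforce slug uniqueness and regenerate index.md for a markdown wiki.

    A hypothetical sketch of invariant-enforcing tooling; akm's actual
    checks and file layout may differ.
    """
    root = Path(wiki_root)
    # Collect every wiki page; the generated index is never treated as a page.
    pages = sorted(p for p in root.rglob("*.md") if p.name != "index.md")

    # Invariant 1: page slugs (case-insensitive filename stems) are unique.
    seen: dict[str, Path] = {}
    for page in pages:
        slug = page.stem.lower()
        if slug in seen:
            raise ValueError(f"duplicate slug '{slug}': {seen[slug]} and {page}")
        seen[slug] = page

    # Invariant 2: the index is always regenerated, never hand-edited,
    # so it cannot drift out of sync with the pages on disk.
    lines = ["# Index", ""]
    lines += [f"- [{slug}]({path.relative_to(root)})"
              for slug, path in sorted(seen.items())]
    (root / "index.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return sorted(seen)
```

Run after every agent session, a check like this catches exactly the failures agents make silently: a duplicate slug aborts with an error instead of shadowing an existing page, and the index never needs to be in the agent's context at all.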

The Problem: Knowledge Accumulates and Gets Lost

Most developers encounter this problem in one of two ways:

The first involves importing files as knowledge assets (e.g., akm import ./paper.md), which the agent then reads directly. This works for one or two documents. At ten documents, the agent loads full text into every relevant session. By thirty documents, tokens are wasted on unnecessary context, and the agent attempts to mentally synthesize relationships across thirty separate files per session. The result is a disconnected and unindexed knowledge base.

The second approach is to instruct the agent to "take notes" in a single notes.md file. This file grows into an unstructured wall of text, making search reliant on simple grep commands. Updating it requires the agent to read the entire file before making any additions.
