Lesson 7: Architecture and Data Flow of claude-mem

⏱ Est. reading time: 3 min Updated on 5/7/2026

7.1 Automated Capture Workflow

claude-mem uses Claude Code's Hook mechanism to record sessions automatically in the background:

  1. SessionStart: Injects relevant historical context at the beginning of a session.
  2. PostToolUse: The core stage. Every time Claude Code executes a tool (e.g., Edit or Bash), the Hook passes the operation context to the background Worker.
  3. Distillation: The Worker calls the LLM to "distill" the raw interaction into a structured Observation record (including type, summary, etc.).
  4. Storage: The record is written to the local SQLite database and a vector embedding is stored in Chroma.
  5. Stop: Generates a summary report of the entire session when it concludes.
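
The PostToolUse, Distillation, and Storage steps above can be sketched roughly as follows. This is a minimal illustration, not claude-mem's actual code: the Observation fields beyond type and summary, the table schema, and the function names distill, store, and on_post_tool_use are all assumptions, and distill is a local stand-in for the Worker's LLM call. The Chroma embedding step is omitted to keep the sketch dependency-free.

```python
import json
import sqlite3
from dataclasses import dataclass


@dataclass
class Observation:
    session_id: str
    type: str
    summary: str


def distill(tool_event: dict) -> Observation:
    """Stand-in for the Worker's LLM call (hypothetical): no network access here."""
    tool = tool_event["tool_name"]
    target = tool_event.get("tool_input", {}).get("file_path", "unknown target")
    return Observation(
        session_id=tool_event["session_id"],
        type="change" if tool == "Edit" else "activity",
        summary=f"{tool} on {target}",
    )


def store(db: sqlite3.Connection, obs: Observation) -> int:
    """Persist one Observation in SQLite and return its row ID."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "id INTEGER PRIMARY KEY, session_id TEXT, type TEXT, summary TEXT)"
    )
    cur = db.execute(
        "INSERT INTO observations (session_id, type, summary) VALUES (?, ?, ?)",
        (obs.session_id, obs.type, obs.summary),
    )
    db.commit()
    return cur.lastrowid


def on_post_tool_use(db: sqlite3.Connection, raw_payload: str) -> int:
    """What a PostToolUse handler might do: parse the payload, distill, store."""
    event = json.loads(raw_payload)
    return store(db, distill(event))
```

Calling on_post_tool_use with an in-memory database (sqlite3.connect(":memory:")) and a JSON payload containing session_id, tool_name, and tool_input is enough to try the flow end to end.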

7.2 Data Flow During Retrieval

When the LLM needs to query history, the data flows as follows:

  • Initiate Query: The LLM calls the search MCP tool.
  • Hybrid Retrieval: The system runs full-text matching in SQLite and semantic vector search in Chroma at the same time.
  • Return Results: The search returns a compact list of matching observation IDs rather than full records.
  • Deepen Context: The LLM further calls timeline or get_observations to retrieve full details as needed.
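
The retrieval flow above can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the in-memory OBSERVATIONS table, the toy three-dimensional vectors, and especially the use of reciprocal rank fusion to merge the two rankings (the source does not say how claude-mem combines full-text and vector results). The function names search and get_observations mirror the MCP tool names only loosely.

```python
import math

# Toy data: in a real setup the text would live in SQLite and the vectors in Chroma.
OBSERVATIONS = {
    1: {"summary": "Fixed login token refresh bug", "vector": [0.9, 0.1, 0.0]},
    2: {"summary": "Added database migration for users table", "vector": [0.1, 0.9, 0.1]},
    3: {"summary": "Refactored auth middleware", "vector": [0.8, 0.2, 0.1]},
}


def keyword_rank(query: str) -> list:
    """Rank IDs by how many query words occur in the summary (stand-in for full-text search)."""
    words = query.lower().split()
    scored = []
    for obs_id, obs in OBSERVATIONS.items():
        hits = sum(word in obs["summary"].lower() for word in words)
        if hits:
            scored.append((hits, obs_id))
    return [obs_id for hits, obs_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def vector_rank(query_vector: list, limit: int = 2) -> list:
    """Rank IDs by cosine similarity to the query vector (stand-in for a vector store query)."""
    ranked = sorted(OBSERVATIONS, key=lambda i: -cosine(query_vector, OBSERVATIONS[i]["vector"]))
    return ranked[:limit]


def search(query: str, query_vector: list, k: int = 60) -> list:
    """Fuse both rankings with reciprocal rank fusion and return IDs only."""
    fused = {}
    for ranking in (keyword_rank(query), vector_rank(query_vector)):
        for rank, obs_id in enumerate(ranking, start=1):
            fused[obs_id] = fused.get(obs_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda i: (-fused[i], i))


def get_observations(ids: list) -> list:
    """Second step: fetch full details only for the IDs the caller selects."""
    return [OBSERVATIONS[i]["summary"] for i in ids]
```

Returning only IDs from search keeps the first response small; the caller then pays for full details only on the records it actually wants, which is the point of the "Deepen Context" step.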

7.3 Three-Layer Storage Abstraction

To balance performance and functionality, claude-mem organizes its data into three layers:

  • Raw Layer: The original .jsonl session logs generated by Claude Code (located in ~/.claude/projects/*/sessions/).
  • Distillation Layer: The Observation records distilled by claude-mem, stored in SQLite and Chroma.
  • Usage Layer: The point of consumption, where Observations are either injected into the System Prompt via Hooks or read by the LLM via MCP tools.
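
As a rough illustration of how data moves through the three layers, the sketch below reads raw JSONL events, reduces them to short records, and renders those records as a text block that could be injected into a prompt. The event fields (type, tool, file), the record shape, and the "Relevant history:" format are invented for this example and do not reflect the real log schema; the reduction step stands in for LLM distillation.

```python
import json

# Raw Layer: a hypothetical two-line .jsonl session log.
RAW_JSONL = "\n".join([
    json.dumps({"type": "user", "text": "Please fix the failing test"}),
    json.dumps({"type": "tool_use", "tool": "Edit", "file": "test_app.py"}),
])


def read_raw_layer(jsonl_text: str) -> list:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def to_distillation_layer(events: list) -> list:
    """Keep only tool events and reduce each one to a short record."""
    return [
        {"type": "change", "summary": f"{e['tool']} touched {e['file']}"}
        for e in events
        if e["type"] == "tool_use"
    ]


def to_usage_layer(observations: list) -> str:
    """Render records as a compact text block suitable for prompt injection."""
    lines = ["Relevant history:"]
    lines += [f"- [{o['type']}] {o['summary']}" for o in observations]
    return "\n".join(lines)


context = to_usage_layer(to_distillation_layer(read_raw_layer(RAW_JSONL)))
```

Each layer is smaller than the one before it, which is what makes injecting history into a session affordable.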

7.4 Data Privacy and Compliance

Important Note: Although the data is stored locally, the "distillation" step calls a remote LLM API. This means:

  • Context from every tool call is sent to Anthropic for summarization.
  • For sensitive projects, you should disable this plugin in .claude/settings.local.json so that tool-call context stays on your machine.
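
One way to apply the per-project opt-out is a small helper that edits the settings file. This sketch assumes that Claude Code reads an enabledPlugins map of plugin identifiers to booleans from its settings files; verify that key against the current Claude Code settings documentation before relying on it. The plugin identifier in the usage note is a placeholder, not the real ID of claude-mem.

```python
import json
from pathlib import Path


def disable_plugin(settings_path: Path, plugin_id: str) -> dict:
    """Set enabledPlugins[plugin_id] to False (assumed key), keeping other settings intact."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    settings.setdefault("enabledPlugins", {})[plugin_id] = False
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

For example, disable_plugin(Path(".claude/settings.local.json"), "example-plugin@example-marketplace") would be run from the root of the sensitive project, with the placeholder replaced by the identifier shown for the plugin in your own installation.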