Episode 3: Inside Claude-Mem's Brain — Data Storage Architecture

⏱ Est. reading time: 14 min Updated on 5/7/2026

This episode's scenario: You've been using Claude Code on the blog project for two sessions. The Web UI shows a bunch of Observations. Questions arise: Where is this data stored? What does it look like? Why two databases?


3.1 Physical Storage Location

All Claude-Mem data lives in a single directory on your local drive:

~/.claude-mem/
├── claude-mem.db          # SQLite database (core)
├── settings.json          # Configuration
├── chroma/                # ChromaDB vector database
│   └── ...
└── logs/                  # Runtime logs
    └── worker.log

Claude-Mem keeps these files on your machine and does not sync them to a cloud service. Your development memory belongs entirely to you.
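To check what is actually on disk, a short script can walk the layout shown above. This is a minimal sketch: the paths come from the tree in this section, and the helper name `inspect_data_dir` is made up for illustration.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED = ["claude-mem.db", "settings.json", "chroma", "logs/worker.log"]


def inspect_data_dir(root: Path) -> dict[str, bool]:
    """Map each expected relative path to whether it exists under root."""
    return {rel: (root / rel).exists() for rel in EXPECTED}


if __name__ == "__main__":
    root = Path.home() / ".claude-mem"
    for rel, present in inspect_data_dir(root).items():
        status = "ok" if present else "missing"
        print(f"{status:8} {root / rel}")
```

A "missing" entry is not necessarily an error; some files may only appear after the first session has run.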


3.2 SQLite — The Structured Storage Engine

Why SQLite?

| Feature | Benefit |
| --- | --- |
| Zero config | No database server to install |
| Single file | Entire DB is one .db file; copy it to back up |
| WAL mode | Write-Ahead Logging lets Hooks and the Worker keep reading while a write is in progress |
| Built-in FTS5 | Full-text search engine for fast keyword lookups |
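One caveat on "copy it to back up": in WAL mode, recent commits can still sit in a companion `-wal` file, so copying only the `.db` file while the Worker is running may miss them. The sketch below uses Python's standard `sqlite3` module on a throwaway database (not the real one) to show WAL being enabled and a consistent copy being taken with SQLite's online backup API. The file and table names are made up for the demo.

```python
import sqlite3
import tempfile
from pathlib import Path


def enable_wal(db_path: Path) -> str:
    """Switch the database file to WAL mode and return the active journal mode."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    finally:
        conn.close()


def backup_database(src_path: Path, dst_path: Path) -> None:
    """Copy a database with SQLite's online backup API (safe while it is in use)."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def demo() -> tuple[str, int]:
    """Create a WAL database, back it up while a writer is open, count copied rows."""
    with tempfile.TemporaryDirectory() as tmp:
        live = Path(tmp) / "live.db"
        copy = Path(tmp) / "copy.db"
        mode = enable_wal(live)
        writer = sqlite3.connect(live)
        writer.execute("CREATE TABLE notes (body TEXT)")
        writer.execute("INSERT INTO notes VALUES ('hello')")
        writer.commit()
        backup_database(live, copy)  # the writer connection is still open here
        writer.close()
        check = sqlite3.connect(copy)
        count = check.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
        check.close()
        return mode, count


if __name__ == "__main__":
    print(demo())
```

To back up the real database the same way, point `backup_database` at `~/.claude-mem/claude-mem.db` and a destination of your choice.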

Four Core Tables

Table 1: sdk_sessions — Session Records

A new row is created each time you start a Claude Code session.

SELECT sdk_session_id, project, status, created_at
FROM sdk_sessions
ORDER BY created_at DESC LIMIT 5;
┌──────────────────┬─────────────┬───────────┬─────────────────────┐
│ sdk_session_id   │ project     │ status    │ created_at          │
├──────────────────┼─────────────┼───────────┼─────────────────────┤
│ sess_abc123      │ my-blog     │ completed │ 2026-04-21 10:30:00 │
│ sess_def456      │ my-blog     │ completed │ 2026-04-20 14:00:00 │
│ sess_ghi789      │ other-proj  │ completed │ 2026-04-19 09:15:00 │
└──────────────────┴─────────────┴───────────┴─────────────────────┘
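The same lookup can be run from Python with a parameterized query. The sketch below works against an in-memory stand-in whose columns are only the ones named in the query above; the real table may have more columns or different types, and `recent_sessions` is a hypothetical helper.

```python
import sqlite3


def recent_sessions(conn: sqlite3.Connection, project: str, limit: int = 5):
    """Return (sdk_session_id, status, created_at) rows for one project, newest first."""
    return conn.execute(
        "SELECT sdk_session_id, status, created_at FROM sdk_sessions "
        "WHERE project = ? ORDER BY created_at DESC LIMIT ?",
        (project, limit),
    ).fetchall()


# In-memory stand-in for the real table (assumed schema, demo only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sdk_sessions ("
    "sdk_session_id TEXT PRIMARY KEY, project TEXT, status TEXT, "
    "created_at TEXT, completed_at TEXT)"
)
conn.executemany(
    "INSERT INTO sdk_sessions VALUES (?, ?, ?, ?, NULL)",
    [
        ("sess_abc123", "my-blog", "completed", "2026-04-21 10:30:00"),
        ("sess_def456", "my-blog", "completed", "2026-04-20 14:00:00"),
        ("sess_ghi789", "other-proj", "completed", "2026-04-19 09:15:00"),
    ],
)

rows = recent_sessions(conn, "my-blog")
for row in rows:
    print(row)
```

Passing the project name as a `?` parameter rather than formatting it into the SQL string avoids quoting problems with unusual project names.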

Table 2: observations — The Heart of Claude-Mem

Every time Claude executes a tool (read file, write file, run command), the Worker generates an Observation.

SELECT id, title, type, tool_name
FROM observations WHERE session_id = 'sess_abc123'
ORDER BY prompt_number;
┌────┬──────────────────────────────┬──────────┬────────────┐
│ id │ title                        │ type     │ tool_name  │
├────┼──────────────────────────────┼──────────┼────────────┤
│ 1  │ Read prisma schema file      │ discovery│ Read       │
│ 2  │ Add Comment model            │ feature  │ Write      │
│ 3  │ Run prisma migrate           │ change   │ Bash       │
│ 4  │ Fix foreign key error        │ bugfix   │ Write      │
│ 5  │ Choose cascade delete policy │ decision │ Write      │
└────┴──────────────────────────────┴──────────┴────────────┘

Each Observation contains rich fields:

| Field | Meaning | Example |
| --- | --- | --- |
| title | One-line summary | "Fix JWT refresh logic" |
| narrative | Detailed account (third person) | "The developer discovered the refreshToken function was missing..." |
| facts | Extracted facts list | ["Uses jsonwebtoken library", "Token TTL is 7 days"] |
| concepts | Related concepts | ["JWT", "authentication", "middleware"] |
| type | Category (6 types) | "bugfix" |
| tool_name | Triggering tool | "Write" |
| files_read | Files read | ["src/auth/jwt.ts"] |
| files_modified | Files changed | ["src/auth/jwt.ts"] |
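One way to picture an Observation in code is a small record type whose list-valued fields are serialized to JSON text before they go into a SQLite column. The field names below come from the table above; storing lists as JSON strings is an assumption made for this sketch, not a confirmed detail of Claude-Mem's schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Observation:
    title: str
    narrative: str
    type: str
    tool_name: str
    facts: list[str] = field(default_factory=list)
    concepts: list[str] = field(default_factory=list)
    files_read: list[str] = field(default_factory=list)
    files_modified: list[str] = field(default_factory=list)


# Fields that hold lists and therefore need encoding for a TEXT column (assumed).
LIST_FIELDS = ("facts", "concepts", "files_read", "files_modified")


def to_row(obs: Observation) -> dict:
    """Flatten an Observation into a dict of scalars suitable for SQLite."""
    row = asdict(obs)
    for name in LIST_FIELDS:
        row[name] = json.dumps(row[name])
    return row


def from_row(row: dict) -> Observation:
    """Rebuild an Observation from a row produced by to_row."""
    data = dict(row)
    for name in LIST_FIELDS:
        data[name] = json.loads(data[name])
    return Observation(**data)


example = Observation(
    title="Fix JWT refresh logic",
    narrative="The developer discovered the refreshToken function was missing...",
    type="bugfix",
    tool_name="Write",
    facts=["Uses jsonwebtoken library", "Token TTL is 7 days"],
    concepts=["JWT", "authentication", "middleware"],
    files_read=["src/auth/jwt.ts"],
    files_modified=["src/auth/jwt.ts"],
)
print(to_row(example)["facts"])
```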

Table 3: user_prompts — Your Inputs

This table records every message you send to Claude within each session.

SELECT prompt_number, content FROM user_prompts
WHERE session_id = 'sess_abc123' ORDER BY prompt_number;
┌───────────────┬──────────────────────────────────────┐
│ prompt_number │ content                              │
├───────────────┼──────────────────────────────────────┤
│ 1             │ Add a comment feature to the blog    │
│ 2             │ Comments should support Markdown     │
│ 3             │ Fix that foreign key error            │
└───────────────┴──────────────────────────────────────┘
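Because the earlier observations query filters on session_id and orders by prompt_number, prompts and observations can plausibly be lined up with a join on those two columns. The sketch below does that on an in-memory stand-in. The assignment of observations to prompts is invented for the demo, and the table definitions contain only the columns this episode mentions.

```python
import sqlite3


def observations_by_prompt(conn: sqlite3.Connection, session_id: str) -> dict:
    """Group observation titles under the prompt that triggered them."""
    rows = conn.execute(
        "SELECT p.prompt_number, p.content, o.title "
        "FROM user_prompts AS p "
        "LEFT JOIN observations AS o "
        "  ON o.session_id = p.session_id AND o.prompt_number = p.prompt_number "
        "WHERE p.session_id = ? "
        "ORDER BY p.prompt_number, o.id",
        (session_id,),
    ).fetchall()
    grouped: dict = {}
    for number, content, title in rows:
        entry = grouped.setdefault(number, {"prompt": content, "titles": []})
        if title is not None:  # LEFT JOIN yields NULL for prompts with no observations
            entry["titles"].append(title)
    return grouped


# Stand-in tables (assumed, minimal schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_prompts (session_id TEXT, prompt_number INTEGER, content TEXT);
    CREATE TABLE observations (
        id INTEGER PRIMARY KEY, session_id TEXT, prompt_number INTEGER, title TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO user_prompts VALUES ('sess_abc123', ?, ?)",
    [
        (1, "Add a comment feature to the blog"),
        (2, "Comments should support Markdown"),
        (3, "Fix that foreign key error"),
    ],
)
conn.executemany(
    "INSERT INTO observations (session_id, prompt_number, title) "
    "VALUES ('sess_abc123', ?, ?)",
    [
        (1, "Read prisma schema file"),
        (1, "Add Comment model"),
        (1, "Run prisma migrate"),
        (3, "Fix foreign key error"),
        (3, "Choose cascade delete policy"),
    ],
)

grouped = observations_by_prompt(conn, "sess_abc123")
for number, entry in grouped.items():
    print(number, entry["prompt"], entry["titles"])
```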

Table 4: session_summaries — Session Wrap-Up

When a session ends, Claude-Mem auto-generates a structured summary:

{
  "request": "Add comment system to blog",
  "investigated": ["Prisma relation model syntax", "Cascade delete strategies"],
  "completed": ["Created Comment model", "Implemented comment API", "Fixed FK constraint"],
  "next_steps": ["Build comment UI", "Implement comment notifications"]
}

Notice the next_steps field: it is automatically injected into Claude's context at the start of the next session, so the AI knows what you left unfinished.
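To make the carry-over idea concrete, here is a rough sketch that turns a summary like the one above into a short text block that could be prepended to a new session. The exact text Claude-Mem injects is not shown in this episode, so the wording and the `render_carryover` helper are purely illustrative.

```python
import json

SUMMARY = """
{
  "request": "Add comment system to blog",
  "investigated": ["Prisma relation model syntax", "Cascade delete strategies"],
  "completed": ["Created Comment model", "Implemented comment API", "Fixed FK constraint"],
  "next_steps": ["Build comment UI", "Implement comment notifications"]
}
"""


def render_carryover(summary_json: str) -> str:
    """Format a session summary as plain text for the next session's context."""
    summary = json.loads(summary_json)
    lines = [f"Previous request: {summary['request']}"]
    steps = summary.get("next_steps", [])
    if steps:
        lines.append("Unfinished next steps:")
        lines.extend(f"- {step}" for step in steps)
    return "\n".join(lines)


print(render_carryover(SUMMARY))
```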


3.3 ChromaDB — The Semantic Vector Search Engine

SQLite already has FTS5 full-text search. Why also ChromaDB?

Because keyword search and semantic search are fundamentally different capabilities:

| Search type | Engine | When searching for "login bug"... |
| --- | --- | --- |
| Keyword search | SQLite FTS5 | ✅ Matches records containing "login" and "bug"; ❌ Misses "authentication error" records |
| Semantic search | ChromaDB | ✅ Matches "login bug"; ✅ Also matches "authentication error" (semantically similar); ✅ Even matches "JWT token expiration issue" |
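The keyword-search row can be reproduced with a few lines of Python. This sketch builds a throwaway FTS5 table (the table name `notes` and the three titles are made up) and assumes your Python's bundled SQLite was compiled with FTS5, which is the case for most standard builds.

```python
import sqlite3

TITLES = [
    "Fixed login bug in session handler",
    "Resolved authentication error on sign-in",
    "JWT token expiration issue",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE notes USING fts5(title)")
conn.executemany("INSERT INTO notes (title) VALUES (?)", [(t,) for t in TITLES])


def keyword_search(query: str) -> list[str]:
    """Return titles matching every term in the query, best match first."""
    rows = conn.execute(
        "SELECT title FROM notes WHERE notes MATCH ? ORDER BY rank", (query,)
    )
    return [title for (title,) in rows]


hits = keyword_search("login bug")
print(hits)
```

Only the first title comes back: FTS5 treats the two words as terms that must both appear, so the records about authentication errors and JWT expiration are invisible to this query even though they describe related problems.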

How ChromaDB Works

graph LR
    A["New Observation:<br/>Fix JWT refresh logic"] --> B["Vectorize<br/>(text → number vector)"]
    B --> C["Store in ChromaDB<br/>[0.23, -0.41, 0.87, ...]"]
    D["User search:<br/>login issue"] --> E["Vectorize<br/>(query → number vector)"]
    E --> F["Vector similarity calculation"]
    C --> F
    F --> G["Return closest matches"]
    style A fill:#f59e0b,color:#000
    style D fill:#6366f1,color:#fff
    style G fill:#10b981,color:#fff
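The "vector similarity calculation" step is often done with cosine similarity. The sketch below shows the arithmetic on hand-written three-dimensional vectors. Real embeddings come from a model and have hundreds of dimensions, so the numbers here are invented purely to illustrate the ranking step, and this is not ChromaDB's actual code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vec: list[float], store: dict, k: int = 2) -> list[str]:
    """Return the k stored titles whose vectors are most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]


# Made-up vectors: the first two point in a similar "auth" direction.
STORE = {
    "Fix JWT refresh logic": [0.9, 0.1, 0.0],
    "Resolve authentication error": [0.8, 0.3, 0.1],
    "Update .env config file": [0.0, 0.2, 0.9],
}
QUERY = [0.85, 0.2, 0.05]  # stands in for the embedded query "login issue"

print(nearest(QUERY, STORE))
```

The two auth-related titles rank far above the config change even though neither contains the word "login", which is the behavior the diagram describes.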

In simple terms:

  • SQLite FTS5 = Exact matching (like Ctrl+F)
  • ChromaDB = Meaning-aware matching (like a colleague who understands what you're asking)

Together, they form Claude-Mem's Hybrid Search, so you can find relevant history even when your query uses different words than the original record.
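This episode does not say how Claude-Mem combines the two result lists. One common way to merge rankings from different engines is reciprocal rank fusion (RRF), sketched below as an illustration of the general idea rather than a description of Claude-Mem's implementation. The observation IDs are placeholders, and `k=60` is the constant commonly used with RRF.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked ID lists; items ranked high in any list score well."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + position)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["obs4", "obs9"]           # e.g. from FTS5
semantic_hits = ["obs7", "obs4", "obs2"]  # e.g. from a vector store

merged = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(merged)
```

An item found by both engines (`obs4` here) accumulates score from each list, so it rises to the top even if neither engine ranked it first.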


3.4 The 6 Observation Types

The Worker automatically classifies each Observation into one of 6 types:

| Type | Meaning | Blog Project Example |
| --- | --- | --- |
| decision | Architecture/design choice | "Chose Prisma over TypeORM" |
| bugfix | Fixed a bug | "Fixed 500 error on duplicate article slug" |
| feature | Built a new feature | "Added article tagging system" |
| refactor | Refactored code | "Migrated routes from pages/ to app/" |
| discovery | Learned something new | "Prisma's findMany doesn't include relations by default" |
| change | General modification | "Updated .env config file" |

These type labels enable precise filtering:

search(query="database", type="decision")
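To show what the type filter does conceptually, here is a toy stand-in for `search` that runs over an in-memory list. It mirrors the `query` and `type` keyword arguments from the call above, but the matching logic (a case-insensitive substring test on the title) is a simplification chosen for the sketch, and the extra `records` parameter is not part of the real tool.

```python
OBSERVATION_TYPES = {"decision", "bugfix", "feature", "refactor", "discovery", "change"}

RECORDS = [
    {"type": "decision", "title": "Chose Prisma over TypeORM"},
    {"type": "bugfix", "title": "Fixed 500 error on duplicate article slug"},
    {"type": "feature", "title": "Added article tagging system"},
    {"type": "refactor", "title": "Migrated routes from pages/ to app/"},
    {"type": "discovery", "title": "Prisma's findMany doesn't include relations by default"},
    {"type": "change", "title": "Updated .env config file"},
]


def search(records, query, type=None):
    """Filter records by substring match, optionally restricted to one type.

    The parameter is named `type` to mirror the call shown above, which
    shadows Python's built-in inside this function only.
    """
    if type is not None and type not in OBSERVATION_TYPES:
        raise ValueError(f"unknown observation type: {type!r}")
    needle = query.lower()
    return [
        r for r in records
        if needle in r["title"].lower() and (type is None or r["type"] == type)
    ]


print(search(RECORDS, query="prisma"))
print(search(RECORDS, query="prisma", type="decision"))
```

Without the filter, "prisma" returns both the decision and the discovery; with `type="decision"` only the design choice remains.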

3.5 Complete Data Flow

graph TB
    A["Claude executes a tool<br/>(read/write file, run command)"] --> B["PostToolUse Hook fires"]
    B --> C["Worker receives raw tool output"]
    C --> D["Claude Agent SDK<br/>compresses & distills"]
    D --> E["Generates structured Observation<br/>{title, narrative, facts,<br/>concepts, type}"]
    E --> F["SQLite"]
    E --> G["ChromaDB"]
    F --> F1["INSERT into observations table"]
    F --> F2["Update FTS5 index"]
    G --> G1["Text → vector embedding"]
    G --> G2["Store in vector collection"]
    H["Next search"] --> I{"Search type?"}
    I -->|"Exact keyword match"| F
    I -->|"Semantic fuzzy match"| G
    I -->|"Hybrid search (default)"| J["FTS5 + vector<br/>results merged & ranked"]
    style A fill:#6366f1,color:#fff
    style F fill:#10b981,color:#fff
    style G fill:#10b981,color:#fff
    style J fill:#f59e0b,color:#000
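The write half of this diagram (one Observation fanning out to a relational row, a full-text index, and a vector store) can be sketched in a few dozen lines. Everything here is a stand-in: the table names, the plain dict that plays the role of the vector collection, and especially `toy_embed`, which just counts letters and is nothing like a real embedding model. It also assumes FTS5 is available in your Python's SQLite build.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a row table and a matching FTS5 index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations ("
        "id INTEGER PRIMARY KEY, title TEXT, narrative TEXT, type TEXT)"
    )
    conn.execute("CREATE VIRTUAL TABLE observations_fts USING fts5(title, narrative)")
    return conn


def toy_embed(text: str) -> list[int]:
    """Placeholder embedding: a 26-dimensional letter-frequency vector."""
    counts = [0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1
    return counts


def store_observation(conn, vectors: dict, obs: dict, embed=toy_embed) -> int:
    """Write one observation to the row table, the FTS index, and the vector store."""
    with conn:  # a single transaction covers the row and its FTS entry
        cur = conn.execute(
            "INSERT INTO observations (title, narrative, type) VALUES (?, ?, ?)",
            (obs["title"], obs["narrative"], obs["type"]),
        )
        obs_id = cur.lastrowid
        conn.execute(
            "INSERT INTO observations_fts (rowid, title, narrative) VALUES (?, ?, ?)",
            (obs_id, obs["title"], obs["narrative"]),
        )
    vectors[obs_id] = embed(obs["title"] + " " + obs["narrative"])
    return obs_id


conn = open_store()
vectors: dict = {}
new_id = store_observation(
    conn,
    vectors,
    {
        "title": "Fix JWT refresh logic",
        "narrative": "The developer discovered the refreshToken function was missing...",
        "type": "bugfix",
    },
)
print(new_id, len(vectors[new_id]))
```

Sharing one ID across all three stores is what lets a later search look up the full row no matter which engine found the match.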

Hands-On Exercise

  1. Work on the blog project in Claude Code for one session (e.g., create the database schema)
  2. After the session, query the database directly:
     sqlite3 ~/.claude-mem/claude-mem.db
     SELECT COUNT(*) FROM sdk_sessions;
     SELECT id, title, type FROM observations ORDER BY created_at DESC LIMIT 10;
     SELECT request, completed, next_steps FROM session_summaries LIMIT 1;
     .quit
  3. Open http://localhost:37777 and compare the Web UI with your SQL results; they should match.
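If you would rather explore from Python, a cautious first step is to open the database read-only and list its tables before assuming any schema. The sketch below uses only the standard library. The path matches the one in this episode, `list_tables` is a made-up helper, and the function returns an empty list when the file does not exist.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: Path) -> list[str]:
    """Return table names from a SQLite file, opened read-only."""
    if not db_path.exists():
        return []
    # A file: URI with mode=ro makes SQLite refuse any writes on this connection.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    for name in list_tables(Path.home() / ".claude-mem" / "claude-mem.db"):
        print(name)
```

Opening read-only means a typo in an exploratory query cannot modify the data the Worker depends on.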

Coming Up Next

Data is stored, but how does it get captured "automatically"? Next episode, we reveal Claude-Mem's 5 lifecycle hooks — the invisible secretary that "takes notes behind your back." We'll trace every step from keypress to database insert with a full sequence diagram.

➡️ Episode 4: Lifecycle Hooks — How Claude-Mem Takes Notes Behind Your Back