Episode 3: Inside Claude-Mem's Brain — Data Storage Architecture
This episode's scenario: You've been using Claude Code on the blog project for two sessions. The Web UI shows a bunch of Observations. Questions arise: Where is this data stored? What does it look like? Why two databases?
3.1 Physical Storage Location
All Claude-Mem data lives in a single directory on your local drive:
~/.claude-mem/
├── claude-mem.db # SQLite database (core)
├── settings.json # Configuration
├── chroma/ # ChromaDB vector database
│ └── ...
└── logs/ # Runtime logs
└── worker.log
No data is ever uploaded to the cloud. Your development memory belongs entirely to you.
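To check this layout on your own machine, a small script can build the expected paths and report which ones exist. This is a minimal sketch: the helper names `claude_mem_layout` and `report` are made up for illustration, and the paths are taken only from the directory tree above.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional


def claude_mem_layout(home: Optional[Path] = None) -> dict[str, Path]:
    """Return the locations shown in the directory tree above."""
    root = (home or Path.home()) / ".claude-mem"
    return {
        "root": root,
        "database": root / "claude-mem.db",
        "settings": root / "settings.json",
        "chroma": root / "chroma",
        "worker_log": root / "logs" / "worker.log",
    }


def report(layout: dict[str, Path]) -> dict[str, bool]:
    """Map each name to whether that path currently exists on disk."""
    return {name: path.exists() for name, path in layout.items()}


if __name__ == "__main__":
    for name, present in report(claude_mem_layout()).items():
        print(f"{name:<11} {'found' if present else 'missing'}")
```

The script only reads file metadata, so it is safe to run while the Worker is active.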
3.2 SQLite — The Structured Storage Engine
Why SQLite?
| Feature | Benefit |
|---|---|
| Zero config | No database server to install |
| Single file | Entire DB is one .db file — copy it to back up |
| WAL mode | Write-Ahead Logging lets readers keep working while a write is in progress, so Hooks and the Worker rarely block each other |
| Built-in FTS5 | Full-text search engine for fast keyword lookups |
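The WAL row is easiest to appreciate by watching it work. The sketch below uses a throwaway database in a temporary directory (not the real `claude-mem.db`) and two connections standing in for a Hook and the Worker; the table name `notes` is invented for the demo.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal(db_path: Path) -> sqlite3.Connection:
    """Open a connection and switch the database to WAL journaling."""
    conn = sqlite3.connect(db_path)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"WAL not available, got {mode!r}")
    return conn


with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "demo.db"
    writer = open_wal(db)
    writer.execute("CREATE TABLE notes (body TEXT)")
    writer.commit()
    reader = sqlite3.connect(db)

    # The INSERT opens a write transaction that stays uncommitted for now.
    writer.execute("INSERT INTO notes VALUES ('uncommitted')")
    # The reader is not blocked; it sees the last committed snapshot.
    rows_before = reader.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    writer.commit()
    rows_after = reader.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

    reader.close()
    writer.close()

print(rows_before, rows_after)
```

The reader gets an answer while the write is still pending and sees the new row only after the commit. WAL still allows only one writer at a time, so "concurrent" here means readers alongside a single writer.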
Four Core Tables
Table 1: sdk_sessions — Session Records
A new row is created each time you start a Claude Code session.
SELECT sdk_session_id, project, status, created_at
FROM sdk_sessions
ORDER BY created_at DESC LIMIT 5;
┌──────────────────┬─────────────┬───────────┬─────────────────────┐
│ sdk_session_id │ project │ status │ created_at │
├──────────────────┼─────────────┼───────────┼─────────────────────┤
│ sess_abc123 │ my-blog │ completed │ 2026-04-21 10:30:00 │
│ sess_def456 │ my-blog │ completed │ 2026-04-20 14:00:00 │
│ sess_ghi789 │ other-proj │ completed │ 2026-04-19 09:15:00 │
└──────────────────┴─────────────┴───────────┴─────────────────────┘
Table 2: observations — The Heart of Claude-Mem
Every time Claude executes a tool (read file, write file, run command), the Worker generates an Observation.
SELECT id, title, type, tool_name
FROM observations WHERE session_id = 'sess_abc123'
ORDER BY prompt_number;
┌────┬──────────────────────────────┬──────────┬────────────┐
│ id │ title │ type │ tool_name │
├────┼──────────────────────────────┼──────────┼────────────┤
│ 1 │ Read prisma schema file │ discovery│ Read │
│ 2 │ Add Comment model │ feature │ Write │
│ 3 │ Run prisma migrate │ change │ Bash │
│ 4 │ Fix foreign key error │ bugfix │ Write │
│ 5 │ Choose cascade delete policy │ decision │ Write │
└────┴──────────────────────────────┴──────────┴────────────┘
Each Observation contains rich fields:
| Field | Meaning | Example |
|---|---|---|
| title | One-line summary | "Fix JWT refresh logic" |
| narrative | Detailed account (third person) | "The developer discovered the refreshToken function was missing..." |
| facts | Extracted facts list | ["Uses jsonwebtoken library", "Token TTL is 7 days"] |
| concepts | Related concepts | ["JWT", "authentication", "middleware"] |
| type | Category (6 types) | "bugfix" |
| tool_name | Triggering tool | "Write" |
| files_read | Files read | ["src/auth/jwt.ts"] |
| files_modified | Files changed | ["src/auth/jwt.ts"] |
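The field table above maps naturally onto a small record type. The following is a sketch rather than Claude-Mem's actual model: the `Observation` class and its `to_row` method are hypothetical, and storing list fields as JSON text is an assumption based on how the examples are written.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class Observation:
    title: str
    narrative: str
    type: str
    tool_name: str
    facts: list[str] = field(default_factory=list)
    concepts: list[str] = field(default_factory=list)
    files_read: list[str] = field(default_factory=list)
    files_modified: list[str] = field(default_factory=list)

    def to_row(self) -> dict[str, str]:
        """Flatten to column values; list fields become JSON text (assumed)."""
        return {
            key: json.dumps(value) if isinstance(value, list) else value
            for key, value in asdict(self).items()
        }


obs = Observation(
    title="Fix JWT refresh logic",
    narrative="The developer discovered the refreshToken function was missing...",
    type="bugfix",
    tool_name="Write",
    facts=["Uses jsonwebtoken library", "Token TTL is 7 days"],
    concepts=["JWT", "authentication", "middleware"],
    files_read=["src/auth/jwt.ts"],
    files_modified=["src/auth/jwt.ts"],
)
row = obs.to_row()
print(row["title"], row["facts"])
```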
Table 3: user_prompts — Your Inputs
Records every message you send to Claude within each session.
SELECT prompt_number, content FROM user_prompts
WHERE session_id = 'sess_abc123' ORDER BY prompt_number;
┌───────────────┬──────────────────────────────────────┐
│ prompt_number │ content │
├───────────────┼──────────────────────────────────────┤
│ 1 │ Add a comment feature to the blog │
│ 2 │ Comments should support Markdown │
│ 3 │ Fix that foreign key error │
└───────────────┴──────────────────────────────────────┘
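Prompts become more useful when tied back to the work they triggered. The sketch below joins the two tables in an in-memory database. The schemas are simplified stand-ins, the `prompt_number` column on `observations` is inferred from the `ORDER BY` in the earlier query, and the assignment of observations to prompts is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_prompts (session_id TEXT, prompt_number INTEGER, content TEXT);
    CREATE TABLE observations (
        id INTEGER PRIMARY KEY, session_id TEXT, prompt_number INTEGER, title TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO user_prompts VALUES ('sess_abc123', ?, ?)",
    [
        (1, "Add a comment feature to the blog"),
        (2, "Comments should support Markdown"),
        (3, "Fix that foreign key error"),
    ],
)
conn.executemany(
    "INSERT INTO observations (session_id, prompt_number, title) "
    "VALUES ('sess_abc123', ?, ?)",
    [
        (1, "Read prisma schema file"),
        (1, "Add Comment model"),
        (1, "Run prisma migrate"),
        (3, "Fix foreign key error"),
        (3, "Choose cascade delete policy"),
    ],
)

# LEFT JOIN keeps prompts that produced no observations (count 0).
rows = conn.execute(
    """
    SELECT p.prompt_number, p.content, COUNT(o.id) AS n_observations
    FROM user_prompts AS p
    LEFT JOIN observations AS o
      ON o.session_id = p.session_id AND o.prompt_number = p.prompt_number
    WHERE p.session_id = ?
    GROUP BY p.prompt_number, p.content
    ORDER BY p.prompt_number
    """,
    ("sess_abc123",),
).fetchall()

for number, content, count in rows:
    print(number, content, count)
```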
Table 4: session_summaries — Session Wrap-Up
Auto-generated structured summary when a session ends.
{
"request": "Add comment system to blog",
"investigated": ["Prisma relation model syntax", "Cascade delete strategies"],
"completed": ["Created Comment model", "Implemented comment API", "Fixed FK constraint"],
"next_steps": ["Build comment UI", "Implement comment notifications"]
}
Notice the next_steps field: it is automatically injected into Claude's context at the start of the next session, so the AI knows what you left unfinished.
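To make the injection idea concrete, here is one way a summary like the one above could be turned into a short context note. The function `next_steps_context` and the output wording are hypothetical; the real format Claude-Mem injects is not shown in this episode.

```python
import json

SUMMARY = """
{
  "request": "Add comment system to blog",
  "investigated": ["Prisma relation model syntax", "Cascade delete strategies"],
  "completed": ["Created Comment model", "Implemented comment API", "Fixed FK constraint"],
  "next_steps": ["Build comment UI", "Implement comment notifications"]
}
"""


def next_steps_context(summary_json: str) -> str:
    """Render unfinished work from a session summary as plain text."""
    summary = json.loads(summary_json)
    steps = summary.get("next_steps", [])
    if not steps:
        return ""  # nothing pending, so nothing to inject
    lines = [f"Previous request: {summary.get('request', 'unknown')}"]
    lines.append("Unfinished work:")
    lines.extend(f"- {step}" for step in steps)
    return "\n".join(lines)


print(next_steps_context(SUMMARY))
```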
3.3 ChromaDB — The Semantic Vector Search Engine
SQLite already has FTS5 full-text search. Why also ChromaDB?
Because keyword search and semantic search are fundamentally different capabilities:
| Search Type | Engine | When searching for "login bug"... |
|---|---|---|
| Keyword search | SQLite FTS5 | ✅ Matches records containing "login" and "bug" ❌ Misses "authentication error" records |
| Semantic search | ChromaDB | ✅ Matches "login bug" ✅ Also matches "authentication error" (semantically similar) ✅ Even matches "JWT token expiration issue" |
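The keyword row of this table can be reproduced in a few lines. This sketch assumes your Python's bundled SQLite was compiled with FTS5 (most builds are); the table name `obs_fts` and the three titles are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# An FTS5 virtual table indexes every word of the listed columns.
conn.execute("CREATE VIRTUAL TABLE obs_fts USING fts5(title)")
conn.executemany(
    "INSERT INTO obs_fts (title) VALUES (?)",
    [
        ("Fix login bug in session handler",),
        ("Resolve authentication error on sign-in",),
        ("JWT token expiration issue",),
    ],
)

# Two bare terms mean "rows containing both words".
hits = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM obs_fts WHERE obs_fts MATCH ? ORDER BY rank",
        ("login bug",),
    )
]
print(hits)
```

Only the first title comes back. The other two describe closely related problems but share no tokens with the query, which is the gap semantic search is meant to close.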
How ChromaDB Works
graph LR
A["New Observation:
Fix JWT refresh logic"] --> B["Vectorize
(text → number vector)"]
B --> C["Store in ChromaDB
[0.23, -0.41, 0.87, ...]"]
D["User search:
login issue"] --> E["Vectorize
(query → number vector)"]
E --> F["Vector similarity calculation"]
C --> F
F --> G["Return closest matches"]
style A fill:#f59e0b,color:#000
style D fill:#6366f1,color:#fff
style G fill:#10b981,color:#fff
In simple terms:
- SQLite FTS5 = Exact matching (like Ctrl+F)
- ChromaDB = Meaning-aware matching (like a colleague who understands what you're asking)
Together, they form Claude-Mem's Hybrid Search, so you can find relevant history even when your query uses different words than the stored record.
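The "vector similarity calculation" step in the diagram can be illustrated with cosine similarity, one common choice of metric. This is a toy sketch: the three-dimensional vectors are invented (real embeddings have hundreds of dimensions), and which metric Claude-Mem's ChromaDB collection uses is not stated here.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Invented embeddings: the first axis loosely stands for "auth-related".
stored = {
    "Fix JWT refresh logic": [0.9, 0.1, 0.1],
    "Resolve authentication error": [0.7, 0.5, 0.0],
    "Updated .env config file": [0.0, 0.2, 0.9],
}
query_vector = [0.9, 0.1, 0.0]  # pretend embedding of "login issue"

ranked = sorted(
    stored,
    key=lambda title: cosine_similarity(query_vector, stored[title]),
    reverse=True,
)
for title in ranked:
    print(f"{cosine_similarity(query_vector, stored[title]):.3f}  {title}")
```

The two auth-related titles score far higher than the config change even though none of them contains the word "login".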
3.4 The 6 Observation Types
The Worker automatically classifies each Observation into one of 6 types:
| Type | Meaning | Blog Project Example |
|---|---|---|
| decision | Architecture/design choice | "Chose Prisma over TypeORM" |
| bugfix | Fixed a bug | "Fixed 500 error on duplicate article slug" |
| feature | Built a new feature | "Added article tagging system" |
| refactor | Refactored code | "Migrated routes from pages/ to app/" |
| discovery | Learned something new | "Prisma's findMany doesn't include relations by default" |
| change | General modification | "Updated .env config file" |
These type labels enable precise filtering:
search(query="database", type="decision")
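Conceptually, a typed search narrows a text match by the type label. The sketch below imitates that with a plain substring match over in-memory records. `filter_observations` is a hypothetical helper and not the `search` tool itself, and the sample titles are invented.

```python
OBSERVATION_TYPES = (
    "decision", "bugfix", "feature", "refactor", "discovery", "change",
)


def filter_observations(observations, query, obs_type=None):
    """Keep records whose title contains the query and whose type matches."""
    if obs_type is not None and obs_type not in OBSERVATION_TYPES:
        raise ValueError(f"unknown observation type: {obs_type!r}")
    needle = query.lower()
    return [
        obs
        for obs in observations
        if needle in obs["title"].lower() and obs_type in (None, obs["type"])
    ]


sample = [
    {"title": "Chose PostgreSQL as the blog database", "type": "decision"},
    {"title": "Fixed database connection timeout", "type": "bugfix"},
    {"title": "Added article tagging system", "type": "feature"},
]

decisions = filter_observations(sample, "database", obs_type="decision")
print([obs["title"] for obs in decisions])
```

Without the type filter the same query would also return the bugfix record, which is the noise the labels are meant to cut.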
3.5 Complete Data Flow
graph TB
A["Claude executes a tool
(read/write file, run command)"] --> B["PostToolUse Hook fires"]
B --> C["Worker receives raw tool output"]
C --> D["Claude Agent SDK
compresses & distills"]
D --> E["Generates structured Observation
{title, narrative, facts,
concepts, type}"]
E --> F["SQLite"]
E --> G["ChromaDB"]
F --> F1["INSERT into observations table"]
F --> F2["Update FTS5 index"]
G --> G1["Text → vector embedding"]
G --> G2["Store in vector collection"]
H["Next search"] --> I{"Search type?"}
I -->|"Exact keyword match"| F
I -->|"Semantic fuzzy match"| G
I -->|"Hybrid search (default)"| J["FTS5 + vector
results merged & ranked"]
style A fill:#6366f1,color:#fff
style F fill:#10b981,color:#fff
style G fill:#10b981,color:#fff
style J fill:#f59e0b,color:#000
Hands-On Exercise
- Work on the blog project in Claude Code for one session (e.g., create the database schema)
- After the session, query the database directly:
sqlite3 ~/.claude-mem/claude-mem.db
SELECT COUNT(*) FROM sdk_sessions;
SELECT id, title, type FROM observations ORDER BY created_at DESC LIMIT 10;
SELECT request, completed, next_steps FROM session_summaries LIMIT 1;
.quit
- Open http://localhost:37777 and compare the Web UI with your SQL results; they should match.
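If you would rather script the exercise, the same counts can be read from Python. This sketch opens the file read-only so a stray statement cannot modify your history. The helper names are made up, and it assumes the four table names used in this episode exist in your database.

```python
import sqlite3
from pathlib import Path

CORE_TABLES = ("sdk_sessions", "observations", "user_prompts", "session_summaries")


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open an existing SQLite file so that any write attempt fails."""
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    return sqlite3.connect(uri, uri=True)


def table_counts(conn: sqlite3.Connection) -> dict:
    """Row count per core table; names come from a fixed tuple, not user input."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in CORE_TABLES
    }


if __name__ == "__main__":
    db = Path.home() / ".claude-mem" / "claude-mem.db"
    if db.exists():
        conn = open_read_only(db)
        for table, count in table_counts(conn).items():
            print(f"{table:<18} {count}")
        conn.close()
    else:
        print(f"No database found at {db}")
```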
Coming Up Next
Data is stored, but how does it get captured "automatically"? Next episode, we reveal Claude-Mem's 5 lifecycle hooks — the invisible secretary that "takes notes behind your back." We'll trace every step from keypress to database insert with a full sequence diagram.
➡️ Episode 4: Lifecycle Hooks — How Claude-Mem Takes Notes Behind Your Back