Episode 6: 3-Layer Progressive Retrieval — Claude-Mem's Money-Saving Secret
This episode's scenario: Your blog project is two weeks old, with 10 sessions and 200 Observations under its belt. You ask Claude: "How did we fix that login bug last time?" Dumping all 200 Observations into Claude's context would cost ~100,000 tokens.
6.1 Token Economics: Why "Dump Everything" Is Suicide
| Metric | Number |
|---|---|
| 10 sessions × 20 Observations | 200 total |
| Total at ~500 tokens per Observation | ~100,000 tokens |
| Claude's context window | 200,000 tokens (max) |
| What you actually need | Maybe 2–3 relevant records |
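The table's arithmetic can be sanity-checked in a few lines (a toy sketch; the per-Observation average and the scenario counts are the table's assumptions, not measured values):

```python
# Back-of-envelope math behind the table above (toy numbers from the scenario).
sessions = 10
observations_per_session = 20
tokens_per_observation = 500        # assumed average from the table

total_observations = sessions * observations_per_session   # 200
dump_cost = total_observations * tokens_per_observation    # 100,000 tokens

context_window = 200_000
share = dump_cost / context_window
print(f"Dumping everything: {dump_cost:,} tokens ({share:.0%} of the context window)")
# → Dumping everything: 100,000 tokens (50% of the context window)
```

Half the window is gone before the conversation even starts, and that bill repeats on every query.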
Brute-force injecting all 200 memories:
- 💸 Cost: 100K tokens ($0.30–$1.50 per query)
- 🐌 Speed: Pushing 100K tokens into context adds noticeable latency to every single query
- 🤯 Quality: Irrelevant noise degrades Claude's judgment ("context poisoning")
6.2 Three Layers: Progressive Disclosure
Claude-Mem's core philosophy is Progressive Disclosure — reveal information in layers. Claude first gets a lightweight index, identifies what's interesting, then fetches details.
Layer 1: search → Lightweight Index (~50–100 tokens/result)
Returns only: ID, title, type, date. Like a book's table of contents.
Layer 2: timeline → Chronological Context (~100–200 tokens/result)
Shows what happened before and after a specific Observation — the causal chain.
Layer 3: get_observations → Full Details (~500–1,000 tokens/result)
Fetches complete content only for the IDs that made it through the filter.
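To make the layers concrete, here is a toy in-memory stand-in for the three tools. This is an illustration only: the function names match the tools described above, but the store, fields, and matching logic are invented for the sketch; claude-mem's real MCP implementations are backed by its own database.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    id: int
    title: str
    type: str
    date: str
    content: str  # the expensive full text, only returned by Layer 3

# Tiny fake store; real data lives in claude-mem's database.
DB = [
    Observation(66, "Reproduce login 401 on refresh", "bug", "2024-05-01", "…full notes…"),
    Observation(67, "Fix login bug: rotate refresh token", "fix", "2024-05-01", "…full notes…"),
    Observation(68, "Add blog post editor", "feature", "2024-05-02", "…full notes…"),
]

def search(query: str) -> list[dict]:
    """Layer 1: lightweight index — id, title, type, date only."""
    return [{"id": o.id, "title": o.title, "type": o.type, "date": o.date}
            for o in DB if query.lower() in o.title.lower()]

def timeline(observation_id: int, window: int = 1) -> list[dict]:
    """Layer 2: chronological neighbors around one Observation."""
    i = next(i for i, o in enumerate(DB) if o.id == observation_id)
    return [{"id": o.id, "title": o.title, "date": o.date}
            for o in DB[max(0, i - window): i + window + 1]]

def get_observations(ids: list[int]) -> list[Observation]:
    """Layer 3: full content, only for ids that survived filtering."""
    return [o for o in DB if o.id in ids]
```

With this sketch, `search("login")` surfaces #66 and #67 as cheap index entries, `timeline(67)` adds what happened just before and after the fix, and only `get_observations([66, 67])` pays for full content.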
6.3 Cost Comparison
| Approach | Token Cost |
|---|---|
| Brute-force dump all | ~100,000 |
| Layer 1 index only | ~700 |
| Layer 1 + Layer 2 timeline | ~1,100 |
| Layers 1 + 2 + 3 (2 full items) | ~2,600 |
| Savings vs. brute force | ~97% |
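The table's totals can be reproduced from rough per-item prices (assumed averages chosen to match the figures above, not measured costs):

```python
# Rough reconstruction of the cost table (assumed per-item token prices).
dump_all = 200 * 500            # brute force: every Observation in full
layer1   = 7 * 100              # ~7 index hits at ~100 tokens each
layer2   = layer1 + 2 * 200     # plus a 2-entry timeline slice
layer3   = layer2 + 2 * 750     # plus 2 full Observations

savings = 1 - layer3 / dump_all
print(f"{layer3:,} vs {dump_all:,} tokens: {savings:.0%} saved")
# → 2,600 vs 100,000 tokens: 97% saved
```

The exact percentage depends on how many hits survive each layer, but the shape holds: each layer is an order of magnitude cheaper than dumping everything.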
6.4 How Claude Auto-Executes the 3-Layer Pattern
You don't need to tell Claude to "search first, then get details." The MCP tools are designed to naturally guide this workflow:
Claude's reasoning (automatic):
1. User asks about a specific historical bug fix
2. I have search, timeline, and get_observations tools
3. search first → found 3 matches, #67 looks most relevant
4. timeline(observation_id=67) → see the full context chain
5. get_observations([66, 67]) → get the details I need
6. Answer the user with precision
This funnel happens automatically. The tool names and descriptions are carefully designed to make Claude choose the right call order naturally.
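The funnel above can be mocked end-to-end with stub tools that log their call order (a toy harness with fabricated results, just to show the search → timeline → get_observations sequence):

```python
# Stub tools that record the order an agent would call them in.
call_log: list[str] = []

def search(query: str) -> list[dict]:
    call_log.append("search")
    return [{"id": 67, "title": "Fix login bug: rotate refresh token"}]  # fake hit

def timeline(observation_id: int) -> list[dict]:
    call_log.append("timeline")
    return [{"id": 66}, {"id": 67}]  # the fix plus the event just before it

def get_observations(ids: list[int]) -> list[dict]:
    call_log.append("get_observations")
    return [{"id": i, "content": "…full text…"} for i in ids]

# Steps 3–5 of the funnel, as explicit calls:
hits = search("login bug")                              # Layer 1: cheap index
context = timeline(hits[0]["id"])                       # Layer 2: causal chain
details = get_observations([o["id"] for o in context])  # Layer 3: full text

assert call_log == ["search", "timeline", "get_observations"]
```

In the real system there is no driver script: Claude itself picks this order, nudged by the tool descriptions.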
Hands-On Exercise
- Accumulate 2–3 sessions on the blog project
- In a new session, ask Claude about a past action
- Observe Claude's tool calls: it should call search first, then get_observations
- In the Web UI, find the corresponding Observations and compare with Claude's answer
Coming Up Next
Progressive retrieval solves "how to save tokens." But for practical use, you need advanced search skills — filtering by type, file, and date. Next: the Memory Search Skill deep dive.