Episode 6: 3-Layer Progressive Retrieval — Claude-Mem's Money-Saving Secret

⏱ Est. reading time: 9 min Updated on 5/7/2026

This episode's scenario: Your blog project has 2 weeks, 10 sessions, and 200 Observations under its belt. You ask Claude: "How did we fix that login bug last time?" Dumping all 200 Observations into Claude would cost ~100,000 tokens.


6.1 Token Economics: Why "Dump Everything" Is Suicide

| Metric | Number |
|---|---|
| 10 sessions × 20 Observations | 200 total |
| Average 500 tokens per Observation | 100,000 tokens |
| Claude's context window | 200,000 tokens (max) |
| What you actually need | Maybe 2–3 relevant records |

Brute-force injecting all 200 memories:

  • 💸 Cost: ~100K input tokens ($0.30–$1.50 per query, depending on model)
  • 🐌 Speed: A prompt that large means a long wait before the first output token
  • 🤯 Quality: Irrelevant noise degrades Claude's judgment ("context poisoning")
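The arithmetic behind those numbers is simple enough to check yourself (the per-token prices here are illustrative assumptions, not Anthropic's actual pricing):

```python
# Back-of-envelope cost of dumping every Observation into context.
observations = 10 * 20            # 10 sessions x 20 Observations = 200
tokens_per_observation = 500
total_tokens = observations * tokens_per_observation
print(total_tokens)               # 100000

# Illustrative input-token price range: $3-$15 per million tokens.
low, high = 3 / 1_000_000, 15 / 1_000_000
print(total_tokens * low, total_tokens * high)   # 0.3 1.5 (dollars per query)
```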

6.2 Three Layers: Progressive Disclosure

Claude-Mem's core philosophy is Progressive Disclosure — reveal information in layers. Claude first gets a lightweight index, identifies what's interesting, then fetches details.

Layer 1: search → Lightweight Index (~50–100 tokens/result)

Returns only: ID, title, type, date. Like a book's table of contents.

Layer 2: timeline → Chronological Context (~100–200 tokens/result)

Shows what happened before and after a specific Observation — the causal chain.

Layer 3: get_observations → Full Details (~500–1,000 tokens/result)

Fetches complete content only for the IDs that made it through the filter.
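To make the funnel concrete, here is a minimal sketch of the three layers over an in-memory store. The function names mirror the MCP tools described above, but the store, its fields, and the return shapes are hypothetical — claude-mem's real implementation differs:

```python
# Hypothetical in-memory version of the three-layer funnel.
from dataclasses import dataclass

@dataclass
class Observation:
    id: int
    title: str
    type: str
    date: str
    content: str          # full text — only fetched in Layer 3

STORE = {
    66: Observation(66, "Reproduce login 401 on refresh", "bugfix", "2026-06-20", "…"),
    67: Observation(67, "Fix login bug: rotate refresh token", "bugfix", "2026-06-21", "…"),
    68: Observation(68, "Add blog post editor", "feature", "2026-06-22", "…"),
}

def search(query):
    """Layer 1: lightweight index — id/title/type/date only, never content."""
    return [{"id": o.id, "title": o.title, "type": o.type, "date": o.date}
            for o in STORE.values() if query.lower() in o.title.lower()]

def timeline(observation_id, window=1):
    """Layer 2: chronological neighbors around one Observation."""
    ordered = sorted(STORE.values(), key=lambda o: o.date)
    idx = next(i for i, o in enumerate(ordered) if o.id == observation_id)
    return [{"id": o.id, "title": o.title, "date": o.date}
            for o in ordered[max(0, idx - window): idx + window + 1]]

def get_observations(ids):
    """Layer 3: full content, only for the ids that survived the filter."""
    return [STORE[i] for i in ids]

hits = search("login bug")                  # Layer 1: one cheap index hit (#67)
chain = timeline(hits[0]["id"])             # Layer 2: #66 -> #67 -> #68
details = get_observations([66, 67])        # Layer 3: full text for 2 records only
```

The key design point is that each layer returns strictly less than the store holds: the index omits content entirely, and full text is paid for only per selected ID.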


6.3 Cost Comparison

| Approach | Token Cost |
|---|---|
| Brute-force dump all | ~100,000 |
| Layer 1 index only | ~700 |
| Layer 1 + Layer 2 timeline | ~1,100 |
| Layer 1 + 2 + Layer 3 (2 items) | ~2,600 |
| Savings | ~97% |
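The table reproduces as straightforward arithmetic (the per-layer token counts are the illustrative figures used in this section, not measured values):

```python
# Reconstructing the cost-comparison table.
brute_force = 200 * 500            # dump everything: 100,000 tokens
layer1 = 10 * 70                   # ~10 index hits at ~70 tokens each: 700
layer2 = layer1 + 4 * 100          # plus ~4 timeline entries: 1,100
layer3 = layer2 + 2 * 750          # plus 2 full Observations: 2,600
savings = 1 - layer3 / brute_force
print(f"{savings:.0%}")            # 97%
```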

6.4 How Claude Auto-Executes the 3-Layer Pattern

You don't need to tell Claude to "search first, then get details." The MCP tools are designed to naturally guide this workflow:

Claude's reasoning (automatic):
1. User asks about a specific historical bug fix
2. I have search, timeline, and get_observations tools
3. search first → found 3 matches, #67 looks most relevant
4. timeline(observation_id=67) → see the full context chain
5. get_observations([66, 67]) → get the details I need
6. Answer the user with precision

This funnel happens automatically. The tool names and descriptions are carefully designed to make Claude choose the right call order naturally.
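The "descriptions guide the call order" idea can be sketched as MCP-style tool metadata. These description strings are illustrative, written for this example — they are not claude-mem's actual tool metadata:

```python
# Hypothetical tool definitions. The description text itself nudges the model
# toward the Layer 1 -> 2 -> 3 funnel: the cheap tool says "start here", the
# expensive one says "only after narrowing".
TOOLS = [
    {
        "name": "search",
        "description": "START HERE. Cheap keyword search over the memory index. "
                       "Returns only id/title/type/date; fetch details afterwards "
                       "with get_observations.",
    },
    {
        "name": "timeline",
        "description": "Given an observation id from search, show what happened "
                       "just before and after it.",
    },
    {
        "name": "get_observations",
        "description": "EXPENSIVE. Fetch full content for specific ids. Only call "
                       "this after narrowing candidates with search/timeline.",
    },
]

print([t["name"] for t in TOOLS])
```

No orchestration code forces this order; the model reads the costs and preconditions in the descriptions and sequences the calls itself.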


Hands-On Exercise

  1. Accumulate 2–3 sessions on the blog project
  2. In a new session, ask Claude about a past action
  3. Observe Claude's tool calls — it should search first, then get_observations
  4. In the Web UI, find the corresponding Observations and compare with Claude's answer

Coming Up Next

Progressive retrieval solves "how to save tokens." But for practical use, you need advanced search skills — filtering by type, file, and date. Next: the Memory Search Skill deep dive.

➡️ Episode 7: Skill 1 — Memory Search