Episode 5: Context Window Deep Dive
Key Takeaway: The context window is Claude Code's lifeline. Understanding its composition is key to precise management.
5.1 What is the Context Window
Every Claude Code conversation packs all information into a fixed-size "window." Window size depends on the model:
| Model | Context Window |
|---|---|
| Claude Opus 4 / Sonnet 4 | 200K tokens |
| Claude Haiku 4 | 200K tokens |
| Sessions with the 1M context window enabled | 1,000K tokens |
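These budgets are easier to reason about in concrete terms. A rough sketch, assuming the common heuristic of ~4 characters of English text per token (real tokenizer counts vary by language and content):

```python
# Assumption: ~4 characters of English text per token.
# Real tokenizers vary with language and content.
CONTEXT_LIMITS = {"standard": 200_000, "1m": 1_000_000}

def approx_tokens(text: str) -> int:
    """Rough token estimate from character count."""
    return max(1, len(text) // 4)

def window_share(text: str, window: str = "standard") -> float:
    """Approximate percentage of the context window the text occupies."""
    return 100 * approx_tokens(text) / CONTEXT_LIMITS[window]
```

By this estimate, a 40KB source file is roughly 10K tokens, about 5% of a standard 200K window but only 1% of a 1M window.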
5.2 What's Inside the Context Window
The context window is divided into fixed sections (same every turn) and dynamic sections (grow with conversation):
```
┌──────────────── Context Window (200K tokens) ────────────────┐
│                                                              │
│ ▼ Fixed Section: same every request, constant size           │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ System Prompt      ~3,000 tokens                         │ │
│ │ CLAUDE.md          ~2,000-8,000 tokens                   │ │
│ │ Hooks injection    ~500-2,000 tokens                     │ │
│ └──────────────────────────────────────────────────────────┘ │
│                                                              │
│ ▼ Dynamic Section: grows with conversation turns             │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Conversation history (user messages + AI replies)        │ │
│ │ Tool calls + tool results (biggest growth source!)       │ │
│ │ Current user message                                     │ │
│ └──────────────────────────────────────────────────────────┘ │
│                                                              │
│ ░ Remaining space (for Claude's response)                    │
└──────────────────────────────────────────────────────────────┘
```
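The layout above can be modeled in a few lines. A minimal sketch (the numbers in the example are illustrative, not measured):

```python
from dataclasses import dataclass

@dataclass
class ContextWindow:
    limit: int = 200_000
    fixed: int = 0    # system prompt + CLAUDE.md + hook injections
    dynamic: int = 0  # conversation history + tool calls/results

    @property
    def remaining(self) -> int:
        """Space left for Claude's response."""
        return self.limit - self.fixed - self.dynamic

    @property
    def used_pct(self) -> float:
        return 100 * (self.fixed + self.dynamic) / self.limit

# Illustrative: 10K fixed + 110K dynamic leaves 80K for the response.
```

Notice that only `dynamic` grows as you work; everything you do to shrink context either trims `fixed` (slimmer CLAUDE.md) or resets `dynamic` (`/clear`).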
5.3 Fixed Section Details
The fixed section totals roughly 6K-15K tokens:
| Content | Size | Can it be reduced? |
|---|---|---|
| System Prompt | ~3K | No (built into Claude Code) |
| CLAUDE.md | 1K-8K | Yes, via `/caveman:compress` |
| caveman hook injection | ~200-500 | Yes, via `/caveman lite` |
| claude-mem SessionStart | 500-3K | Yes, by controlling observation count |
Key point: The fixed section is identical every turn, so with the prompt cache you pay its full input cost only once; subsequent turns read it from cache at a much lower rate.
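A back-of-the-envelope model shows why the cache matters. This sketch assumes cache reads bill at roughly 10% of the base input rate; that multiplier is a parameter, so check current pricing rather than trusting the default:

```python
def fixed_section_cost(fixed_tokens: int, turns: int,
                       base_rate: float, cache_read_mult: float = 0.1) -> float:
    """Approximate cost of re-sending the fixed section across a session.

    The first turn pays the full input rate; later turns hit the prompt
    cache. cache_read_mult = 0.1 is an assumption (cache reads at ~10%
    of the base input rate) -- verify against current pricing.
    """
    first = fixed_tokens * base_rate
    cached = fixed_tokens * base_rate * cache_read_mult * (turns - 1)
    return first + cached
```

With a 10K-token fixed section over 50 turns, the cached total is equivalent to ~59K full-price tokens instead of the 500K you would pay with no cache at all.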
5.4 What Operations Cause Context to Explode
| Operation | Context Increase | Reason |
|---|---|---|
| Read large file | +2K-10K | Entire file content enters context |
| Bash npm test | +1K-5K | Test output can be very long |
| Agent subagent | +500-5K | Subagent results enter main session |
| Grep search | +200-2K | Depends on match count |
| Edit file | +100-300 | Only sends diff, very small |
| Normal conversation | +100-500 | User message + AI reply |
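The table can be turned into a rough session simulator. The per-operation costs below are simply the midpoints of the ranges above, not measurements:

```python
# Midpoint token cost per operation, taken from the table above.
OP_COST = {
    "read_large_file": 6_000,
    "bash_test_run": 3_000,
    "subagent": 2_750,
    "grep": 1_100,
    "edit": 200,
    "chat_turn": 300,
}

def simulate(ops: list[str], start: int = 0) -> int:
    """Estimated dynamic-section size after a sequence of operations."""
    return start + sum(OP_COST[op] for op in ops)
```

A handful of large-file reads dwarfs dozens of edits: six reads cost an estimated 36K tokens, while thirty edits cost only 6K.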
5.5 Real-World: "Read Project" Context Changes
Before:
Context ██░░░░░░░░ 18% (36K/200K)
After reading 6 files:
Context ██████░░░░ 60% (120K/200K)
⚠ From 18% → 60%: one "read project" consumed 42% of the context
Lesson: Reading 6 files at once jumps context from 18% to 60%. Only 40% remains.
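Running the numbers from this example:

```python
window = 200_000
before, after = 36_000, 120_000   # 18% -> 60% of the 200K window
files_read = 6

growth = after - before                 # tokens added by the reads
per_file = growth // files_read         # average cost per file
pct_consumed = 100 * growth // window   # share of the whole window
```

84K tokens across six files averages ~14K per file, one large source file apiece. Grepping first, or splitting the reads across `/clear` boundaries, avoids taking the 42% hit in a single step.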
5.6 What Happens When the Window is Full β Auto Compression
Claude Code triggers auto-compression at 95% (hardcoded, not configurable).
Consequences:
- Cache fully invalidated: compression rewrites the message prefix, so cached prompt segments no longer match
- Information unrecoverable: compressed-away file contents and tool results are permanently lost
- Cost increases: the fixed section must be reprocessed at full price after the cache is invalidated
How to avoid:
- Keep context < 85% (start managing at the yellow warning)
- A proactive `/clear` is better than forced compression
- Avoid reading too many large files at once
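A quick way to think about headroom is to count the tokens remaining before the 95% trigger fires. A minimal sketch:

```python
def headroom_before_compact(used: int, window: int = 200_000,
                            trigger: float = 0.95) -> int:
    """Tokens you can still add before auto-compression fires at 95%."""
    return max(0, int(window * trigger) - used)

# At 78% usage (156K/200K) you have 34K tokens of headroom:
# roughly two or three large-file reads away from forced compaction.
```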
5.7 How HUD Helps You Manage Context
Context ████████░░ 78% → Yellow warning, be careful
- < 60% Green: Work freely
- 60-85% Yellow: Avoid large file reads and long debugging loops
- > 85% Red: Consider `/clear` to reset
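A gauge like the HUD's can be rendered in a few lines. A sketch using the zone boundaries above:

```python
def hud_bar(used: int, window: int = 200_000, width: int = 10) -> str:
    """Render a HUD-style context gauge like 'Context ██████░░░░ 60%'."""
    pct = 100 * used / window
    filled = round(width * pct / 100)
    # Zone boundaries from the list above: <60 green, 60-85 yellow, >85 red.
    zone = "Green" if pct < 60 else ("Yellow" if pct <= 85 else "Red")
    bar = "█" * filled + "░" * (width - filled)
    return f"Context {bar} {pct:.0f}% ({zone})"
```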
5.8 Practical Context-Saving Tips
- Regular `/clear`: clear mid-session to reset the dynamic section
- Avoid debugging loops: `/clear` and restart after 3-4 rounds with no progress
- Use Grep instead of Read: see only the relevant lines instead of reading entire files
- Slim down CLAUDE.md: use `/caveman:compress`
- Read specific line ranges: `Read file.ts:offset:limit` instead of the whole file
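The payoff of the last tip is easy to estimate. A sketch assuming ~10 tokens per source line (a rough figure, not a measurement):

```python
from typing import Optional

TOKENS_PER_LINE = 10  # rough assumption for typical source code

def read_cost(total_lines: int, limit: Optional[int] = None) -> int:
    """Estimated tokens consumed by reading a file, or a line range of it."""
    lines = total_lines if limit is None else min(limit, total_lines)
    return lines * TOKENS_PER_LINE
```

By this estimate, a 2,000-line file costs ~20K tokens in full, but a 100-line range costs ~1K, a 20x saving when you already know where to look.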
Next Episode: Episode 6 dives deep into the token system β detailed breakdown of input/output/cache tokens with real-world cost calculations.