Episode 5: Context Window Deep Dive

⏱ Est. reading time: 7 min Updated on 5/7/2026

Key Takeaway: The context window is Claude Code's lifeline. Understanding its composition is key to precise management.


5.1 What is the Context Window

Every Claude Code conversation packs everything the model sees (system prompt, conversation history, tool results) into a fixed-size "window." The window's size depends on the model:

Model                              Context Window
Claude Opus 4 / Sonnet 4           200K tokens
Claude Haiku 4                     200K tokens
Sessions with 1M context enabled   1,000K tokens

5.2 What's Inside the Context Window

The context window divides into a fixed section (identical on every turn) and a dynamic section (which grows as the conversation proceeds):

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ Context Window (200K tokens) ─────────────────┐
β”‚                                                                   β”‚
β”‚ β–Ό Fixed Section β€” Same every request, constant size               β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ System Prompt          ~3,000 tokens                         β”‚ β”‚
β”‚ β”‚ CLAUDE.md              ~2,000-8,000 tokens                   β”‚ β”‚
β”‚ β”‚ Hooks injection         ~500-2,000 tokens                    β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                                                                   β”‚
β”‚ β–Ό Dynamic Section β€” Grows with conversation turns                 β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ Conversation history (user messages + AI replies)             β”‚ β”‚
β”‚ β”‚ Tool calls + tool results (biggest growth source!)            β”‚ β”‚
β”‚ β”‚ Current user message                                          β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                                                                   β”‚
β”‚                     ← Remaining space (for Claude's response)     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

5.3 Fixed Section Details

The fixed section totals roughly 6K-15K tokens:

Content                   Size       Can it be reduced?
System Prompt             ~3K        No (built into Claude Code)
CLAUDE.md                 1K-8K      Yes. /caveman:compress
caveman hook injection    ~200-500   Yes. /caveman lite
claude-mem SessionStart   500-3K     Yes. Control observation count

Key point: the fixed section is identical every turn, so thanks to prompt caching you pay the full input price for it only once; subsequent turns read it from cache at a fraction of the cost.
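Summing the table's estimates gives a quick budget check. This is an illustrative sketch using the ranges above, not measured values:

```python
# Rough fixed-section budget, using the (low, high) token estimates
# from the table above. All numbers are illustrative, not measured.
fixed_section = {
    "system_prompt": (3_000, 3_000),
    "claude_md": (1_000, 8_000),
    "caveman_hook": (200, 500),
    "claude_mem_session_start": (500, 3_000),
}

low = sum(lo for lo, hi in fixed_section.values())
high = sum(hi for lo, hi in fixed_section.values())
print(f"Fixed section: ~{low:,} to ~{high:,} tokens")
# Of a 200K window that is at most ~7%, and it is paid once thanks to caching.
```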


5.4 What Operations Cause Context to Explode

Operation             Context Increase   Reason
Read large file       +2K-10K            Entire file content enters context
Bash (npm test)       +1K-5K             Test output can be very long
Agent (subagent)      +500-5K            Subagent results return to the main session
Grep search           +200-2K            Depends on match count
Edit file             +100-300           Sends only the diff, very small
Normal conversation   +100-500           User message + AI reply
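To get a feel for how these per-operation costs accumulate, here is an illustrative sketch of a typical read-test-edit debugging loop. The per-operation costs are rough midpoints of the ranges above, and the 40K starting point is an assumed fixed section plus some history:

```python
# Illustrative midpoint costs (tokens) taken from the table above.
COST = {
    "read_large_file": 6_000,
    "bash_npm_test": 3_000,
    "edit": 200,
    "chat_turn": 300,
}

WINDOW = 200_000
used = 40_000  # assumed: fixed section + some existing history

# Three rounds of read -> run tests -> edit -> discuss:
for _ in range(3):
    for op in ("read_large_file", "bash_npm_test", "edit", "chat_turn"):
        used += COST[op]

print(f"{used:,} tokens used ({used / WINDOW:.0%} of the window)")
```

Three ordinary debugging rounds add roughly 28K tokens, which is why long loops fill the window faster than the conversation itself.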

5.5 Real-World: "Read Project" Context Changes

Before:
Context β–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘ 18% (36K/200K)

After reading 6 files:
Context β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘ 60% (120K/200K)
  ↑ From 18% β†’ 60%, one "read project" consumed 42% of context

Lesson: reading six files at once jumped context from 18% to 60%, leaving only 40% of the window for everything that follows.
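The arithmetic behind the example above, worked out:

```python
# Checking the numbers from the "read project" example above.
WINDOW = 200_000
before, after = 36_000, 120_000

growth = after - before  # tokens consumed by reading six files
print(f"Growth: {growth:,} tokens = {growth / WINDOW:.0%} of the window")
print(f"Remaining: {WINDOW - after:,} tokens ({1 - after / WINDOW:.0%})")
# ~84K over six files is roughly 14K tokens per file: large files add up fast.
```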


5.6 What Happens When the Window is Full β€” Auto Compression

Claude Code triggers auto-compression at 95% (hardcoded, not configurable).

Consequences:

  • Cache fully invalidated: Compression changes message prefix
  • Information unrecoverable: Compressed file contents and tool results are permanently lost
  • Cost increases: Fixed section must be reprocessed after cache invalidation

How to avoid:

  • Keep context < 85% (start controlling at yellow warning)
  • Proactive /clear is better than forced compression
  • Avoid reading too many large files at once
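Taking the 95% trigger and the 85% warning threshold described above at face value, the remaining headroom is smaller than it looks. A minimal sketch:

```python
# Headroom between the yellow warning and forced compression,
# assuming the 95% trigger and 85% threshold described above.
WINDOW = 200_000
AUTO_COMPACT = 0.95  # hardcoded trigger, per the text
YELLOW = 0.85        # the "start controlling" threshold

trigger_at = round(WINDOW * AUTO_COMPACT)
yellow_at = round(WINDOW * YELLOW)
headroom = trigger_at - yellow_at

print(f"Compression triggers at {trigger_at:,} tokens")
print(f"From the 85% warning you have only ~{headroom:,} tokens left")
```

Twenty thousand tokens is only two or three large file reads, which is why the text recommends acting at the yellow warning rather than waiting for the red zone.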

5.7 How HUD Helps You Manage Context

Context β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘ 78%  ← Yellow warning, be careful
  • < 60% Green: Work freely
  • 60-85% Yellow: Avoid large file reads and long debugging loops
  • > 85% Red: Consider /clear to reset
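The traffic-light logic above is simple enough to state as code. A minimal sketch, assuming the 60% and 85% thresholds listed above:

```python
# Sketch of the HUD's traffic-light zones (thresholds from the list above).
def context_zone(used: int, window: int = 200_000) -> str:
    pct = used / window
    if pct < 0.60:
        return "green"   # work freely
    if pct <= 0.85:
        return "yellow"  # avoid large reads and long debug loops
    return "red"         # consider /clear

print(context_zone(36_000))   # green (18%)
print(context_zone(156_000))  # yellow (78%)
print(context_zone(176_000))  # red (88%)
```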

5.8 Practical Context-Saving Tips

  1. Regular /clear: Clear mid-session to reset the dynamic section
  2. Avoid debugging loops: /clear and restart after 3-4 rounds with no progress
  3. Use Grep instead of Read: Only see relevant lines, don't read entire files
  4. Slim down CLAUDE.md: Use /caveman:compress
  5. Read specific line ranges: pass an offset and limit to Read instead of reading the whole file

Next Episode: Episode 6 dives deep into the token system β€” detailed breakdown of input/output/cache tokens with real-world cost calculations.