Episode 5: Context Window Deep Dive
Key Takeaway: The context window is Claude Code's lifeline. Understanding its composition is key to precise management.
5.1 What is the Context Window
Every Claude Code conversation packs all information into a fixed-size "window." Window size depends on the model:
| Model | Context Window |
|---|---|
| Claude Opus 4 / Sonnet 4 | 200K tokens |
| Claude Haiku 4 | 200K tokens |
| Sessions with the 1M context window enabled | 1,000K tokens |
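These budgets are easier to reason about in concrete terms. A rough sketch, assuming the common heuristic of ~4 characters of English text per token (real tokenizer counts vary by language and content):

```python
# Assumption: ~4 characters of English text per token.
# Real tokenizers vary with language and content.
CONTEXT_LIMITS = {"standard": 200_000, "1m": 1_000_000}

def approx_tokens(text: str) -> int:
    """Rough token estimate from character count."""
    return max(1, len(text) // 4)

def window_share(text: str, window: str = "standard") -> float:
    """Approximate percentage of the context window the text occupies."""
    return 100 * approx_tokens(text) / CONTEXT_LIMITS[window]
```

By this estimate, a 40KB source file is roughly 10K tokens, about 5% of a standard 200K window but only 1% of a 1M window.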
5.2 What's Inside the Context Window
The context window is divided into fixed sections (same every turn) and dynamic sections (grow with conversation):
```
┌──────────────── Context Window (200K tokens) ────────────────┐
│                                                              │
│ ▼ Fixed Section: same every request, constant size           │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ System Prompt      ~3,000 tokens                         │ │
│ │ CLAUDE.md          ~2,000-8,000 tokens                   │ │
│ │ Hooks injection    ~500-2,000 tokens                     │ │
│ └──────────────────────────────────────────────────────────┘ │
│                                                              │
│ ▼ Dynamic Section: grows with conversation turns             │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Conversation history (user messages + AI replies)        │ │
│ │ Tool calls + tool results (biggest growth source!)       │ │
│ │ Current user message                                     │ │
│ └──────────────────────────────────────────────────────────┘ │
│                                                              │
│ ░ Remaining space (for Claude's response)                    │
└──────────────────────────────────────────────────────────────┘
```
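The layout above can be modeled in a few lines. A minimal sketch (the numbers in the example are illustrative, not measured):

```python
from dataclasses import dataclass

@dataclass
class ContextWindow:
    limit: int = 200_000
    fixed: int = 0    # system prompt + CLAUDE.md + hook injections
    dynamic: int = 0  # conversation history + tool calls/results

    @property
    def remaining(self) -> int:
        """Space left for Claude's response."""
        return self.limit - self.fixed - self.dynamic

    @property
    def used_pct(self) -> float:
        return 100 * (self.fixed + self.dynamic) / self.limit

# Illustrative: 10K fixed + 110K dynamic leaves 80K for the response.
```

Notice that only `dynamic` grows as you work; everything you do to shrink context either trims `fixed` (slimmer CLAUDE.md) or resets `dynamic` (`/clear`).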
5.3 Fixed Section Details
The fixed section totals roughly 6K-15K tokens:
| Content | Size | Can it be reduced? |
|---|---|---|
| System Prompt | ~3K | No (built into Claude Code) |
| CLAUDE.md | 1K-8K | Yes, via `/caveman:compress` |
| caveman hook injection | ~200-500 | Yes, via `/caveman lite` |
| claude-mem SessionStart | 500-3K | Yes, by controlling observation count |
Key point: The fixed section is identical every turn, so with the prompt cache you pay its full input cost only once; subsequent turns read it from cache at a much lower rate.
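A back-of-the-envelope model shows why the cache matters. This sketch assumes cache reads bill at roughly 10% of the base input rate; that multiplier is a parameter, so check current pricing rather than trusting the default:

```python
def fixed_section_cost(fixed_tokens: int, turns: int,
                       base_rate: float, cache_read_mult: float = 0.1) -> float:
    """Approximate cost of re-sending the fixed section across a session.

    The first turn pays the full input rate; later turns hit the prompt
    cache. cache_read_mult = 0.1 is an assumption (cache reads at ~10%
    of the base input rate) -- verify against current pricing.
    """
    first = fixed_tokens * base_rate
    cached = fixed_tokens * base_rate * cache_read_mult * (turns - 1)
    return first + cached
```

With a 10K-token fixed section over 50 turns, the cached total is equivalent to ~59K full-price tokens instead of the 500K you would pay with no cache at all.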
5.4 What Operations Cause Context to Explode
| Operation | Context Increase | Reason |
|---|---|---|
| Read large file | +2K-10K | Entire file content enters context |
| Bash npm test | +1K-5K | Test output can be very long |
| Agent subagent | +500-5K | Subagent results enter main session |
| Grep search | +200-2K | Depends on match count |
| Edit file | +100-300 | Only sends diff, very small |
| Normal conversation | +100-500 | User message + AI reply |
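The table can be turned into a rough session simulator. The per-operation costs below are simply the midpoints of the ranges above, not measurements:

```python
# Midpoint token cost per operation, taken from the table above.
OP_COST = {
    "read_large_file": 6_000,
    "bash_test_run": 3_000,
    "subagent": 2_750,
    "grep": 1_100,
    "edit": 200,
    "chat_turn": 300,
}

def simulate(ops: list[str], start: int = 0) -> int:
    """Estimated dynamic-section size after a sequence of operations."""
    return start + sum(OP_COST[op] for op in ops)
```

A handful of large-file reads dwarfs dozens of edits: six reads cost an estimated 36K tokens, while thirty edits cost only 6K.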
5.5 Real-World: "Read Project" Context Changes
Before:
Context ██░░░░░░░░ 18% (36K/200K)
After reading 6 files:
Context ██████░░░░ 60% (120K/200K)
⚠ From 18% → 60%: one "read project" consumed 42% of the context
Lesson: Reading 6 files at once jumps context from 18% to 60%. Only 40% remains.
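Running the numbers from this example:

```python
window = 200_000
before, after = 36_000, 120_000   # 18% -> 60% of the 200K window
files_read = 6

growth = after - before                 # tokens added by the reads
per_file = growth // files_read         # average cost per file
pct_consumed = 100 * growth // window   # share of the whole window
```

84K tokens across six files averages ~14K per file, one large source file apiece. Grepping first, or splitting the reads across `/clear` boundaries, avoids taking the 42% hit in a single step.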
5.6 What Happens When the Window is Full β Auto Compression
Claude Code triggers auto-compression at 95% (hardcoded, not configurable).
Consequences:
- Cache fully invalidated: compression rewrites the message prefix, so cached prompt segments no longer match
- Information unrecoverable: compressed-away file contents and tool results are permanently lost
- Cost increases: the fixed section must be reprocessed at full price after the cache is invalidated
How to avoid:
- Keep context < 85% (start managing at the yellow warning)
- A proactive `/clear` is better than forced compression
- Avoid reading too many large files at once
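A quick way to think about headroom is to count the tokens remaining before the 95% trigger fires. A minimal sketch:

```python
def headroom_before_compact(used: int, window: int = 200_000,
                            trigger: float = 0.95) -> int:
    """Tokens you can still add before auto-compression fires at 95%."""
    return max(0, int(window * trigger) - used)

# At 78% usage (156K/200K) you have 34K tokens of headroom:
# roughly two or three large-file reads away from forced compaction.
```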
5.7 How HUD Helps You Manage Context
Context ████████░░ 78% → Yellow warning, be careful
- < 60% Green: Work freely
- 60-85% Yellow: Avoid large file reads and long debugging loops
- > 85% Red: Consider `/clear` to reset
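A gauge like the HUD's can be rendered in a few lines. A sketch using the zone boundaries above:

```python
def hud_bar(used: int, window: int = 200_000, width: int = 10) -> str:
    """Render a HUD-style context gauge like 'Context ██████░░░░ 60%'."""
    pct = 100 * used / window
    filled = round(width * pct / 100)
    # Zone boundaries from the list above: <60 green, 60-85 yellow, >85 red.
    zone = "Green" if pct < 60 else ("Yellow" if pct <= 85 else "Red")
    bar = "█" * filled + "░" * (width - filled)
    return f"Context {bar} {pct:.0f}% ({zone})"
```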
5.8 Practical Context-Saving Tips
- Regular `/clear`: clear mid-session to reset the dynamic section
- Avoid debugging loops: `/clear` and restart after 3-4 rounds with no progress
- Use Grep instead of Read: see only the relevant lines instead of reading entire files
- Slim down CLAUDE.md: use `/caveman:compress`
- Read specific line ranges: `Read file.ts:offset:limit` instead of the whole file
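The payoff of the last tip is easy to estimate. A sketch assuming ~10 tokens per source line (a rough figure, not a measurement):

```python
from typing import Optional

TOKENS_PER_LINE = 10  # rough assumption for typical source code

def read_cost(total_lines: int, limit: Optional[int] = None) -> int:
    """Estimated tokens consumed by reading a file, or a line range of it."""
    lines = total_lines if limit is None else min(limit, total_lines)
    return lines * TOKENS_PER_LINE
```

By this estimate, a 2,000-line file costs ~20K tokens in full, but a 100-line range costs ~1K, a 20x saving when you already know where to look.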
Next Episode: Episode 6 dives deep into the token system β detailed breakdown of input/output/cache tokens with real-world cost calculations.