Issue 08 | /caveman:compress: Compress your CLAUDE.md, saving 46% of input tokens each time

⏱ Est. reading time: 13 min Updated on 5/7/2026

🎯 Learning Objectives

After this issue, you will master:

  1. The difference between Input Tokens and Output Tokens, and why compressing input is equally important
  2. The usage of /caveman:compress and its file processing mechanism
  3. Compression safety rules: what content gets compressed and what doesn't
  4. Best practices for using compress on different platforms

📖 Core Content

8.1 The Overlooked Cost: Input Tokens

Previous issues focused on compressing Agent output. There is another, less visible source of Token consumption: input.

Every time you start a Claude Code session, the Agent automatically reads:

  • CLAUDE.md (project-level rules)
  • ~/.claude/CLAUDE.md (user-level rules)
  • Various Skill files
  • MCP configuration

graph TD
    A["Session Start"] --> B["Read CLAUDE.md<br/>~1200 tokens"]
    A --> C["Read ~/.claude/CLAUDE.md<br/>~800 tokens"]
    A --> D["Read Skill Files<br/>~500 tokens"]
    A --> E["Read MCP Configuration<br/>~300 tokens"]
    B --> F["Total Input: ~2800 tokens<br/>Paid per session"]
    C --> F
    D --> F
    E --> F
    F --> G["20 sessions/day<br/>= 56,000 input tokens/day<br/>= Context loading alone costs ~$0.84/day"]

These files rarely change throughout your project's lifecycle, but they are re-read and paid for every time a session starts. This is the problem caveman-compress aims to solve.
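The arithmetic in the diagram above can be reproduced with a short sketch. The per-file token counts and the 20 sessions/day come from the diagram; the price of $15 per million input tokens is an assumption inferred from the ~$0.84/day figure, not a quoted rate, so substitute the rate for your own model.

```python
# Token counts per context source, taken from the diagram above.
CONTEXT_TOKENS = {
    "CLAUDE.md": 1200,
    "~/.claude/CLAUDE.md": 800,
    "skill files": 500,
    "MCP configuration": 300,
}

# Assumption: $15 per million input tokens, inferred from the ~$0.84/day figure.
ASSUMED_PRICE_PER_MILLION_USD = 15.0


def daily_context_cost(tokens_per_session: int, sessions_per_day: int,
                       price_per_million_usd: float) -> tuple[int, float]:
    """Return (input tokens per day, USD per day) spent on context loading."""
    daily_tokens = tokens_per_session * sessions_per_day
    daily_cost = daily_tokens / 1_000_000 * price_per_million_usd
    return daily_tokens, daily_cost


per_session = sum(CONTEXT_TOKENS.values())
daily_tokens, daily_cost = daily_context_cost(
    per_session, 20, ASSUMED_PRICE_PER_MILLION_USD
)
print(f"{per_session} tokens/session, {daily_tokens} tokens/day, ${daily_cost:.2f}/day")
```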

8.2 How Compress Works

graph LR
    A["CLAUDE.md<br/>1200 tokens<br/>Human-readable version"] -->|"/caveman:compress CLAUDE.md"| B["Processing Flow"]
    B --> C["CLAUDE.md<br/>648 tokens<br/>Compressed version (Claude reads)"]
    B --> D["CLAUDE.original.md<br/>1200 tokens<br/>Backup (Human reads)"]
    E["Every session start"] -->|"Reads"| C
    F["Developer maintains"] -->|"Edits"| D
    style C fill:#90EE90,stroke:#2E8B57
    style D fill:#87CEEB,stroke:#4169E1

Core Design:

  1. The original file is backed up as .original.md (you continue editing this)
  2. The original file is overwritten with the compressed version (Claude reads this, fewer Tokens)
  3. Technical content remains untouched, only prose descriptions are Caveman-ified
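The backup-then-overwrite flow in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not Caveman's actual implementation: `compress_file` and `backup_path_for` are hypothetical names, and `compress_prose` is a stand-in for whatever performs the real compression. The sketch covers a first run only; how the real tool treats an existing backup is not modeled here.

```python
import tempfile
from pathlib import Path
from typing import Callable


def backup_path_for(path: Path) -> Path:
    """Map CLAUDE.md to CLAUDE.original.md (the naming described above)."""
    return path.with_name(f"{path.stem}.original{path.suffix}")


def compress_file(path: Path, compress_prose: Callable[[str], str]) -> Path:
    """Back up the readable text, then overwrite `path` with the compressed text.

    `compress_prose` is a hypothetical stand-in for the real compression step.
    """
    original_text = path.read_text(encoding="utf-8")
    backup = backup_path_for(path)
    backup.write_text(original_text, encoding="utf-8")                # step 1: backup
    path.write_text(compress_prose(original_text), encoding="utf-8")  # step 2: overwrite
    return backup


# Demo in a throwaway directory, with a trivial fake compressor.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "CLAUDE.md"
    target.write_text("Please make sure to always run `npm test`.", encoding="utf-8")
    backup = compress_file(target, lambda text: "Run `npm test`.")
    backup_name = backup.name
    backup_text = backup.read_text(encoding="utf-8")
    compressed_text = target.read_text(encoding="utf-8")
```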

8.3 Usage

Basic Usage

# In Claude Code
> /caveman:compress CLAUDE.md

# Output:
# ✅ Compressed CLAUDE.md (1200 → 648 tokens, saved 46%)
# 📄 Original saved as CLAUDE.original.md

Compressing Other Files

# Compress user-level preferences
> /caveman:compress claude-md-preferences.md

# Compress project notes
> /caveman:compress project-notes.md

# Compress todo list
> /caveman:compress todo-list.md

# Compress any Markdown context file
> /caveman:compress any-context-file.md

On Other Platforms

| Platform | Usage |
|---|---|
| Claude Code | /caveman:compress <filepath> |
| Antigravity | "Please use caveman compress to compress the GEMINI.md file" |
| Gemini CLI | /caveman:compress GEMINI.md or natural language |
| Codex | $caveman-compress AGENTS.md |
| OpenCode | "Please compress the AGENTS.md file, keeping code and paths, only compressing prose" |

8.4 Safety Rules: What Won't Be Compressed

This is the most important design principle of compress: technical content is never modified.

graph TB
    subgraph Safe["✅ Won't Be Compressed (Kept As Is)"]
        A1["Code Blocks<br/>```python...```"]
        A2["URL Links<br/>https://..."]
        A3["File Paths<br/>src/utils/auth.ts"]
        A4["Command Lines<br/>npm install / git commit"]
        A5["Headings<br/># ## ###"]
        A6["Dates / Version Numbers<br/>v2.3.0 / 2026-04-22"]
    end
    subgraph Compressed["🔧 Will Be Compressed (Caveman-ified)"]
        B1["Prose Descriptions<br/>'This project is a...'"]
        B2["Explanatory Paragraphs<br/>'The reason we chose...'"]
        B3["Redundant Qualifiers<br/>'Please make sure to always...'"]
    end
    style Safe fill:#E8F5E9,stroke:#4CAF50
    style Compressed fill:#FFF3E0,stroke:#FF9800
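One way to picture the rule is as a line classifier that marks fenced code and headings as protected and leaves only prose eligible for compression. This is an illustrative sketch, not Caveman's implementation: `classify_lines` is a hypothetical name, and inline items such as URLs, paths, and version numbers are not handled here.

```python
FENCE = "`" * 3  # a Markdown code fence marker


def classify_lines(markdown: str) -> list[tuple[str, str]]:
    """Label each line 'protected' (fenced code, headings) or 'prose'."""
    labeled = []
    in_fence = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence              # the fence lines are protected too
            labeled.append(("protected", line))
        elif in_fence or stripped.startswith("#"):
            labeled.append(("protected", line))  # code body or heading
        else:
            labeled.append(("prose", line))      # eligible for compression
    return labeled


sample = "\n".join([
    "# Setup",
    "Please make sure to always install dependencies first.",
    FENCE + "bash",
    "npm install",
    FENCE,
])
labels = [kind for kind, _ in classify_lines(sample)]
print(labels)
```

Only the second line of the sample is labeled as prose; the heading, both fence markers, and the command inside the fence stay protected.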

Before and After Compression Comparison

Before Compression (CLAUDE.md):

# Project Architecture

This project is built using a modern microservices architecture 
with TypeScript as the primary language. We chose this approach 
because it provides better scalability and allows independent 
deployment of each service.

## Important Rules

Please make sure to always follow these guidelines when making 
changes to the codebase:

1. Always run `npm test` before committing
2. Use the `src/utils/logger.ts` module for all logging
3. Database queries must go through `src/db/repository.ts`

After Compression (CLAUDE.md → Claude reads this):

# Project Architecture

Microservices, TypeScript primary. Independent deploy per service.

## Important Rules

1. Run `npm test` before commit
2. Log via `src/utils/logger.ts`
3. DB queries through `src/db/repository.ts`

Backup (CLAUDE.original.md → You edit this):

Original full content remains unchanged.
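A quick way to check the safety rule on the before/after example above is to compare inline code spans in both versions. The helper below is a hypothetical sketch, not part of Caveman; it only looks at backtick-delimited spans and ignores fenced blocks and bare URLs.

```python
import re

# Matches inline code such as `npm test` (single backticks, one line).
INLINE_CODE = re.compile(r"`([^`\n]+)`")


def missing_code_spans(before: str, after: str) -> set[str]:
    """Return inline code spans present in `before` but absent from `after`."""
    return set(INLINE_CODE.findall(before)) - set(INLINE_CODE.findall(after))


before = """1. Always run `npm test` before committing
2. Use the `src/utils/logger.ts` module for all logging
3. Database queries must go through `src/db/repository.ts`"""

after = """1. Run `npm test` before commit
2. Log via `src/utils/logger.ts`
3. DB queries through `src/db/repository.ts`"""

lost = missing_code_spans(before, after)
print(lost or "all inline code spans preserved")
```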

8.5 Daily Maintenance Workflow

graph TD
    A["Need to update project rules"] --> B["Edit CLAUDE.original.md"]
    B --> C["Re-compress"]
    C --> D["/caveman:compress CLAUDE.md"]
    D --> E["CLAUDE.md updated to new compressed version"]
    D --> F["CLAUDE.original.md updated to new backup"]
    
    G["⚠️ Do not edit CLAUDE.md directly!"] -->|"Will be overwritten by next compression"| B
    
    style G fill:#FFEBEE,stroke:#F44336

⚠️ Important Reminder:

  • Edit → Modify CLAUDE.original.md
  • Compress β†’ Run /caveman:compress CLAUDE.md
  • Do not directly edit the compressed CLAUDE.md; it will be overwritten during the next compression
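To catch a forgotten re-compression, a small check can compare modification times of the two files. This is a hypothetical helper, not a Caveman feature; it assumes that the backup being newer than the compressed file means edits have not yet been compressed.

```python
import os
import tempfile
from pathlib import Path


def needs_recompress(compressed: Path, original: Path) -> bool:
    """True when the readable backup was modified after the compressed file."""
    return original.stat().st_mtime > compressed.stat().st_mtime


# Demo with explicit timestamps so the result does not depend on timing.
with tempfile.TemporaryDirectory() as tmp:
    compressed = Path(tmp) / "CLAUDE.md"
    original = Path(tmp) / "CLAUDE.original.md"
    compressed.write_text("Run tests.", encoding="utf-8")
    original.write_text("Please always run tests.", encoding="utf-8")
    os.utime(compressed, (1_000, 1_000))  # compressed file written first
    os.utime(original, (2_000, 2_000))    # backup edited afterwards
    stale = needs_recompress(compressed, original)

print("re-run compression" if stale else "up to date")
```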

8.6 Context File Mapping for Various Platforms

| Platform | Main Context File | Compression Command | Backup File |
|---|---|---|---|
| Claude Code | CLAUDE.md | /caveman:compress CLAUDE.md | CLAUDE.original.md |
| Antigravity | GEMINI.md | Natural language request for compression | GEMINI.original.md |
| Gemini CLI | GEMINI.md | /caveman:compress GEMINI.md | GEMINI.original.md |
| Codex | AGENTS.md | $caveman-compress AGENTS.md | AGENTS.original.md |
| OpenCode | .config/opencode/AGENTS.md | Natural language request for compression | Manual backup |

💻 Hands-on Practice

Exercise: Compress your project CLAUDE.md

  1. Check the current Token count of CLAUDE.md (rough estimate: 1 English word ≈ 1.3 tokens, 1 Chinese character ≈ 2 tokens)
  2. Run /caveman:compress CLAUDE.md
  3. Compare Token counts before and after compression
  4. Confirm that CLAUDE.original.md correctly saved the original text
  5. Start a new session and confirm that the Agent can correctly read the compressed version
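Steps 1 and 3 can be approximated with a small script that applies the rough heuristic from step 1. This is only an estimate under that heuristic, not a real tokenizer; `estimate_tokens` and `percent_saved` are hypothetical helper names, and the word pattern is a simplification.

```python
import re

CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")      # common Chinese characters
ENGLISH_WORD = re.compile(r"[A-Za-z0-9']+")    # simplified notion of a "word"


def estimate_tokens(text: str) -> int:
    """Rough estimate: 1.3 tokens per English word, 2 per Chinese character."""
    english_words = len(ENGLISH_WORD.findall(text))
    chinese_chars = len(CJK_CHAR.findall(text))
    return round(english_words * 1.3 + chinese_chars * 2)


def percent_saved(before_tokens: int, after_tokens: int) -> float:
    """Percentage of tokens removed by compression."""
    return (before_tokens - after_tokens) / before_tokens * 100


ten_words = "one two three four five six seven eight nine ten"
print(estimate_tokens(ten_words))           # estimate for ten English words
print(f"{percent_saved(1200, 648):.0f}%")   # the figures from section 8.3
```

To use it on real files, read CLAUDE.original.md and CLAUDE.md with `pathlib.Path.read_text` and pass both estimates to `percent_saved`.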

📊 Compression Effectiveness Statistics

According to Caveman's official benchmarks, average compression rates vary by file type:

| File Type | Average Compression Rate | Description |
|---|---|---|
| Pure Prose Descriptions | ~55-60% | Highest compression potential |
| Rules + Code Mixed | ~40-46% | Most common scenario |
| Code-centric Files | ~15-20% | Code not compressed, only comments |
| Pure Code Files | ~5% | Almost no room for compression |
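The table can be turned into a rough projection for a given file. The sketch below assumes "compression rate" means the fraction of tokens removed, which matches the 1200 → 648 example (46% saved); `projected_tokens` is a hypothetical helper and the ranges are copied from the table.

```python
# Fraction of tokens removed per file type, as (low, high), from the table above.
RATE_RANGES = {
    "pure prose": (0.55, 0.60),
    "rules + code mixed": (0.40, 0.46),
    "code-centric": (0.15, 0.20),
    "pure code": (0.05, 0.05),
}


def projected_tokens(original_tokens: int, file_type: str) -> tuple[int, int]:
    """Return (best case, worst case) token counts after compression."""
    low_rate, high_rate = RATE_RANGES[file_type]
    best = round(original_tokens * (1 - high_rate))   # most tokens removed
    worst = round(original_tokens * (1 - low_rate))   # fewest tokens removed
    return best, worst


best, worst = projected_tokens(1200, "rules + code mixed")
print(f"expect between {best} and {worst} tokens after compression")
```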

💡 Best Practice: Write the "what" and "why" in CLAUDE.md in prose paragraphs (these will be compressed), and the "how-to" in code blocks and commands (these remain as is).


πŸ“ Key Takeaways from This Issue

  1. Input Tokens are a hidden cost of every session; CLAUDE.md is paid for every time a session starts
  2. /caveman:compress compresses prose into Caveman style, saving an average of 46% of input Tokens
  3. Safety Guarantee: Code blocks, URLs, file paths, command lines, headings, version numbers are never compressed
  4. After compression, edit *.original.md; do not directly edit the compressed version
  5. Each platform has corresponding context files (CLAUDE.md / GEMINI.md / AGENTS.md) that can be compressed
