Chapter 29: Sandbox Selection

Updated on 5/12/2026

Learning Objectives

Understand when to use Docker, when not to, and how to compromise based on your project's actual constraints.

Three Tiers of Sandbox Comparison

flowchart TB
    L1["L1: git worktree + venv<br/>(Native)"] --> L1Pros["✓ Native speed<br/>✓ File isolation<br/>✓ No global Python pollution"]
    L1 --> L1Cons["✗ No protection against accidental `rm`<br/>✗ No protection against malicious `curl|sh`<br/>✗ Poor cross-machine reproducibility"]

    L2["L2: Docker bind-mount<br/>(Container)"] --> L2Pros["✓ Process isolation<br/>✓ Prevents global pollution<br/>✓ Cross-machine reproducibility"]
    L2 --> L2Cons["✗ E2E must run outside container<br/>✗ Slower on macOS<br/>✗ Configuration overhead"]

    L3["L3: Full VM<br/>(Complete Isolation)"] --> L3Pros["✓ Complete isolation<br/>✓ Highest security"]
    L3 --> L3Cons["✗ Heavyweight<br/>✗ Difficult to run macOS desktop apps"]

    style L1 fill:#c5e1a5
    style L2 fill:#fff9c4
    style L3 fill:#ffcdd2

Decision Tree

flowchart TD
    Q1{"Requires macOS GUI<br/>(Screen recording/Chrome)?"}
    Q1 -->|Yes| MacOS["E2E must be on host<br/>L3 not suitable"]
    Q1 -->|No| AnyOS["Any OS is fine"]

    MacOS --> Q2{"Security sensitive?"}
    Q2 -->|Yes| Mix["L1 + L2 Hybrid<br/>(Dev L2, E2E host)"]
    Q2 -->|No| L1Pick["L1 is sufficient to start"]

    AnyOS --> Q3{"Multi-person collaboration?"}
    Q3 -->|Yes| L2Pick["L2 (Reproducibility)"]
    Q3 -->|No| L1Pick

    style L1Pick fill:#c5e1a5
    style L2Pick fill:#fff9c4
    style Mix fill:#fff9c4
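
The decision tree can also be written as a small shell function, which makes the branches easy to check. This is only a sketch: the function name `choose_sandbox` and the yes/no argument convention are illustrative assumptions, not part of any tool.

```shell
# choose_sandbox NEEDS_MACOS_GUI SECURITY_SENSITIVE MULTI_PERSON
# Hypothetical helper: each argument is "yes" or "no"; prints the suggested tier.
choose_sandbox() {
  needs_gui=$1
  security_sensitive=$2
  multi_person=$3

  if [ "$needs_gui" = "yes" ]; then
    # E2E must stay on the host, so L3 is ruled out on this branch.
    if [ "$security_sensitive" = "yes" ]; then
      echo "L1+L2 hybrid"
    else
      echo "L1"
    fi
  elif [ "$multi_person" = "yes" ]; then
    echo "L2"
  else
    echo "L1"
  fi
}

# macOS GUI required, not security sensitive, single developer
choose_sandbox yes no no   # prints: L1
```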

Special Constraints for doc2video

✗ E2E requires macOS avfoundation for screen recording
✗ E2E requires a real Chrome desktop (non-headless)
✗ E2E requires iTerm fullscreen + tmux
  
→ None of these can run in Docker (Linux container)
→ Even with Docker on Mac, avfoundation is inaccessible (containers there still run inside a Linux VM)
→ Conclusion: E2E must run on the host

Therefore, for our MVP, we will use L1, without Docker.
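
Since E2E only works on the macOS host, it can help to fail fast when the E2E step is started somewhere else. The sketch below assumes two hypothetical helpers, `e2e_allowed` and `missing_tools`; the tool names checked (`ffmpeg`, `tmux`) come from the constraints above, and everything else is illustrative.

```shell
# e2e_allowed KERNEL_NAME: succeeds only on macOS.
# `uname -s` prints "Darwin" on macOS and "Linux" inside a Linux container.
e2e_allowed() {
  [ "$1" = "Darwin" ]
}

# missing_tools TOOL...: prints each tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

if e2e_allowed "$(uname -s)"; then
  echo "macOS host: E2E can run here"
  missing_tools ffmpeg tmux   # lists anything that still needs installing
else
  echo "not a macOS host: skip E2E"
fi
```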

L1 Specific Configuration

# Git worktree
cd ~/work/openspec
git worktree add ../doc2video-worktree -b feature/group-3

# venv
cd ../doc2video-worktree
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

# Claude Code runs here
claude

→ File isolation + Python isolation, overhead ≈ 0.

When to Upgrade to L2

□ Team ≥ 3 people
□ Cross-platform development (Linux + Mac + Windows)
□ Long-term maintenance project (>6 months)
□ Genuinely concerned about agents modifying the host indiscriminately

→ If ≥ 2 items are hit, upgrade to L2.
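
The "≥ 2 items" rule is simple enough to encode directly. A minimal sketch, assuming a hypothetical helper `should_upgrade_to_l2` that takes one yes/no answer per checklist item:

```shell
# should_upgrade_to_l2 ANSWER...: one "yes" or "no" per checklist item.
# Succeeds when at least two items are hit.
should_upgrade_to_l2() {
  hits=0
  for answer in "$@"; do
    if [ "$answer" = "yes" ]; then
      hits=$((hits + 1))
    fi
  done
  [ "$hits" -ge 2 ]
}

# Order: team >= 3, cross-platform, long-term (> 6 months), worried about agent changes
if should_upgrade_to_l2 no no yes yes; then
  echo "upgrade to L2"
else
  echo "stay on L1"
fi
```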

L2 Hybrid Mode (Applicable to doc2video)

flowchart TB
    Dev["Development Phase"] -->|developer/tester/reviewer| Container["Inside Docker Container"]
    Container --> Mount["bind-mount project directory"]
    Container --> Pytest["pytest runs in container"]

    E2E["E2E Phase"] -->|e2e-tester| Host["macOS host"]
    Host --> Real["Real Chrome + iTerm + ffmpeg"]
    Host --> Record["Record real desktop"]

    style Container fill:#bbdefb
    style Host fill:#fff9c4
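
One way to keep the split honest is to route each role through a single dispatch point. The sketch below uses the role names from the diagram; the function name `phase_target` is a hypothetical helper, not part of any tool.

```shell
# phase_target ROLE: prints where that role's work should run.
phase_target() {
  case "$1" in
    developer|tester|reviewer) echo "container" ;;
    e2e-tester)                echo "host" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

for role in developer tester reviewer e2e-tester; do
  printf '%s -> %s\n' "$role" "$(phase_target "$role")"
done
```

Rejecting unknown roles instead of defaulting to the host keeps a typo from silently running outside the container.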

What Docker Cannot Save You From

✗ `.env` files can still be read inside the container (if bind-mounts are fully exposed)
✗ Container processes can still `curl` out (unless `--network=none`)
✗ Docker does not block `rm /tmp` inside the container; denying it still requires hooks
✗ Container escape vulnerabilities theoretically exist

→ Docker is one layer in "defense in depth," not a silver bullet
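
The first two gaps can be narrowed with ordinary `docker run` flags: mount only the directories the tests need (so `.env` at the project root is never exposed) and disable networking. This is a sketch under assumptions: the image name `doc2video-dev` is hypothetical and is assumed to have the project's dependencies pre-installed, and the `src/` and `tests/` layout and the `/work` mount point are illustrative.

```shell
# Assumption: a locally built image with dependencies already installed.
IMAGE=${IMAGE:-doc2video-dev}

# run_isolated CMD...: run CMD in a container with no network and narrow,
# read-only mounts. Set DRY_RUN=1 to print the docker command instead of
# executing it.
run_isolated() {
  ${DRY_RUN:+echo} docker run --rm \
    --network=none \
    -v "$PWD/src":/work/src:ro \
    -v "$PWD/tests":/work/tests:ro \
    -w /work \
    "$IMAGE" "$@"
}

# Usage:
#   DRY_RUN=1 run_isolated pytest -q    # show the command only
#   run_isolated pytest -q              # actually run the tests
```

With `--network=none`, nothing can be installed at run time, which is why the image has to carry its dependencies. The hooks and deny rules mentioned above are still needed on top of this.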

Anti-Patterns

❌ "Using Docker makes it secure"
   → Still requires configuring denies + hooks

❌ Bind-mounting the entire home directory
   → Equivalent to no sandbox

❌ Running E2E in a container
   → Fails for macOS GUI projects

❌ Adopting L3 (VM) at the MVP stage
   → Over-engineering, minimal benefit

❌ L1 with no permission restrictions
   → Equivalent to running with no protection at all
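
The home-directory anti-pattern is easy to guard against before a container is ever started. A minimal sketch, assuming a hypothetical helper `mount_is_narrow` that is called on a directory before it is passed to `-v`:

```shell
# mount_is_narrow DIR: fails if DIR is /, $HOME, or any directory that
# contains $HOME; succeeds for anything narrower (e.g. a project directory).
mount_is_narrow() {
  dir=$(cd "$1" 2>/dev/null && pwd -P) || return 1
  home=$(cd "$HOME" && pwd -P) || return 1
  case "$home/" in
    "$dir"/*) return 1 ;;   # DIR is $HOME itself or one of its ancestors
  esac
  [ "$dir" != "/" ]
}

# Usage:
#   mount_is_narrow "$PWD" || { echo "refusing to mount $PWD" >&2; exit 1; }
```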

What You Can Do Now

  • Choose the appropriate sandbox level for your project
  • Understand what Docker solves and what it doesn't
  • Design hybrid modes (Dev L2 + E2E host)