Chapter 28 | Debugging and Troubleshooting

Updated on 5/12/2026

Chapter 28: Debugging and Troubleshooting

Learning Objectives

Know where to look and how to fix issues, independently resolving 90% of common problems.

Three Core Troubleshooting Tools

flowchart TB
    Issue["Encountered Issue"] --> T1["Tool 1
openspec status
(Check artifact status)"] Issue --> T2["Tool 2
git diff / git log
(See what changed)"] Issue --> T3["Tool 3
review/N.md
test-reports/N.md
STUCK.md
(Who said what)"] style Issue fill:#fff9c4

Core Principle: File Status Trumps Agent Self-Report

Agent says: "Group 3 complete"
You: Verify!
  → cat openspec/changes/<active>/tasks.md | grep "## 3\." -A 10
  → Are all items truly checked?
  → Is review/3.md truly APPROVED?
  → Is test-reports/3.md truly PASS?

If file status is inconsistent, assume the agent has not completed the task.

Diagnosis Flow for 5 Common Symptoms

Symptom 1: Agent Claims Completion, But Status is Incorrect

Phenomenon: stdout says "DONE"
       But tasks.md is not checked, or no commit was made
Reason: Agent might have modified memory but forgot to commit; or Edit failed but it didn't realize
Countermeasure: Have the main Claude re-read the files and decide the next step based on file status

Symptom 2: Infinite Loop of Dispatches

Phenomenon: Dispatches continue more than 5 times
Reason: Threshold not effective / Status reading inaccurate
Countermeasures:
  1. Use /dev status to check the actual state
  2. Check the **Reviewer Round** field in review/N.md
  3. If necessary, manually add STUCK.md to force a stop

Symptom 3: Reviewer Repeatedly Rejects the Same Spot

Phenomenon: Rounds 1, 2, 3 all reject the same line of code
Reason: 
  - Sonnet developer doesn't understand reviewer's intent
  - Or reviewer is nitpicking
Countermeasures:
  - Threshold 3 automatically escalates to developer-deep
  - Deep will "question assumptions" to find the real problem
  - Still fails → architect diagnosis

Symptom 4: Tests Pass, But Code is Incorrect

Phenomenon: Tester PASSes, but user finds a bug when running
Reason: Tester wrote "self-proving tests" – only tested the happy path
Countermeasures:
  - Reviewer should check if tests target the Scenario
  - Add constraint to tester.md: "Do not pass for the sake of passing"
  - Severe cases escalate to tester-deep

Symptom 5: Hook Blocks Legitimate Operations

Phenomenon: A perfectly legitimate command is BLOCKED
Reason: guard-bash misjudgment
Countermeasures:
  1. Test hook in isolation: echo '{"tool_name":"Bash",...}' | hook
  2. Check stderr to see which rule was triggered
  3. Modify hook to add an exception, or broaden the match

Debugging Decision Tree

flowchart TD
    P["Problem"] --> Q1{What type?}
    Q1 -->|Status inconsistent| Read["Re-read files
Trust file status"] Q1 -->|Infinite loop| Check["openspec status + Threshold check"] Q1 -->|Incorrect code| Diff["git diff to see specific changes"] Q1 -->|Incorrect tests| Trans["Check Scenario → test translation
Does it meet contract?"] Q1 -->|Hook stuck| Trace["echo JSON to reproduce"] Q1 -->|Unknown| Reset["/dev status
Review from scratch"] style P fill:#ffcdd2 style Read fill:#c8e6c9 style Check fill:#c8e6c9 style Diff fill:#c8e6c9 style Trans fill:#c8e6c9 style Trace fill:#c8e6c9 style Reset fill:#fff9c4

Reset Strategy (Extreme Cases)

If the entire state machine is messed up:

# 1. Backup current state
cp -r openspec/changes/<active> /tmp/backup-active

# 2. Clear state
rm -rf openspec/changes/<active>/review/
rm -rf openspec/changes/<active>/test-reports/
rm openspec/changes/<active>/STUCK.md 2>/dev/null

# 3. Change [x] back to [ ] in tasks.md (only for groups to be redone)
# Manual edit

# 4. Rerun
/dev <N>

This is the nuclear option. Do not use it routinely.

Things You Should Never Do

❌ Directly modify review/N.md to make it APPROVED
   → False positive, bypasses quality gate
   
❌ Delete a task from tasks.md to make it "complete"
   → The spec still requires that capability, which is self-deception

❌ Disable the hook to let the agent run
   → Disables the safety net, will eventually lead to failure

❌ Forget to re-read CLAUDE.md after /clear
   → Main Claude won't know project rules

Anti-Patterns

❌ Trust agent stdout without checking files
❌ Rely on human intervention to fix inconsistent states
❌ Frequently git reset instead of fixing the underlying problem
❌ Fix bugs without updating the spec → will repeat the same mistakes next time

What You Can Do Now

  • Use the three core tools to pinpoint issues
  • Differentiate between "truly complete" and "falsely complete"
  • Know when to use the nuclear option

The next chapter discusses sandboxing – when to use Docker.