Chapter 28: Debugging and Troubleshooting
Learning Objectives
Know where to look and how to fix issues, so that you can resolve roughly 90% of common problems on your own.
Three Core Troubleshooting Tools
flowchart TB
Issue["Encountered Issue"] --> T1["Tool 1<br/>openspec status<br/>(Check artifact status)"]
Issue --> T2["Tool 2<br/>git diff / git log<br/>(See what changed)"]
Issue --> T3["Tool 3<br/>review/N.md<br/>test-reports/N.md<br/>STUCK.md<br/>(Who said what)"]
style Issue fill:#fff9c4
Core Principle: File Status Trumps Agent Self-Report
Agent says: "Group 3 complete"
You: Verify!
→ grep -A 10 "## 3\." openspec/changes/<active>/tasks.md
→ Are all items truly checked?
→ Is review/3.md truly APPROVED?
→ Is test-reports/3.md truly PASS?
If file status is inconsistent, assume the agent has not completed the task.
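The three checks above can be scripted so that they are run the same way every time. The following is a minimal sketch, not part of the workflow itself: `verify_group` is a hypothetical helper name, and it assumes that tasks.md uses `## N.` group headings with `- [ ]` / `- [x]` items, that an approved review contains the word APPROVED, and that a passing test report contains the word PASS.

```shell
# Sketch: decide whether group N is really complete by reading files only.
# Assumed formats (not confirmed by this chapter):
#   tasks.md           "## N. ..." headings with "- [ ]" / "- [x]" items
#   review/N.md        contains APPROVED when the review passed
#   test-reports/N.md  contains PASS when the tests passed
verify_group() {
  change_dir=$1   # e.g. openspec/changes/<active>
  group=$2        # group number, e.g. 3
  rc=0

  if [ ! -f "$change_dir/tasks.md" ]; then
    echo "tasks.md: missing"
    rc=1
  elif awk -v g="$group" '
         # Track whether the current line belongs to the requested group.
         /^## / { in_group = (index($0, "## " g ".") == 1) }
         in_group && /^[ \t]*- \[ \]/ { found = 1 }
         END { exit (found ? 0 : 1) }
       ' "$change_dir/tasks.md"; then
    echo "tasks.md: group $group still has unchecked items"
    rc=1
  fi

  if ! grep -q 'APPROVED' "$change_dir/review/$group.md" 2>/dev/null; then
    echo "review/$group.md: not APPROVED (or file missing)"
    rc=1
  fi

  if ! grep -q 'PASS' "$change_dir/test-reports/$group.md" 2>/dev/null; then
    echo "test-reports/$group.md: not PASS (or file missing)"
    rc=1
  fi

  return "$rc"
}
```

A call such as `verify_group openspec/changes/<active> 3` returns non-zero and prints one line per failed check. The plain-word matching is deliberately crude; if your review files can contain phrases like "NOT APPROVED", tighten the patterns to match your actual verdict line.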
Diagnosis Flow for 5 Common Symptoms
Symptom 1: Agent Claims Completion, But Status is Incorrect
Phenomenon: stdout says "DONE", but the items in tasks.md are not checked, or no commit was made
Reason: The agent may have made the change only in its own context and never committed it, or an Edit call failed without the agent noticing
Countermeasure: Have the main Claude re-read the files and decide the next step based on file status, not on the agent's report
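One quick way to confirm the "no commit was made" case is to ask git directly. This is a small sketch using only standard git commands; the function name `check_commit_state` is illustrative and not part of the workflow.

```shell
# Sketch: report whether the working tree holds changes that were never committed.
check_commit_state() {
  repo=${1:-.}
  # --porcelain prints one line per modified or untracked path, nothing if clean.
  dirty=$(git -C "$repo" status --porcelain) || return 2
  if [ -n "$dirty" ]; then
    echo "Uncommitted changes (the agent may have skipped the commit):"
    echo "$dirty"
    return 1
  fi
  echo "Working tree is clean."
  return 0
}
```

If the tree is clean but tasks.md is still unchecked, the edit most likely never reached disk, which points at a failed Edit rather than a forgotten commit.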
Symptom 2: Infinite Loop of Dispatches
Phenomenon: Dispatching continues for more than 5 rounds
Reason: The threshold is not taking effect, or the status is being read incorrectly
Countermeasures:
1. Use /dev status to check the actual state
2. Check the **Reviewer Round** field in review/N.md
3. If necessary, manually add STUCK.md to force a stop
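Steps 2 and 3 can be combined into a small guard. The sketch below is hypothetical: it assumes the field appears in review/N.md as a line like `**Reviewer Round**: 3`, it uses the limit of 5 mentioned above as a default, and the text it writes into STUCK.md is a placeholder, since this chapter does not define that file's format.

```shell
# Sketch: read the reviewer round and force a stop when it exceeds a limit.
# Assumed line format in review/N.md:  **Reviewer Round**: 3
reviewer_round() {
  review_file=$1
  # Print the first number that follows the "Reviewer Round" label.
  sed -n 's/.*Reviewer Round[^0-9]*\([0-9][0-9]*\).*/\1/p' "$review_file" | head -n 1
}

stop_if_over_threshold() {
  change_dir=$1   # e.g. openspec/changes/<active>
  group=$2        # group number
  limit=${3:-5}   # assumed default, taken from the "more than 5" symptom
  round=$(reviewer_round "$change_dir/review/$group.md" 2>/dev/null)
  round=${round:-0}
  if [ "$round" -gt "$limit" ]; then
    # Placeholder content; adapt it to whatever your STUCK.md convention is.
    printf 'Stopped manually: group %s reached reviewer round %s (limit %s)\n' \
      "$group" "$round" "$limit" > "$change_dir/STUCK.md"
    echo "STUCK.md written"
    return 1
  fi
  echo "round $round is within limit $limit"
}
```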
Symptom 3: Reviewer Repeatedly Rejects the Same Spot
Phenomenon: Rounds 1, 2, 3 all reject the same line of code
Reason:
- The Sonnet developer doesn't understand the reviewer's intent
- Or the reviewer is nitpicking
Countermeasures:
- Reaching the threshold of 3 automatically escalates to developer-deep
- developer-deep will "question assumptions" to find the real problem
- If that still fails → architect diagnosis
Symptom 4: Tests Pass, But Code is Incorrect
Phenomenon: The tester reports PASS, but the user finds a bug when running the code
Reason: The tester wrote "self-proving tests" that only cover the happy path
Countermeasures:
- The reviewer should check whether the tests actually target the Scenario
- Add a constraint to tester.md: "Do not pass for the sake of passing"
- Escalate severe cases to tester-deep
Symptom 5: Hook Blocks Legitimate Operations
Phenomenon: A perfectly legitimate command is BLOCKED
Reason: guard-bash produced a false positive
Countermeasures:
1. Test the hook in isolation: echo '{"tool_name":"Bash",...}' | hook
2. Check stderr to see which rule was triggered
3. Modify the hook to add an exception, or relax the rule's match pattern
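Steps 1 and 2 can be wrapped in a reusable probe. This is a sketch under stated assumptions: the hook reads the tool call as JSON on stdin, writes its reason to stderr, and signals a block with a non-zero exit code. `probe_hook` is a hypothetical helper, and the payload you pass is whatever your hook expects; the one in the usage note is illustrative only.

```shell
# Sketch: send one JSON payload to a hook command and show how it reacted.
# Usage: probe_hook '<json payload>' <hook command and arguments...>
probe_hook() {
  payload=$1
  shift
  err_file=$(mktemp)
  # Feed the payload on stdin; capture stderr separately from stdout.
  printf '%s' "$payload" | "$@" 2>"$err_file"
  code=$?
  echo "exit code: $code"
  echo "stderr: $(cat "$err_file")"
  rm -f "$err_file"
  return "$code"
}
```

For example, `probe_hook '{"tool_name":"Bash","tool_input":{"command":"ls"}}' sh path/to/your-hook.sh` shows the exit code and the rule message side by side. The `tool_input.command` field is an assumption about the payload shape; check your hook's own parsing code for the fields it actually reads.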
Debugging Decision Tree
flowchart TD
P["Problem"] --> Q1{What type?}
Q1 -->|Status inconsistent| Read["Re-read files<br/>Trust file status"]
Q1 -->|Infinite loop| Check["openspec status + Threshold check"]
Q1 -->|Incorrect code| Diff["git diff to see specific changes"]
Q1 -->|Incorrect tests| Trans["Check Scenario → test translation<br/>Does it meet contract?"]
Q1 -->|Hook stuck| Trace["echo JSON to reproduce"]
Q1 -->|Unknown| Reset["/dev status<br/>Review from scratch"]
style P fill:#ffcdd2
style Read fill:#c8e6c9
style Check fill:#c8e6c9
style Diff fill:#c8e6c9
style Trans fill:#c8e6c9
style Trace fill:#c8e6c9
style Reset fill:#fff9c4
Reset Strategy (Extreme Cases)
If the entire state machine is messed up:
# 1. Back up the current state
cp -r openspec/changes/<active> /tmp/backup-active
# 2. Clear the review and test state
rm -rf openspec/changes/<active>/review/
rm -rf openspec/changes/<active>/test-reports/
rm -f openspec/changes/<active>/STUCK.md
# 3. Change [x] back to [ ] in tasks.md (only for the groups to be redone)
#    This is a manual edit.
# 4. Rerun the group (a slash command, not a shell command)
/dev <N>
→ This is the nuclear option. Do not use it routinely.
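Step 3 of the reset is a manual edit. If you would rather script it, the following sketch unchecks the items of a single group and leaves every other group untouched. `uncheck_group` is a hypothetical helper, and it assumes tasks.md uses `## N.` group headings with `- [x]` items; take the backup from step 1 before running anything like it.

```shell
# Sketch for step 3: turn "[x]" back into "[ ]" for one group only.
# Assumed tasks.md layout: "## N. ..." headings followed by "- [x]" items.
uncheck_group() {
  tasks_file=$1
  group=$2
  tmp=$(mktemp)
  awk -v g="$group" '
    # A new "## " heading starts or ends the requested group.
    /^## / { in_group = (index($0, "## " g ".") == 1) }
    in_group { sub(/- \[[xX]\]/, "- [ ]") }
    { print }
  ' "$tasks_file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp" > "$tasks_file"
  rm -f "$tmp"
}
```

Calling `uncheck_group openspec/changes/<active>/tasks.md 3` resets group 3 only. Review the result with git diff before rerunning the group.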
Things You Should Never Do
❌ Directly modify review/N.md to make it APPROVED
→ False positive, bypasses quality gate
❌ Delete a task from tasks.md to make it "complete"
→ The spec still requires that capability, so this is self-deception
❌ Disable the hook to let the agent run
→ Removes the safety net and will eventually lead to a failure
❌ Forget to re-read CLAUDE.md after /clear
→ Main Claude won't know project rules
Anti-Patterns
❌ Trust agent stdout without checking files
❌ Rely on human intervention to fix inconsistent states
❌ Frequently run git reset instead of fixing the underlying problem
❌ Fix bugs without updating the spec → will repeat the same mistakes next time
What You Can Do Now
- Use the three core tools to pinpoint issues
- Differentiate between "truly complete" and "falsely complete"
- Know when to use the nuclear option
The next chapter discusses sandboxing – when to use Docker.