Phase 3 / Ep 12: TDD—Steel Chains to Tame the Behemoth
Many developers hate writing unit tests, finding them slow and counter-intuitive. In the era of manual coding, if you didn't write tests, you might barely manage to maintain a project with tens of thousands of lines using your human memory.
However, in the era of Agent-driven development, without tests, the entire codebase will collapse within three minutes!
1. The Cost of Speed
Large Language Models (LLMs) (e.g., Gemini, Claude, GPT) can write business logic at a speed of hundreds of lines per second. But due to the inherent flaws of Token forgetting and attention drift, once it reaches the refactoring stage, if the AI modifies one place, it is highly likely to cause a cascading butterfly effect elsewhere. It won't stop due to fatigue; it will rationalize its actions over and over again until your old code structure is completely shattered.
At this point, the most effective thing to pull the AI back from the brink is ruthless assertion statements (Assertions).
2. What is TDD for Large Models?
Traditional TDD (Test-Driven Development) is: Red (write a failing test) -> Green (write code to pass the test) -> Refactor code.
TDD for Agents is to first define a cage with tests at the beginning of each task in task_plan.md.
The Golden Rule of Agent-TDD:
"As long as a piece of logic hasn't produced a red error in Vitest/Jest, you are absolutely forbidden from touching its
src/business code."
3. Downgrading "Emotional Dialogue" to "Machine Judgment"
If you don't write unit tests, how do you test your code? You'd click on your page in the browser, and then it errors. You'd need to tell the AI: "When clicking the 'add' button, the time block didn't pop up." This is an extremely emotional and inefficient way of communicating.
With a TDD test environment, once the business logic is written incorrectly, the Agent can run npx vitest run block_allocator.spec.ts through the terminal itself.
It will read the console stack error thrown: "Expected [2 hours] but received [undefined]".
Once it enters this state, humans completely exit the misery of debugging. The Agent will enter an extremely excited and efficient "self-healing loop interacting with the terminal", and only when the terminal gives a green Pass will it stop and hand over control back to you.
In the next episode, we will materialize these chains by writing our system's dedicated test-driven-development skill.