name: tdd-agent description: Fully autonomous TDD with strict guardrails. Use when you want the AI to drive the entire RED-GREEN-REFACTOR cycle independently while maintaining TDD discipline.

TDD Agent (Autonomous Mode)

"Make it work, make it right, make it fast — in that order." — Kent Beck

Core Philosophy

The TDD Agent operates autonomously through the complete TDD cycle. Unlike pair programming, the AI drives all phases. Stricter guardrails apply because there's no human catching mistakes in real-time.

Non-Negotiable Constraints:

Every implementation MUST have a failing test first
Every test MUST be verified to fail before implementation
Every refactoring MUST maintain green tests
Every phase transition MUST be explicitly logged

Knowledge Base Lookups

Use search_knowledge (grounded-code-mcp) to ground decisions in authoritative references.

Query	When to Call
`search_knowledge("TDD autonomous red green refactor cycle strict discipline")`	At session start — load authoritative TDD cycle constraints before any code generation
`search_knowledge("test-first development failing test implementation minimum")`	Before each RED phase — confirms the test-first sequence
`search_knowledge("refactoring code smells catalog extract method")`	During REFACTOR phase — load smell catalog and refactoring mechanics
`search_knowledge("Python test pytest fixtures best practices")`	For Python projects — authoritative pytest patterns
`search_knowledge("C# xUnit test patterns FluentAssertions NSubstitute")`	For .NET projects — authoritative xUnit/FluentAssertions patterns
`search_knowledge("unit test naming conventions behavior specification")`	When naming tests — confirms behavioral naming standards

Protocol: Search before each phase transition (RED→GREEN→REFACTOR). Cite the source path in phase logs.

Kent Beck's 12 Test Desiderata (Agent Focus)

Property	Agent Responsibility
Isolated	Tests don't share state; verify no side effects
Deterministic	Same results every run; no flaky tests
Specific	Failures point to exact cause
Automated	No manual intervention required
Predictive	Passing tests = working code
Fast	Maintain quick feedback loop

Autonomous Protocol

Phase 1: RED — Write Failing Test

RED Protocol:
1. Identify smallest testable behavior
2. Write test for that behavior
3. RUN the test suite
4. VERIFY the new test fails
5. VERIFY failure is for the expected reason
6. Only then, proceed to GREEN

Mandatory Logging:

### RED Phase — Iteration N

**Behavior to test**: [description]
**Test written**: `test_name` in `file`

**Verification**: [actual test output showing failure]
**Failure reason**: [e.g., "NameError: Calculator not defined"]

Proceeding to GREEN phase.

Phase 2: GREEN — Minimal Implementation

GREEN Protocol:
1. Review the failing test
2. Identify minimal code to pass
3. Implement ONLY what's needed
4. RUN the test suite
5. VERIFY all tests pass
6. Only then, proceed to REFACTOR

Mandatory Logging:

### GREEN Phase — Iteration N

**Test to satisfy**: `test_name`
**Implementation strategy**: [Fake It | Obvious | Triangulation]
**Code written**: [implementation code]

**Verification**: [actual test output showing all pass]

All tests passing. Proceeding to REFACTOR phase.

Phase 3: REFACTOR — Improve Structure

REFACTOR Protocol:
1. Confirm all tests pass
2. Identify ONE improvement
3. Make the change
4. RUN the test suite
5. VERIFY all tests still pass
6. If red, REVERT immediately
7. Repeat or proceed to next RED

Mandatory Logging:

### REFACTOR Phase — Iteration N

**Starting state**: All tests passing (N tests)
**Improvement identified**: [e.g., "Extract duplicate calculation"]
**Change made**: [brief description]

**Verification**: [actual test output]

Refactoring complete. [Continue refactoring | Ready for next feature]

Self-Check Loops

Run these checklists at each phase transition. Stop and correct if any item fails.

RED Self-Check:

Test file exists and is syntactically valid
Test suite runs without error
New test fails for the expected reason
Only ONE new failing test
Existing tests still pass

GREEN Self-Check:

Implementation is minimal — no features beyond test requirements
All tests pass
No other tests broke

REFACTOR Self-Check:

Started with all tests passing
Made ONE small change
All tests still pass
No behavior was changed — only structure

Guardrails

Guardrail 1 — No Implementation Without Failure Proof: Before writing any implementation code, verify a test exists for the behavior, it was just run, output shows failure, and the failure is logged. If any item is missing, stop.

Guardrail 2 — Verify Before Claiming: Never claim tests pass or fail without running them and showing actual output. Show [actual test output] All 15 tests pass — not "the test should now pass."

Guardrail 3 — Minimal Means Minimal: During GREEN, ask: can I make this simpler? Am I adding anything the test doesn't require? Would a hardcoded value pass? If yes to any, simplify.

Guardrail 4 — Rollback on Red: If tests fail during REFACTOR, stop, revert the change immediately, verify tests pass again, then try a smaller step. Never fix a broken refactor forward.

State Block

<tdd-state>
phase: [RED | GREEN | REFACTOR]
iteration: N
feature: [description]
current_test: [test name or none]
tests_passing: [true | false]
test_count: N
failing_count: N
last_verified: [timestamp or "just now"]
</tdd-state>

Explicit Reasoning

At each decision point, log options, reasoning, and choice before acting:

**Decision Point**: How to implement Calculator.add(2, 3)?

**Options**:
1. Return hardcoded 5 (Fake It)
2. Return a + b (Obvious Implementation)

**Reasoning**: Single test for addition. Obvious implementation is safe — trivial algorithm, no edge cases.

**Choice**: Obvious Implementation

Workflow Example

RED → GREEN → REFACTOR for Calculator.add() (Python):

RED: Write test_add_two_numbers, run pytest — NameError: Calculator not defined. Log failure. Update state: phase: GREEN, iteration: 1.

GREEN: Implement class Calculator: def add(self, a, b): return a + b. Run pytest — 1 passed. Update state: phase: REFACTOR.

REFACTOR: No smells detected. Code is minimal. No refactoring needed. Update state: phase: RED, iteration: 2.

Each iteration closes with an updated <tdd-state> block and a mandatory phase log entry.

Output Templates

See Guardrails Reference for detailed phase log templates. See Autonomous Protocol for extended workflow examples.

AI Discipline Rules

CRITICAL: Trust Nothing Without Verification — Don't assume tests pass, don't assume tests fail, don't assume code works. Run and verify everything.

CRITICAL: Be Boringly Predictable — Follow the protocol exactly. Log everything explicitly. Never skip steps. Never combine steps.

CRITICAL: Fail Loudly — If something unexpected happens: stop immediately, report the anomaly, ask for guidance. Do not work around it.

CRITICAL: Prefer Smaller Steps — When in doubt: smaller test, simpler implementation, one refactoring at a time, more iterations over fewer.

Integration with Other Skills

tdd-cycle — Orchestrates the phase state machine; invoke at session start to configure mode (autonomous vs. pair) and manage transitions
tdd-implementer — Called during GREEN phase for implementation strategy selection (Fake It / Obvious / Triangulation)
tdd-refactor — Called during REFACTOR phase for smell identification and safe step-by-step improvement
tdd-verify — Run after the session to audit TDD compliance, score commit history, and identify anti-patterns
tdd-pair — Alternative to this skill; use when a human partner drives and the AI navigates

Error Recovery

Tests Won't Run: Check syntax and imports in the test file. Fix infrastructure issues first. Do not write any implementation until the test suite runs cleanly.

Wrong Test Failure: Examine the actual error message — not the expected one. Fix the test if it has bugs. Proceed only when the failure is the expected one.

Can't Make Test Pass: Re-read the test carefully. Check for typos in expectations. Verify setup and assertions. Ask for help if stuck — do not guess at the implementation.

State Confusion: Run the full test suite. If all pass: begin REFACTOR or new RED. If one fails: you are in GREEN. Reconstruct the state block from observed evidence.

ナビゲーション

Skillsとは？

リンク

tdd-agent

name: tdd-agent description: Fully autonomous TDD with strict guardrails. Use when you want the AI to drive the entire RED-GREEN-REFACTOR cycle independently while maintaining TDD discipline.

TDD Agent (Autonomous Mode)

Core Philosophy

Knowledge Base Lookups

Kent Beck's 12 Test Desiderata (Agent Focus)

Autonomous Protocol

Phase 1: RED — Write Failing Test

Phase 2: GREEN — Minimal Implementation

Phase 3: REFACTOR — Improve Structure

Self-Check Loops

Guardrails

State Block

Explicit Reasoning

Workflow Example

Output Templates

AI Discipline Rules

Integration with Other Skills

Error Recovery

関連スキル(🔧 開発ツール)