---
name: code-test
description: Enforce test-driven development (RED-GREEN-REFACTOR). Use when implementing features, fixing bugs, or changing behavior - write failing test first, then minimal code to pass.
---
# Test-Driven Development
Write the test first. Watch it fail. Write minimal code to pass.
**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
## When to Use

**Always use for:**
- New features
- Bug fixes
- Behavior changes
- Refactoring
**Exceptions (confirm with user):**
- Throwaway prototypes
- Generated code
- Configuration files
**Workflow Integration:**
- Multiple independent tasks from a plan? → Use the `task-dispatch` skill (it enforces TDD per task with quality gates)
- Single implementation task? → Use this skill directly for the TDD workflow
## The Iron Law

**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST**
Wrote code before the test? Delete it. Start over. No exceptions.
## Red-Green-Refactor Cycle

### 1. RED - Write Failing Test
Write one minimal test showing what should happen.
Requirements:
- One behavior per test
- Clear name describing behavior
- Real code (avoid mocks unless unavoidable)
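A minimal sketch of a RED test, assuming a hypothetical `cart` module whose `total()` function has not been written yet (all names here are illustrative):

```python
# tests/test_cart.py - illustrative sketch; cart.total() does not
# exist yet, which is exactly what the RED step requires.
import cart


def test_total_sums_item_prices():
    # One behavior, clearly named, real code, no mocks.
    assert cart.total([2.50, 3.00]) == 5.50
```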
### 2. Verify RED - Watch It Fail

**MANDATORY. Never skip.**
Run the test. Confirm:
- Test fails (not errors from typos)
- Failure message matches expectation
- Fails because feature is missing
Test passes immediately? You're testing existing behavior. Fix the test.
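Running the sketch above should fail for the right reason. Abbreviated output might look like this (exact format varies by pytest version):

```
$ pytest tests/test_cart.py -v
FAILED tests/test_cart.py::test_total_sums_item_prices
AttributeError: module 'cart' has no attribute 'total'
```

An `AttributeError` or a failed assertion is the expected RED: the test ran and failed because the feature is missing. A `SyntaxError` or misspelled name is not RED; fix the test first.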
### 3. GREEN - Minimal Code
Write the simplest code to pass the test.
**DO:** Just enough to pass, the simplest implementation.
**DON'T:** Add features, refactor other code, or add configurability.
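Continuing the hypothetical cart sketch, the simplest code that makes the test pass:

```python
# cart.py - minimal GREEN implementation for the sketch above
def total(prices):
    # Just enough to pass test_total_sums_item_prices:
    # no validation, no configuration, no speculative features.
    return sum(prices)
```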
### 4. Verify GREEN - Watch It Pass

**MANDATORY.**
Run the test. Confirm:
- Test passes
- Other tests still pass
- No errors or warnings
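In the running sketch, a GREEN run of the suite might look roughly like this (abbreviated; format varies by pytest version):

```
$ pytest tests/ -v
tests/test_cart.py::test_total_sums_item_prices PASSED
1 passed in 0.05s
```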
### 5. REFACTOR - Clean Up
Only after green:
- Remove duplication
- Improve names
- Extract helpers
Keep tests green. Don't add behavior.
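In the running cart sketch, a behavior-preserving refactor might be nothing more than clearer names; the test does not change:

```python
# cart.py - refactor sketch: better name and a docstring, no new behavior
def total(item_prices):
    """Sum the prices of all items in the cart."""
    return sum(item_prices)
```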
### 6. Repeat

Write the next failing test for the next behavior.
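For example, the next cycle in the hypothetical cart sketch drives a discount behavior that does not exist yet, so the new test starts RED:

```python
# tests/test_cart.py - next RED: drives a hypothetical discount feature
def test_total_applies_percentage_discount():
    # Fails with TypeError (unexpected keyword argument) until implemented.
    assert cart.total([10.00], discount=0.5) == 5.00
```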
## Common Rationalizations
| Excuse | Reality |
|---|---|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Already manually tested" | Ad-hoc is not systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Unverified code is debt. |
| "Need to explore first" | Fine. Throw away exploration, then TDD. |
| "Test hard to write" | Hard to test = hard to use. Simplify design. |
## Red Flags - Stop and Start Over
- Code written before test
- Test passes immediately
- Can't explain why test failed
- "Just this once" rationalization
- Keeping code "as reference"
All of these mean: delete the code and start over with TDD.
## Verification Checklist
Before marking work complete:
- Every new function has a test
- Watched each test fail before implementing
- Each test failed for expected reason
- Wrote minimal code to pass each test
- All tests pass
- No errors or warnings in output
## TDD Evidence Format (For Subagent Verification)
When implementing as a subagent, you MUST output this evidence block:
```yaml
tdd_evidence:
  tests_written:
    - name: "test_feature_x"
      file: "tests/test_x.py"
      red_output: "FAILED - [actual failure message]"
      green_output: "PASSED - 1 passed in 0.05s"
  implementation_files:
    - path: "src/feature.py"
  all_tests_pass: true
  test_command: "pytest tests/test_x.py -v"
  final_output: "[full test output]"
```
This is REQUIRED for SubagentStop hook verification.
## Integration

Use with:
- `code-debug` - Write failing test to reproduce bug before fixing
- `task-dispatch` - Subagents follow TDD for each task
- `completion-verify` - Run tests before claiming completion