name: verify description: "Evidence-based completion verification against all acceptance criteria. Must run before claiming work is done. No claims without evidence."
Session Start Check (required)
If session-start has not run in this session, run /atdd-kit:session-start first.
verify Skill
<AUTOPILOT-GUARD> If ARGUMENTS does not contain `--autopilot` (user invoked directly via slash command): - Display message: "This skill is autopilot-only. Use `/atdd-kit:autopilot <number>` instead." - **STOP.** Do not proceed with execution. If ARGUMENTS contains `--autopilot` (invoked by autopilot): skip this guard silently. </AUTOPILOT-GUARD> <HARD-GATE> Do NOT invoke ship or claim work is complete until ALL ACs have FRESH verification evidence from commands executed in THIS session. No exceptions. </HARD-GATE>State Gate (required)
- Check
in-progresslabel:gh issue view <number> --json labels --jq '[.labels[].name] | index("in-progress")'- Present: proceed.
- Missing: STOP. "Issue #N does not have
in-progresslabel. Runatddfirst."
- Check implementation branch: Verify current branch is not
main.- On
main: STOP. "Cannot verify on main. Check out the implementation branch first."
- On
Iron Law
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.
"should work", "looks correct", "probably passes" are NOT evidence. Only command output is evidence.
FRESH means NOW: Execute in this session, in this message. Do not use cached results, previous session output, or sub-agent reports.
Spec Authority Check (before Verification Flow)
Resolve the authoritative AC source via bash lib/spec_check.sh:
slug=$(bash lib/spec_check.sh derive_slug <issue>); AC6 fallback applies (missing / continuation-fallback / tbd-persona, all prefixed[spec-warn]).status=$(bash lib/spec_check.sh spec_status "$slug")— tiebreak:approved/implemented→ spec is authoritative.draft/deprecated→ Issue comment ACs win; emit[spec-warn] draft:or[spec-warn] deprecated:.
- Report divergences using
docs/methodology/us-ac-format.md§ Spec ↔ Issue Divergence Matrix (5 patterns: Added / Removed / Modified / Reordered / Status drift).
Full matrix, fallback cases, and examples: docs/guides/spec-reference.md.
Verification Flow
- Read ACs from the authoritative source determined above
- For each AC, identify the verification command:
- Unit test AC → run specific test:
[test command] [test file/class] - Snapshot test AC → run snapshot tests
- E2E test AC → run E2E suite
- Build AC → run build command
- Unit test AC → run specific test:
- Execute each command
- Read the full output
- Map output to AC: PASS with evidence or FAIL with reason
Verification Checklist
## Verification Results
| AC | Test | Result | Evidence |
|----|------|--------|----------|
| AC1: [name] | [test command] | PASS / FAIL | [output excerpt] |
| AC2: [name] | [test command] | PASS / FAIL | [output excerpt] |
| ... | ... | ... | ... |
### Additional Verification
- [ ] Build passes with zero warnings: [command] -> [result]
- [ ] Lint passes: [command] -> [result]
- [ ] All existing tests pass: [command] -> [result]
What Counts as Evidence
- Command output showing PASS with test count
- Build log showing "Build Succeeded" with 0 warnings
- Lint output showing 0 violations
- NOT: "I ran the tests and they passed" (no output shown)
- NOT: "Should be fine" / "looks good"
- NOT: Previous session's test results (must be FRESH)
- NOT: Partial test run (must run ALL tests)
If Any AC Fails
- Report which AC failed and why
- Do NOT proceed to ship
- Return to
atddskill to fix
Status Output
Autopilot mode only (ARGUMENTS contains --autopilot). Skip in standalone mode.
Output a skill-status fenced code block as the last element of your response at every terminal point.
Terminal points:
- COMPLETE: All ACs pass with fresh evidence and ship about to be invoked.
- PENDING: Waiting for user input.
- BLOCKED: State Gate failed.
- FAILED: One or more ACs failed -- return to atdd.
SKILL_STATUS: COMPLETE | PENDING | BLOCKED | FAILED
PHASE: verify
RECOMMENDATION: <next action or error description in one sentence>
Examples:
SKILL_STATUS: COMPLETE
PHASE: verify
RECOMMENDATION: All ACs verified. Running ship.
SKILL_STATUS: BLOCKED
PHASE: verify
RECOMMENDATION: Issue #N does not have in-progress label. Run atdd first.
SKILL_STATUS: FAILED
PHASE: verify
RECOMMENDATION: AC3 failed: test output shows assertion error. Return to atdd to fix implementation.
See docs/guides/skill-status-spec.md for full field definitions, BLOCKED vs FAILED distinction, and autopilot action matrix.
If All ACs Pass
- Post verification results as Issue comment
- "All verification items passed. Running
atdd-kit:shipto finalize the PR." - Invoke
atdd-kit:shipvia the Skill tool.
Red Flags (STOP)
| Thought | Reality |
|---|---|
| "Tests should pass now" | RUN the verification. "Should" is not evidence. |
| "I'm confident this works" | Confidence is not evidence. Run the command. |
| "Just this once I'll skip lint" | No exceptions. Run every check. |
| "The sub-agent said it passed" | Verify independently. Trust no report without fresh output. |
| "I already ran these tests earlier" | Earlier is not now. Run FRESH. |
| "Partial test run is enough" | Must run ALL tests. Partial results hide regressions. |
| "The build is obviously fine" | Run the build command. "Obviously" is a red flag word. |
Gate Function Pattern
IDENTIFY -> what command proves this AC?
RUN -> execute the command
READ -> read the full output
VERIFY -> does output match the claim?
ONLY THEN -> state the result