name: run-autopilot description: Run full PRD lifecycle autonomously - catchup, planning, work, review, rework. Triggers on "autopilot", "run autopilot", "autopilot status", "drain backlog", "execute PRD end to end". argument-hint: "[<prd-filename> | status]"

Autopilot

Orchestrate the full PRD lifecycle: catchup → plan-tasks → work → review → rework loop → blind review → doubt review → done.

Makes autonomous decisions backed by research (dependencies, recurring issues, API/schema changes when PRD-driven) and pauses only for critical security, requirements ambiguity, or blocking decisions.

Execution Model

Run all phases in sequence without stopping. After each phase completes, immediately update state and proceed to the next phase. Do not pause between phases, do not summarize progress, do not wait for user input - unless the phase explicitly says PAUSE or STOP. Completing a sub-skill invocation (/catchup, /plan-tasks, /work, /review-work-completion, etc.) is NOT a stopping point. It is an intermediate step. Continue.

Entry Points

/run-autopilot — auto-select PRD (wip first, then backlog), run full cycle
/run-autopilot <prd-filename> — full cycle with specific PRD
/run-autopilot status — print current dashboard, no action

If invoked with status, read dev/local/autopilot/state.json, print phase/cycle/task summary, and stop.

State Management

All autopilot artifacts live under dev/local/autopilot/, organized by type:

dev/local/autopilot/
  state.json                              # current cycle state
  signal                                  # transient, used by stop hook
  reports/{batch_id}-report.md            # batch audit report
  deferred/{batch_id}-deferred.json       # unresolved items across PRDs

State file: dev/local/autopilot/state.json — see references/state-schema.md for schema.

Create dev/local/autopilot/ and subdirectories if missing. Initialize state file at PRD selection. Update state at every phase transition.

Resuming

When /run-autopilot is invoked and dev/local/autopilot/state.json exists with batch.completed_prds, this is a continuation after a session restart. Preserve batch.completed_prds (including batch.id) and proceed to Phase 0 to pick the next PRD.

Clean up stale signal file at start: delete dev/local/autopilot/signal if it exists.

Task Counts

At every state update, query TaskList and write current counts to the state file:

tasks_total — number of tasks (pending + in_progress + completed)
tasks_completed — number of tasks with status completed

This keeps the dashboard progress bar accurate.

Live Dashboard

The user can run pidash (from buvis-gems) in a separate terminal pane to watch progress in real time. It watches dev/local/autopilot/state.json automatically — no action needed from autopilot beyond keeping the state file updated.

Phase Banners

Print a banner at each phase transition:

── AUTOPILOT ── PRD: {prd-name} ── Phase: {PHASE} ──────────────────
── AUTOPILOT ── PRD: {prd-name} ── Phase: REVIEW ── Cycle {n} ─────
── AUTOPILOT ── PAUSED ── {n} issue(s) need your decision ──────────
── AUTOPILOT ── PRD: {prd-name} ── DONE ── {n} cycles ─────────────

Phase 0: PRD Selection

If argument provided, find that PRD in dev/local/prds/wip/ or dev/local/prds/backlog/. If found in backlog, mv to wip/.
Otherwise, auto-select (never ask the user): a. Check dev/local/prds/wip/:
- 1+ found → auto-pick lowest sequence number (by 00XXX- prefix), announce b. If wip is empty, check dev/local/prds/backlog/:
- PRDs available → auto-pick lowest sequence number, mv to wip/
- Empty → STOP: "No PRDs found. Create one with /create-prd."
Initialize batch in state file if not already present: id: "<yyyymmddHHMM>" (current timestamp), mode: "autopilot", completed_prds: []
Read the Active Work section of dev/local/project-capsule.md if it exists. This contains PRD progress and operational context from previous sessions. Use it to inform work in this session.
Initialize/update state with selected PRD, preserve batch field

Print progress:

── AUTOPILOT ── PRD {n}: {prd-name} ─────────────────────────────

Where {n} = len(batch.completed_prds) + 1

Phase 1: Catchup

Skip if: "catchup" in phases_completed in state file.

Invoke /catchup skill.

After completion, update state: add "catchup" to phases_completed, set phase: "planning".

Phase 2: Planning

Skip if: TaskList returns any pending or completed tasks (tasks already exist).

Invoke /plan-tasks with the selected PRD.

After completion, query TaskList and update state: add "planning" to phases_completed, set phase: "work", write tasks/tasks_total/tasks_completed snapshot (see Phase 3 for format).

Phase 3: Work

Skip if: All tasks completed, none pending.

Before invoking /work, query TaskList and write the full task snapshot to dev/local/autopilot/state.json:

tasks_total: total count
tasks_completed: completed count
tasks: array of {"id": "<task-id>", "name": "<title>", "status": "pending|in_progress|completed"} for EVERY task

Include the task id field — a PostToolUse hook on TaskUpdate uses it to automatically sync status changes to the dashboard. This is mandatory.

Invoke /work skill. It runs until all tasks complete.

After completion, query TaskList again and update state: add "work" to phases_completed, set phase: "review", write updated tasks/tasks_total/tasks_completed.

Phase 4: Review

Skip if: Review file exists in dev/local/reviews/ for current cycle (check filename pattern {prd-name}-review-{cycle}.md).

Invoke /review-work-completion skill.

After completion, update state: set phase: "decision-gate".

Phase 5: Decision Gate

Read the review output. Categorize each finding using references/decision-framework.md.

Safety Checks — evaluate BEFORE classifying individual issues:

Condition	Action
Cycle 3 reached	PAUSE: escalate ALL remaining issues regardless of severity
>10 follow-up tasks from review	PAUSE: scope alarm — ask user before proceeding
Issue count not decreasing vs previous cycle	PAUSE: escalate immediately — fixes aren't working
Same issue reappearing after previous fix	Route to research-then-decide Protocol B
A sub-skill errored during this cycle	PAUSE: report error, don't retry automatically

Classification (per finding):

Auto-fix (proceed without asking):

Low severity, any consensus
Medium severity, clear mechanical fix
Medium severity, 1/3 consensus
Any severity where fix is additive only (adds code/tests, doesn't modify signatures/types/schemas)

Research-then-decide (run protocol, then auto-fix or defer):

New dependency needed -> Protocol A from references/decision-framework.md
Recurring issue (appeared in previous cycle) -> Protocol B
High + data model change -> Protocol C
High + touches public API -> Protocol D

Execute the research protocol. If verdict is "proceed", treat as auto-fix. If verdict is "escalate", defer to batch end. Log with full research field in either case.

Defer to batch end (log, don't PAUSE):

Critical severity, always
Requirements ambiguity (PRD says X, code does Y)
Research-failed items (verdict "escalate" from research protocols)

PAUSE (present to user, block progress):

10 follow-up tasks from review (scope alarm)
Cycle 3 reached (hard stop)
Decision blocks subsequent tasks (e.g. API shape needed before frontend can proceed)
Data model choice that all remaining work depends on

Log every decision in state file (autonomous_decisions or deferred_decisions).

Outcomes:

All auto-fixable, no deferrals, no blockers → proceed to Phase 6
Has deferrals but no blockers → log deferred items to dev/local/autopilot/deferred/{batch_id}-deferred.json, proceed to Phase 6 with auto-fixable items only
Has blocking escalation → PAUSE. Present only the blocking issue(s) to user. Wait for decision. After user responds, proceed to Phase 6.
No issues found → proceed to Phase 7 (Blind Review)

Phase 6: Rework

Tag follow-up tasks from review findings with [C{cycle}] prefix and tasks from decision-gate resolutions with [D{cycle}] prefix. Both use the current cycle number.

Before invoking /work, update state with current task counts.

Invoke /work on follow-up tasks (auto-fixable ones + user-approved ones only).

The work skill may parallelize independent rework tasks when superpowers:dispatching-parallel-agents is available (see work skill's "Parallel dispatch for independent rework fixes").

Increment cycle counter. Update state: set phase: "review", update task counts. Loop back to Phase 4.

Phase 7: Blind Review

Skip if: "blind-review" in phases_completed in state file.

Spec-only verification by a reviewer with no implementation context. Invoke /review-blindly with only the PRD content - no file lists, implementation notes, or review history.

After the review:

No Critical/Important findings → update state: add "blind-review" to phases_completed, set phase: "doubt-review". Proceed to Phase 8.
Critical or Important findings → create tasks tagged [BLIND], invoke /work. After fixes, update state and proceed to Phase 8. Do not loop back to Phase 4.
Zero issues with no file references → suspicious result (reviewer may not have found the code). Log a warning but proceed.

Minor findings: defer to batch end (append to deferred_decisions in state).

This phase runs once per PRD.

Phase 8: Doubt Review

Skip if: "doubt-review" in phases_completed in state file.

Final sanity check before completion. Invoke /review-with-doubt.

The doubt review produces findings in three categories: FIX (fixable now), VERIFY (needs checking), and KNOWN (real limitation, out of scope). Process each:

Handling FIX items

Create a task tagged [DOUBT] for each FIX item.
Add an entry to doubts in state: {"description": "...", "category": "fix", "status": "pending"}

Handling VERIFY items

VERIFY items are resolved during the review itself (the doubt skill runs checks and reclassifies as FIX or dismissed). If any survive unresolved:

Treat as FIX — create a [DOUBT] task.
Add to doubts in state with "category": "verify".

Handling KNOWN items

KNOWN items cannot be fixed in this scope. They flow to batch-end review for the user to decide.

Add to doubts in state: {"description": "...", "category": "known", "justification": "...", "status": "pending"}
Do NOT create tasks for KNOWN items — they are deferred, not actionable here.

Execution

After classifying all items:

If no FIX or VERIFY tasks → proceed to Phase 9. KNOWN items (if any) will be surfaced at batch end.
If >5 FIX/VERIFY tasks → defer all to batch end (append each to deferred_decisions in state as {"type": "doubt-overflow", "description": "...", "category": "fix|verify", "status": "pending"}). Log warning but do NOT PAUSE. Proceed to Phase 9.
If ≤5 FIX/VERIFY tasks → invoke /work on [DOUBT]-tagged tasks immediately — no decision gate, no rework loop.
After work completes, mark each resolved doubt entry's status as "resolved" in state.
Update state: add "doubt-review" to phases_completed, set phase: "done", update task counts.

KNOWN items keep "status": "pending" — Phase 9 step 5 collects these into the batch deferred log for batch-end review.

This phase runs once per PRD. It does not loop back to Phase 4.

Phase 9: Completion

Update state: set phase: "done"
Move PRD from wip/ to done/ (use mv, keep 00XXX- prefix)
Append completed PRD to batch.completed_prds in state file
Delete all tasks from the completed PRD: query TaskList, mark every task as deleted via TaskUpdate. This prevents stale tasks from triggering Phase 2's skip logic on the next PRD.
Append items to dev/local/autopilot/deferred/{batch_id}-deferred.json (create if missing). Collect from the current state file:
- deferred_decisions with status "pending" or "deferred" -> type "deferred_decision" (preserve original type field if present, e.g. "doubt-overflow")
- doubts with status "pending" -> type "doubt"
- autonomous_decisions with research field -> type "autonomous_research" (for user awareness at batch end) Each entry gets tagged with prd (filename) and cycle. Preserve the full research field when present - this is the only copy that survives state reset. Skip this step if nothing to write.
Append PRD summary to dev/local/autopilot/reports/{batch_id}-report.md (create with header if missing). See references/batch-report-format.md for format. 6b. Append autonomous decisions to dev/local/decisions.md if that file exists (skip if absent - user opts in by creating it). For each non-trivial entry in autonomous_decisions from the state file, append one row:
```
| {YYYY-MM-DD} | {decision summary} | {rationale or research evidence} | batch-{batch_id} PRD {prd-number} |
```
Dedupe: grep the decision summary before appending; skip if already present.
Update the Active Work section of dev/local/project-capsule.md with batch progress. Use the Edit tool to replace the Active Work section content:
```
## Active Work

### Batch {batch_id}
- [x] {completed PRD name} ({n} cycles)
- [x] ...for each completed PRD in batch
- [ ] {PRD name} - for each PRD still in wip/ or backlog/

Observations: {any operational gotchas useful for next iteration}
```
If the capsule doesn't exist yet (catchup was skipped), create a minimal one with just the Active Work section.
Print per-PRD summary:

── AUTOPILOT ── PRD: {prd-name} ── DONE ── {n} cycles ─────────────

Summary:
- Cycles: {n}
- Autonomous decisions: {count}
- Escalated decisions: {count}
- Follow-up tasks fixed: {count}

Continuation

Check: any PRDs remaining in dev/local/prds/wip/*.md or dev/local/prds/backlog/*.md?
- Yes → reset state for next PRD: set phases_completed to [], cycle to 1, clear tasks/decisions/review_cycles/doubts. Preserve batch field. Write next to dev/local/autopilot/signal. Print:
```
── AUTOPILOT ── {prd-name} done ── next PRD in new session ────────
```
  Then STOP. The stop hook auto-exits the session. The shell loop starts a fresh session.
- No → print batch summary, delete state file. Do NOT write dev/local/autopilot/signal - the session stays interactive for batch-end review.
```
── AUTOPILOT ── COMPLETE ───────────────────────────────────────────

Completed {n} PRDs:
  1. {prd-name} ({cycles} cycles)
  2. {prd-name} ({cycles} cycles)
  ...
```
  Batch-End Review
  
  Before exiting, collect ALL pending items from across the batch and present them to the user. This is mandatory if any items exist - never auto-exit with unreviewed items.
  
  Source: dev/local/autopilot/deferred/{batch_id}-deferred.json (single source of truth - all items were written here at Phase 9 step 5 of each PRD). Contains four item types:
  - deferred_decision - issues that failed research or were deferred for other reasons
  - doubt - unresolved findings from doubt review
  - doubt-overflow - FIX/VERIFY items deferred when doubt review found >5 issues (present under UNRESOLVED DOUBTS)
  - autonomous_research - research-backed decisions made autonomously (for user awareness)
  Auto-fix trivial items first. Before presenting items to the user, scan for trivial fixes that are clearly additive-only: docstring/comment improvements, test helper fixes (missing kwargs, style), formatting. Fix these silently, commit, and remove from the deferred list. Only present items that genuinely need a user decision.
  
  Presentation format - chunked by PRD:
  
  Present items grouped by PRD, one PRD at a time. For each PRD chunk:
```
── BATCH REVIEW ── PRD: {prd-name} ── {n} items ──────────────────

DEFERRED DECISIONS ({count}):

1. {severity emoji} {issue description}
   Cycle: {n} | Consensus: {consensus}
   Context: {why this was deferred - include research evidence if available}
   File: {path}
   Options: fix now / create issue / ignore

2. ...

UNRESOLVED DOUBTS ({count}):

1. {severity emoji} {doubt description}
   Context: {what the doubt review found and why it matters}
   Options: fix now / create issue / ignore

AUTONOMOUS RESEARCH DECISIONS ({count} - for awareness):

1. {severity emoji} {issue description} -> {action taken}
   Research: {evidence_summary from research field}
   (No action needed unless you disagree)
```
  Wait for user decisions on each PRD chunk before showing the next. For "fix now" items, execute the fix before continuing. For "create issue", create a GitHub issue with the context shown.
  
  After all PRD chunks are reviewed (or user says stop), delete the deferred JSON and write done to dev/local/autopilot/signal. If the deferred JSON doesn't exist or is empty, write done to dev/local/autopilot/signal immediately. The stop hook auto-exits the session. The shell loop sees done and stops.

Session Loop

Autopilot supports automatic session cycling via a signal file + Stop hook. This enables unattended PRD-to-PRD transitions while keeping sessions interactive.

Signal file: dev/local/autopilot/signal — written at Phase 9 completion with next (more PRDs) or done (backlog empty).

Shell wrapper:

while true; do
  claude "/run-autopilot"
  signal=$(cat dev/local/autopilot/signal 2>/dev/null)
  rm -f dev/local/autopilot/signal
  if [ "$signal" != "next" ]; then
    echo "Backlog drained."
    break
  fi
  echo "Starting next PRD..."
done

Required: A Stop hook that auto-exits when the signal file exists. See scripts/autopilot-stop-hook.sh. Configure in settings.json:

{
  "hooks": {
    "Stop": [
      {
        "matcher": "",
        "command": "~/.claude/skills/run-autopilot/scripts/autopilot-stop-hook.sh"
      }
    ]
  }
}

Without the hook, sessions remain interactive but require manual exit (Ctrl+D) between PRDs. The shell loop still handles restart.

Shell Command Rules

Never chain commands with &&, |, or ; in a single Bash call. Use separate Bash tool calls instead.
Never use redirections like 2>/dev/null. Handle missing files by checking existence or catching errors in the tool result.
Use Glob or Read instead of ls where possible (e.g. to check if files exist or list directory contents).
Use mkdir -p in its own Bash call when creating directories.

Error Handling

Situation	Action
Sub-skill invocation fails	PAUSE, report which skill failed and error
No PRDs anywhere	STOP with message about /create-prd
State file corrupted	Delete it, restart from Phase 0
Review produces no parseable output	PAUSE, report — don't retry
All three reviewers fail	PAUSE, report — partial results usable if user confirms
`dev/local/` doesn't exist	Create it
Task tools unavailable	STOP, report — can't operate without tasks

Superpowers Integration

Autopilot depends on superpowers for quality gates. All integrations are conditional - autopilot works without them, but quality improves with them.

Used by the Work skill (Phases 3, 6, 7, 8)

Superpower	Step	Purpose
`test-driven-development`	2.7	Write failing tests before implementation
`systematic-debugging`	4.5	Root-cause analysis on errors
`verification-before-completion`	5.5	Run test suite before marking done
`requesting-code-review`	5.7	Per-task code review after commit
`receiving-code-review`	5.7	Evaluate review feedback before acting on it

Used in Phase 6 (Rework)

Superpower	Condition	Purpose
`dispatching-parallel-agents`	2+ independent rework tasks	Parallelize isolated review fixes

Not used (rationale)

Superpower	Reason
`brainstorming`	Interactive; happens before PRD exists
`writing-plans`	PRD is the plan; plan-tasks decomposes into tasks
`executing-plans`	Work skill manages per-task dispatch already
`subagent-driven-development`	Work skill's loop serves same purpose
`finishing-a-development-branch`	Autopilot works on current branch, not worktrees
`using-git-worktrees`	Separate architectural concern
`using-superpowers`	Session-start meta-skill; autopilot is autonomous, not conversational
`writing-skills`	Meta-skill for creating skills, not a workflow gate

Note on review layering

Per-task review (step 5.7), PRD-level review (Phase 4), and blind review (Phase 7) are complementary, not redundant. Per-task catches issues early before they compound. Phase 4 catches cross-task coherence and integration issues. Phase 7 catches spec drift and gaps that implementation-aware reviewers miss by giving a fresh agent only the spec. All three are needed.

Reference Files

references/state-schema.md — state file JSON schema and skip logic
references/decision-framework.md — auto-fix vs escalate classification rules
references/dashboard-format.md — live dashboard via pidash
references/batch-report-format.md — batch audit report format

ナビゲーション

Skillsとは？

リンク

run-autopilot