---
name: checkpoint-protocol
description: Human interaction protocol with automation-first rule. Defines checkpoint types and when to use them.
---
# Checkpoint Protocol

<!-- Project Autopilot - Checkpoint Protocol Skill -->
<!-- Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net> -->

**Golden Rule:** If it has a CLI/API, Claude does it. Humans only do what requires judgment.
## Checkpoint Types

### Distribution

| Type | Frequency | When to Use |
|---|---|---|
| `checkpoint:human-verify` | 90% | User confirms it works |
| `checkpoint:decision` | 9% | User chooses between options |
| `checkpoint:human-action` | 1% | Truly unavoidable manual step |
## Type 1: Human-Verify (90%)

Claude automates everything; the human just confirms it works.

### When to Use

- Visual verification (UI looks right)
- Interactive flows (click through the app)
- Functional verification (feature works as expected)
### Format

```xml
<task type="auto">
  <name>Start dev server</name>
  <action>Run `npm run dev` in background</action>
  <verify>curl localhost:3000 returns 200</verify>
</task>

<task type="checkpoint:human-verify" gate="blocking">
  <what-built>Dashboard - server at http://localhost:3000</what-built>
  <how-to-verify>
    Visit http://localhost:3000/dashboard and verify:
    1. Desktop (>1024px): sidebar visible, cards display data
    2. Mobile (375px): single column, bottom nav visible
    3. Click "Settings" - modal opens
  </how-to-verify>
  <resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
### Key Rules

- **Claude starts servers** - User never runs `npm run dev`
- **Claude sets up data** - User never creates test data
- **Claude provides URLs** - User just clicks links
- **User only looks** - Visual/functional confirmation
### Example Output

```
🟢 checkpoint:human-verify

## Built: User Dashboard

**Server running:** http://localhost:3000/dashboard

### Please verify:
1. ✅ Page loads without errors
2. ✅ User data displays correctly
3. ✅ Sidebar navigation works
4. ✅ Mobile view is responsive

**Resume:** Type "approved" or describe issues
```
## Type 2: Decision (9%)

The human must make a choice that affects the implementation.

### When to Use

- Technology selection (which library)
- Architecture decisions (approach A vs. B)
- Design choices (layout, color, UX)
- Business logic (pricing, limits, rules)
### Format

```xml
<task type="checkpoint:decision" gate="blocking">
  <decision>Select authentication provider</decision>
  <context>
    Need user auth for the app. Three options with tradeoffs.
  </context>
  <options>
    <option id="supabase">
      <name>Supabase Auth</name>
      <pros>Built-in with DB, free tier generous, email templates</pros>
      <cons>Less customizable UI, vendor lock-in</cons>
    </option>
    <option id="clerk">
      <name>Clerk</name>
      <pros>Best DX, beautiful UI, social logins easy</pros>
      <cons>Paid after 10k MAU, another vendor</cons>
    </option>
    <option id="custom">
      <name>Custom JWT</name>
      <pros>Full control, no external dependencies</pros>
      <cons>More implementation work, security responsibility</cons>
    </option>
  </options>
  <resume-signal>Select: supabase, clerk, or custom</resume-signal>
</task>
```
### Key Rules

- **Present balanced options** - No prescriptive recommendation
- **Include context** - Why this decision matters
- **Show tradeoffs** - Pros AND cons for each
- **No "correct" answer** - All options are valid
### Example Output

```
🟡 checkpoint:decision

## Decision Required: Authentication Provider

**Context:** Need user auth. Three approaches available.

| Option | Pros | Cons |
|--------|------|------|
| **Supabase** | Built-in, free tier | Less customizable |
| **Clerk** | Best DX, beautiful UI | Paid after 10k users |
| **Custom JWT** | Full control | More work |

**Select:** supabase, clerk, or custom
```
## Type 3: Human-Action (1% - RARE)

Truly unavoidable manual step. Exhaust all automation first.

### When to Use (Only These Cases)

- Email verification clicks (can't automate)
- 3D Secure / MFA in payment flow
- OAuth consent screens in browser
- Physical hardware interaction
- Captcha solving

### When NOT to Use

- ❌ Running CLI commands (Claude runs them)
- ❌ Creating accounts (Claude uses API/CLI)
- ❌ Starting servers (Claude runs them)
- ❌ Creating files (Claude creates them)
- ❌ Configuration (Claude edits files)
- ❌ Database setup (Claude runs migrations)
### Format

```xml
<task type="auto">
  <name>Create SendGrid account</name>
  <action>Use API to create account, request verification email</action>
</task>

<task type="checkpoint:human-action">
  <action>Complete email verification</action>
  <why-manual>Email verification requires clicking link in your inbox</why-manual>
  <instructions>
    1. Check your inbox for email from SendGrid
    2. Click the verification link
    3. Return here when done
  </instructions>
  <resume-signal>Type "done" when verified</resume-signal>
</task>
```
### Key Rules

- **Try automation FIRST** - Only ask for help when blocked
- **Explain why manual** - User should know why this can't be automated
- **Minimize steps** - Do everything possible before/after the manual step
- **Golden rule:** If it has CLI/API, Claude MUST do it
### Example Output

```
🔴 checkpoint:human-action

## Manual Step Required: Email Verification

**Why manual:** Email verification links can't be automated

### Instructions:
1. Check your inbox for email from SendGrid
2. Click "Verify Email" button
3. Return here when done

**Resume:** Type "done" when verified
```
## Automation-First Checklist

Before using ANY checkpoint, ask:

- □ Can I do this with a CLI command?
- □ Can I do this with an API call?
- □ Can I do this by editing a file?
- □ Can I start/stop a server myself?
- □ Can I create test data myself?
- □ Can I run a script for this?

If ANY answer is YES → Don't ask the user to do it.
## Common Anti-Patterns

### ❌ Wrong: Asking User to Run Commands

```
Please run: npm run dev
```

### ✅ Right: Claude Runs Commands

```bash
# Claude executes
npm run dev &

# Then presents checkpoint:
# Visit http://localhost:3000 to verify
```

### ❌ Wrong: Asking User to Create Files

```
Please create a file at src/config.ts with:
[content]
```

### ✅ Right: Claude Creates Files

```
# Claude creates the file
Write src/config.ts

# Done - no checkpoint needed
```

### ❌ Wrong: Asking User to Set Up Database

```
Please create a PostgreSQL database called "myapp"
```

### ✅ Right: Claude Uses CLI

```bash
# Claude executes
createdb myapp
psql myapp < schema.sql

# Done - no checkpoint needed
```

### ❌ Wrong: Decision as Human-Action

```
Should I use React or Vue?
Please choose and let me know.
```

### ✅ Right: Use Decision Checkpoint

```xml
<task type="checkpoint:decision">
  <decision>Frontend framework</decision>
  <options>
    <option id="react">React - Larger ecosystem</option>
    <option id="vue">Vue - Simpler learning curve</option>
  </options>
</task>
```
## Checkpoint Flow in Execution

```
FOR each plan:
    IF plan.autonomous == true:
        Execute all tasks automatically
        Generate LOGBOOK.md
    ELSE IF plan has checkpoint:
        Execute tasks up to checkpoint

        SWITCH checkpoint.type:
            CASE human-verify:
                Present what was built
                Show verification steps
                WAIT for "approved" or issues
            CASE decision:
                Present options with tradeoffs
                WAIT for selection
                Continue with selected option
            CASE human-action:
                Present instructions
                WAIT for "done"

        Continue remaining tasks
        Generate LOGBOOK.md
```
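The dispatch step of this flow can be sketched in Python. This is a minimal illustration, not the real implementation; `present` and `wait_for_user` are hypothetical stand-ins injected for the actual I/O so the routing logic stands alone:

```python
# Minimal sketch of the checkpoint dispatch above. The three checkpoint
# types differ only in what is presented and what resume signal is expected;
# `present` and `wait_for_user` are injected stand-ins for the real I/O.

def run_checkpoint(checkpoint_type, present, wait_for_user):
    """Present one checkpoint and return the user's resume signal."""
    if checkpoint_type == "human-verify":
        present("Show what was built and the verification steps")
        return wait_for_user()  # "approved" or a description of issues
    elif checkpoint_type == "decision":
        present("Show options with tradeoffs")
        return wait_for_user()  # the selected option id
    elif checkpoint_type == "human-action":
        present("Show manual instructions and why they can't be automated")
        return wait_for_user()  # "done"
    raise ValueError(f"Unknown checkpoint type: {checkpoint_type}")
```

Injecting the presentation and wait functions keeps the three-way routing testable without a terminal attached.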
Integration with Wave Execution
# Plan with checkpoint (runs sequentially, not parallel)
---
phase: 3
plan: 06
wave: 3
autonomous: false
checkpoint:
type: human-verify
after_task: 4
what: "Integration tests pass, dashboard functional"
depends_on: ["04", "05"]
---
- Plans with checkpoints are NOT spawned in parallel
- They run sequentially after parallel wave completes
- Checkpoint pauses execution until user responds
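The parallel/sequential split can be sketched as a one-line partition (assuming each plan carries an `autonomous` flag as in the frontmatter above; the dict representation is an assumption for illustration):

```python
def split_wave(plans):
    """Partition a wave's plans: autonomous plans may run in parallel,
    plans with checkpoints run sequentially after the parallel wave."""
    parallel = [p for p in plans if p.get("autonomous", True)]
    sequential = [p for p in plans if not p.get("autonomous", True)]
    return parallel, sequential
```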
## Three-Level Threshold System

### Threshold Levels

```yaml
thresholds:
  warning:
    cost: 10.00      # Log warning, continue
    context: 30      # Log warning, continue (percentage)
    variance: 20     # Log warning, continue (percentage)
    errors: 2        # Log warning, continue (count)
  alert:
    cost: 25.00      # Pause for acknowledgment
    context: 40      # Prepare checkpoint
    variance: 30     # Pause for review
    errors: 3        # Pause for investigation
  stop:
    cost: 50.00      # Hard stop, require user action
    context: 50      # Must checkpoint, terminate
    variance: 50     # Escalate to user
    errors: 5        # Hard stop, likely systemic issue
```
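Classifying a metric against these levels is a severity-ordered comparison: check `stop` first so a value past two thresholds reports the highest level, matching the `IF`/`ELSE IF` cascade in `monitorThresholds`. A minimal sketch with the values above hard-coded:

```python
# Threshold values copied from the config above; in practice they
# would be loaded from .autopilot/config.yaml.
THRESHOLDS = {
    "warning": {"cost": 10.00, "context": 30, "variance": 20, "errors": 2},
    "alert":   {"cost": 25.00, "context": 40, "variance": 30, "errors": 3},
    "stop":    {"cost": 50.00, "context": 50, "variance": 50, "errors": 5},
}

def classify(metric, value, thresholds=THRESHOLDS):
    """Return the highest triggered level for a metric, or None.

    Severity is checked highest-first so a value past several
    thresholds reports only the most severe one."""
    for level in ("stop", "alert", "warning"):
        if value >= thresholds[level][metric]:
            return level
    return None
```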
### Threshold Actions
| Level | Action | User Impact |
|---|---|---|
| Warning | Log message, continue | No interruption |
| Alert | Pause until acknowledged | One-time acknowledgment |
| Stop | Hard stop, checkpoint | Must address issue |
### Threshold Handling Protocol

```
FUNCTION handleThreshold(type, level, value, state):
    """
    Handle threshold breach based on level.
    """
    LOG "⚠️ Threshold triggered: {type} at {level} ({value})"

    SWITCH level:
        CASE "warning":
            # Log and continue
            LOG "📝 Warning: {type} reached {value}"
            logToMetrics(type, level, value)
            CONTINUE

        CASE "alert":
            # Check if already acknowledged
            IF state.acknowledged[type]:
                LOG "⏩ Alert acknowledged previously, continuing..."
                CONTINUE
            ELSE:
                # Pause for acknowledgment
                LOG "🔶 ALERT: {type} at {value}"
                LOG ""
                LOG "This threshold requires acknowledgment to continue."
                LOG "Type 'ack' to acknowledge and continue, or 'stop' to halt."

                response = waitForUserResponse()

                IF response == "ack":
                    state.acknowledged[type] = {
                        value: value,
                        acknowledged_at: now(),
                        session_id: getCurrentSessionId()
                    }
                    saveState(state)
                    LOG "✅ Acknowledged. Continuing..."
                    CONTINUE
                ELSE:
                    createCheckpoint("alert_stop")
                    EXIT

        CASE "stop":
            # Hard stop
            LOG "🛑 STOP: {type} at {value}"
            LOG ""
            LOG "This threshold requires immediate action."
            LOG "Execution cannot continue until issue is resolved."

            createCheckpoint("threshold_stop", {
                type: type,
                value: value,
                reason: "Threshold exceeded"
            })
            EXIT_WITH_CHECKPOINT
```
### Threshold Monitoring

```
FUNCTION monitorThresholds(execution_state):
    """
    Continuously monitor thresholds during execution.
    """
    thresholds = loadThresholds()

    # Cost threshold
    IF execution_state.total_cost >= thresholds.stop.cost:
        handleThreshold("cost", "stop", execution_state.total_cost, execution_state)
    ELSE IF execution_state.total_cost >= thresholds.alert.cost:
        handleThreshold("cost", "alert", execution_state.total_cost, execution_state)
    ELSE IF execution_state.total_cost >= thresholds.warning.cost:
        handleThreshold("cost", "warning", execution_state.total_cost, execution_state)

    # Context usage threshold
    context_percent = calculateContextUsage()
    IF context_percent >= thresholds.stop.context:
        handleThreshold("context", "stop", context_percent, execution_state)
    ELSE IF context_percent >= thresholds.alert.context:
        handleThreshold("context", "alert", context_percent, execution_state)
    ELSE IF context_percent >= thresholds.warning.context:
        handleThreshold("context", "warning", context_percent, execution_state)

    # Variance threshold
    variance = calculateVariance(execution_state.estimated, execution_state.actual)
    IF variance >= thresholds.stop.variance:
        handleThreshold("variance", "stop", variance, execution_state)
    ELSE IF variance >= thresholds.alert.variance:
        handleThreshold("variance", "alert", variance, execution_state)
    ELSE IF variance >= thresholds.warning.variance:
        handleThreshold("variance", "warning", variance, execution_state)

    # Error count threshold
    IF execution_state.error_count >= thresholds.stop.errors:
        handleThreshold("errors", "stop", execution_state.error_count, execution_state)
    ELSE IF execution_state.error_count >= thresholds.alert.errors:
        handleThreshold("errors", "alert", execution_state.error_count, execution_state)
    ELSE IF execution_state.error_count >= thresholds.warning.errors:
        handleThreshold("errors", "warning", execution_state.error_count, execution_state)
```
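`calculateVariance` is not defined in this skill. One plausible definition, consistent with the variance thresholds being percentages, is the percentage deviation of actual from estimated; this is an assumption, not the documented formula:

```python
def calculate_variance(estimated, actual):
    """Hypothetical variance definition: percentage deviation of actual
    from estimated. The skill only assumes variance is a percentage;
    this exact formula is an illustrative guess."""
    if estimated == 0:
        # Avoid division by zero: no estimate means no meaningful variance
        # unless actuals exist, in which case it is unbounded.
        return 0.0 if actual == 0 else float("inf")
    return abs(actual - estimated) / estimated * 100
```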
## Acknowledgment Persistence

### Storage Location

```
.autopilot/state/acknowledgments.json
```

### Structure

```json
{
  "acknowledgments": {
    "cost": {
      "acknowledged_at": "2026-01-29T14:30:00Z",
      "value_at_ack": 26.50,
      "session_id": "session-abc123"
    },
    "context": {
      "acknowledged_at": "2026-01-29T15:00:00Z",
      "value_at_ack": 42,
      "session_id": "session-def456"
    }
  },
  "last_reset": null
}
```
### Persistence Protocol

```
FUNCTION loadAcknowledgments():
    """
    Load acknowledgments from persistent storage.
    """
    path = ".autopilot/state/acknowledgments.json"

    IF fileExists(path):
        RETURN parseJSON(readFile(path))
    ELSE:
        RETURN {acknowledgments: {}, last_reset: null}

FUNCTION saveAcknowledgment(type, value):
    """
    Persist acknowledgment for future sessions.
    """
    acks = loadAcknowledgments()
    acks.acknowledgments[type] = {
        acknowledged_at: now(),
        value_at_ack: value,
        session_id: getCurrentSessionId()
    }
    writeJSON(".autopilot/state/acknowledgments.json", acks)

FUNCTION resetAcknowledgments():
    """
    Clear all acknowledgments (--reset-alerts flag).
    """
    acks = {
        acknowledgments: {},
        last_reset: now()
    }
    writeJSON(".autopilot/state/acknowledgments.json", acks)
    LOG "✅ All alert acknowledgments cleared"
```
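The persistence protocol maps directly onto the stdlib `json` and `pathlib` modules. A sketch, assuming the storage path above (relative to the project root) and the JSON structure shown earlier:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Default storage location, as documented above.
ACK_PATH = Path(".autopilot/state/acknowledgments.json")

def load_acknowledgments(path=ACK_PATH):
    """Return stored acknowledgments, or an empty structure if none exist."""
    if path.exists():
        return json.loads(path.read_text())
    return {"acknowledgments": {}, "last_reset": None}

def save_acknowledgment(metric, value, session_id, path=ACK_PATH):
    """Persist one acknowledgment so future sessions skip the pause."""
    acks = load_acknowledgments(path)
    acks["acknowledgments"][metric] = {
        "acknowledged_at": datetime.now(timezone.utc).isoformat(),
        "value_at_ack": value,
        "session_id": session_id,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(acks, indent=2))
```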
### Reset Command

```bash
# Reset all acknowledgments
/autopilot:takeoff --reset-alerts

# or during execution
/autopilot:cockpit --reset-alerts
```
## Additional Checkpoint Triggers

### Trigger Configuration

```yaml
checkpoint_triggers:
  # Task/Phase events
  - type: task_complete
    frequency: every
    action: update_state
  - type: phase_complete
    frequency: every
    action: checkpoint_and_verify

  # Context management
  - type: context_warning
    threshold: 30%
    action: log_warning
  - type: context_alert
    threshold: 40%
    action: prepare_checkpoint
  - type: context_stop
    threshold: 50%
    action: force_checkpoint

  # Cost management
  - type: cost_warning
    threshold: $10
    action: log_warning
  - type: cost_alert
    threshold: $25
    action: pause_for_ack
  - type: cost_stop
    threshold: $50
    action: hard_stop

  # Error handling
  - type: error_threshold
    count: 3
    action: pause_and_investigate

  # Time management
  - type: time_elapsed
    duration: 30min
    action: suggest_checkpoint

  # User interaction
  - type: user_interrupt
    signal: Ctrl+C
    action: graceful_checkpoint
```
### Trigger Protocol

```
FUNCTION checkTriggers(execution_state):
    """
    Check all checkpoint triggers.
    """
    triggers = loadTriggers()

    FOR each trigger IN triggers:
        SWITCH trigger.type:
            CASE "task_complete":
                # Always triggered after task
                IF execution_state.just_completed_task:
                    updateState(execution_state)

            CASE "phase_complete":
                IF execution_state.just_completed_phase:
                    createCheckpoint("phase_complete")
                    verifyPhaseGoals(execution_state.phase)

            CASE "context_warning", "context_alert", "context_stop":
                context_pct = calculateContextUsage()
                IF context_pct >= trigger.threshold:
                    executeTriggerAction(trigger, context_pct)

            CASE "cost_warning", "cost_alert", "cost_stop":
                IF execution_state.total_cost >= trigger.threshold:
                    executeTriggerAction(trigger, execution_state.total_cost)

            CASE "error_threshold":
                IF execution_state.error_count >= trigger.count:
                    pauseAndInvestigate(execution_state.recent_errors)

            CASE "time_elapsed":
                elapsed = now() - execution_state.session_start
                IF elapsed >= trigger.duration:
                    suggestCheckpoint("time_elapsed", elapsed)

            CASE "user_interrupt":
                IF signalReceived(SIGINT):
                    gracefulCheckpoint(execution_state)
```
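The `time_elapsed` case reduces to a datetime comparison. A sketch with stdlib datetimes, assuming the configured duration (e.g. `30min`) has already been parsed into a `timedelta`; the parsing itself is omitted:

```python
from datetime import datetime, timedelta

def should_suggest_checkpoint(session_start, now,
                              duration=timedelta(minutes=30)):
    """True once the session has run at least `duration` past its start,
    i.e. the time_elapsed trigger should fire."""
    return (now - session_start) >= duration
```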
## Autonomous vs Human-Verify Plans

### Plan Types

```yaml
# Autonomous plan (no checkpoint needed)
---
phase: 1
plan: 01
autonomous: true
checkpoint: null
---
# Tasks execute without interruption
# Summary generated at end
```

```yaml
# Human-verify plan (requires confirmation)
---
phase: 3
plan: 06
autonomous: false
checkpoint:
  type: human-verify
  after_task: 4
  what: "Dashboard functional, data displays correctly"
---
# Pauses after task 4 for verification
# Continues after "approved" response
```
### Execution Protocol

```
FUNCTION executePlan(plan):
    """
    Execute plan based on autonomous flag.
    """
    IF plan.autonomous == true:
        # Full autonomous execution
        LOG "🤖 Autonomous execution: Phase {plan.phase}, Plan {plan.plan}"

        FOR each task IN plan.tasks:
            executeTask(task)
            monitorThresholds(state)  # Still check thresholds

        generateSummary(plan)
        updateState(state)
    ELSE:
        # Execute with checkpoint
        LOG "👤 Human-verify execution: Phase {plan.phase}, Plan {plan.plan}"

        FOR index, task IN enumerate(plan.tasks):
            executeTask(task)
            monitorThresholds(state)

            # Check if checkpoint is after this task
            IF plan.checkpoint AND index + 1 == plan.checkpoint.after_task:
                presentCheckpoint(plan.checkpoint)
                response = waitForUserResponse()

                IF response == "approved":
                    LOG "✅ Checkpoint approved, continuing..."
                ELSE:
                    # User reported issues
                    handleIssues(response)

        generateSummary(plan)
        updateState(state)
```
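Note the off-by-one guard in the loop above: `after_task` in the plan frontmatter is 1-based, while the loop index is 0-based, hence the `index + 1` comparison. A minimal sketch of just that pause check (the dict shape is an illustrative assumption):

```python
def pauses_after(task_index, checkpoint):
    """True if execution should pause after the task at this 0-based
    index, given a checkpoint whose after_task field is 1-based
    (or no checkpoint at all)."""
    return checkpoint is not None and task_index + 1 == checkpoint["after_task"]
```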
### Plan Type Selection

```
FUNCTION determinePlanType(phase, plan):
    """
    Determine if plan should be autonomous or human-verify.
    """
    # Always human-verify for these cases:
    IF plan.has_ui_changes:
        RETURN {autonomous: false, reason: "UI changes require visual verification"}
    IF plan.has_user_flows:
        RETURN {autonomous: false, reason: "User flows require functional verification"}
    IF plan.is_integration:
        RETURN {autonomous: false, reason: "Integration requires end-to-end verification"}
    IF phase.is_final:
        RETURN {autonomous: false, reason: "Final phase requires comprehensive verification"}

    # Default to autonomous for:
    IF plan.is_infrastructure:
        RETURN {autonomous: true, reason: "Infrastructure can be verified programmatically"}
    IF plan.is_backend_only:
        RETURN {autonomous: true, reason: "Backend verified by tests"}
    IF plan.is_setup:
        RETURN {autonomous: true, reason: "Setup verified by existence checks"}

    # Default: based on phase position
    IF phase.number <= 2:
        RETURN {autonomous: true, reason: "Early phases typically autonomous"}
    ELSE:
        RETURN {autonomous: false, reason: "Later phases need verification"}
```
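Translated into Python, the selection is a straight precedence chain: verification-forcing flags win over autonomous-safe flags, which win over phase position. A sketch assuming phase and plan are dicts of boolean flags (the real objects may differ):

```python
def determine_plan_type(phase, plan):
    """Return (autonomous, reason) following the precedence above."""
    # Verification-forcing flags are checked first and always win.
    if plan.get("has_ui_changes"):
        return False, "UI changes require visual verification"
    if plan.get("has_user_flows"):
        return False, "User flows require functional verification"
    if plan.get("is_integration"):
        return False, "Integration requires end-to-end verification"
    if phase.get("is_final"):
        return False, "Final phase requires comprehensive verification"
    # Autonomous-safe flags come next.
    if plan.get("is_infrastructure"):
        return True, "Infrastructure can be verified programmatically"
    if plan.get("is_backend_only"):
        return True, "Backend verified by tests"
    if plan.get("is_setup"):
        return True, "Setup verified by existence checks"
    # Fall back to phase position.
    if phase.get("number", 1) <= 2:
        return True, "Early phases typically autonomous"
    return False, "Later phases need verification"
```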
## Threshold Output Formats

### Warning (No Interruption)

```
📝 Warning: Cost at $11.50 (threshold: $10)
Continuing execution...
```

### Alert (Requires Acknowledgment)

```
╔══════════════════════════════════════════════════════╗
║  🔶 ALERT: Threshold Reached                         ║
╠══════════════════════════════════════════════════════╣
║  Type: Cost                                          ║
║  Current: $26.50                                     ║
║  Threshold: $25.00 (alert)                           ║
║  Next: $50.00 (stop)                                 ║
╠══════════════════════════════════════════════════════╣
║  This alert requires acknowledgment to continue.     ║
║                                                      ║
║  Type 'ack' to acknowledge and continue              ║
║  Type 'stop' to halt execution                       ║
╚══════════════════════════════════════════════════════╝
```

### Stop (Hard Stop)

```
╔══════════════════════════════════════════════════════╗
║  🛑 STOP: Threshold Exceeded                         ║
╠══════════════════════════════════════════════════════╣
║  Type: Cost                                          ║
║  Current: $52.30                                     ║
║  Threshold: $50.00 (stop)                            ║
╠══════════════════════════════════════════════════════╣
║  Execution HALTED. Checkpoint created.               ║
║                                                      ║
║  Options:                                            ║
║  1. Increase threshold in .autopilot/config.yaml     ║
║  2. Review and optimize remaining work               ║
║  3. Resume with /autopilot:cockpit --reset-alerts    ║
╚══════════════════════════════════════════════════════╝

Waypoint saved: .autopilot/waypoints/stop-20260129-143000.json
```