---
name: mine-best-practices
description: Extract best practices from PR review comments to build a curated library for code review automation
license: MIT
argument-hint: "--since YYYY-MM-DD [--until YYYY-MM-DD] [--scope NAME]"
metadata:
  author: Valon Technologies
  version: "1.0"
---
# Mine Best Practices

Extract insights from PR review threads, validate them against the current codebase, and consolidate them into the best practices library.
## Your Role as Orchestrator
You are the orchestrator for this multi-stage pipeline. Your responsibilities:
- Execute scripts - Run the Python scripts that prepare batches and aggregate results
- Launch subagents - Create `Task()` calls to dispatch specialized subagents for extraction, validation, and synthesis. Max 10 concurrent — if more batches exist, wait for a wave to complete before launching the next.
- Validate outputs - After each phase, review subagent outputs for quality, format correctness, and issues
- Stop on anomalies - If you detect problems (malformed output, unexpected results, low yield), stop and alert the user. Do not attempt to fix issues on the fly.
Key principle: Validate each stage's output before proceeding. Only interrupt the user when something needs human judgment.
## When to Use This Skill
Use when:
- Building/updating the best practices library from recent PRs
- Mining a date range of PR reviews for patterns
- Seeding the library from historical review threads
Don't use for:
- Reviewing code against the library of current practices
- General PR reviews
## Usage

```
/mine-best-practices --since 2025-01-01
/mine-best-practices --since 2025-06-01 --until 2025-07-01 --scope backend
```

All date ranges refer to the PR merge date (inclusive on both ends).
### Advanced

For debugging and manual intervention:

```
/mine-best-practices resume validate --identifier web_2025-01-29
/mine-best-practices status
/mine-best-practices pending
/mine-best-practices for-topic error_handling
```

`--batch-size` and `--id-prefix` are tuning parameters rarely needed in normal operation.
## Data Refresh

Before mining, ensure threads are up to date:

```
python3 scripts/mine.py refresh                                        # Incremental (new PRs only)
python3 scripts/mine.py refresh --since 2025-01-01                     # From specific merge date
python3 scripts/mine.py refresh --since 2026-01-09 --until 2026-01-26  # Specific range
python3 scripts/mine.py refresh --full                                 # Full re-extraction
```

Requires the `gh` CLI authenticated with repo access. Safe to re-fetch overlapping ranges (deduplicates by `thread_id`).
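For orientation, a refreshed thread record might look like this (a hypothetical sketch; the actual schema is whatever `code_insights/threads.yaml` uses, and may differ):

```yaml
# Hypothetical entry in code_insights/threads.yaml; field names are illustrative.
threads:
  - thread_id: "owner/repo#1234:r567890"   # dedup key for overlapping refreshes
    pr_number: 1234
    merged_at: "2025-01-15"
    comments:
      - author: reviewer_a
        body: "Please add a timeout here; we've seen this call hang in prod."
```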
## Execution Workflow

NOTE: All commands run from the skill directory (where this `SKILL.md` lives).
### Step 1: Start Extraction

```
python3 scripts/mine.py extract --since 2025-01-01 --scope backend
```

Outputs extraction Task prompts for each batch.
### Step 2: Launch Extraction Subagents

Launch the Task prompts from Step 1 in parallel using the Task tool.

Output: `tmp/mining_{identifier}/extraction/batch_{n}.yaml`
After subagents complete, validate:
- Check each batch output file exists
- Verify YAML format is correct (`insights` list, `skipped` entries; see the sketch below)
- Review yield rate (typically 30-40% extracted, 60-70% skipped)
- Spot-check 2-3 insight content samples for quality
- Stop and alert user if: yield is unusually low/high, format errors, or quality issues
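A batch output might look roughly like this (a hedged sketch; the exact fields are set by the extraction prompts, not this document):

```yaml
# Hypothetical shape of tmp/mining_{identifier}/extraction/batch_1.yaml;
# field names are illustrative, not the actual schema.
insights:
  - thread_id: "owner/repo#1234:r567890"
    insight: "Outbound HTTP calls should set explicit timeouts."
skipped:
  - thread_id: "owner/repo#1234:r567891"
    reason: "Style nitpick; not a generalizable practice."
```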
### Step 3: Aggregate Extraction

```
python3 scripts/aggregate_extraction.py {identifier}
```

Merges results into `insights.yaml` and outputs validation Task prompts.
After aggregation, validate:
- Verify `insights.yaml` was updated with new insights (see the record sketch below)
- Check insight count matches expected (extracted - duplicates)
- Review a few insight content samples
- Stop and alert user if: counts don't match, format issues, or quality concerns
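A newly added record in `insights.yaml` might resemble the following (hypothetical sketch; the actual fields are defined by the scripts):

```yaml
# Hypothetical insights.yaml record; field names are illustrative.
- id: ins_0042
  thread_id: "owner/repo#1234:r567890"
  insight: "Outbound HTTP calls should set explicit timeouts."
  status: pending   # becomes validated or rejected after Step 5
```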
### Step 4: Launch Validation Subagents

Launch validation Task prompts in parallel using the Task tool.

Output: `tmp/mining_{identifier}/validation/batch_{n}.yaml`
After subagents complete, validate:
- Check each batch output file exists
- Verify YAML format is correct (`rejections` list; see the sketch below)
- Review rejection rate (expect 0-10% for recent threads, higher for older)
- Spot-check rejection reasons for appropriateness
- Stop and alert user if: rejection rate is surprisingly high/low, unclear rejection reasons, or format issues
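A validation batch might look like this (hedged sketch; only the `rejections` key is named above, everything else is assumed):

```yaml
# Hypothetical shape of tmp/mining_{identifier}/validation/batch_1.yaml.
# Assumption: insights not listed under rejections pass validation.
rejections:
  - insight_id: ins_0017
    reason: "Pattern no longer exists; the helper was removed in a later refactor."
```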
### Step 5: Aggregate Validation

```
python3 scripts/aggregate_validation.py {identifier}
```

Updates `insights.yaml` with validation results and outputs the topic assignment prompt.
After aggregation, validate:
- Verify `insights.yaml` statuses updated (`pending` → `validated` or `rejected`)
- Check all pending insights were processed
- Review rejection reasons if any
- Stop and alert user if: missing updates, unexpected rejection patterns
### Step 6: Launch Topic Assignment

Launch the topic assignment Task prompt(s) in parallel.

Output: `tmp/mining_{identifier}/topics/batch_{n}.yaml`
After subagents complete:
- Read all `topics/batch_{n}.yaml` outputs
- Merge all `assignments` lists into one `topics.yaml` in the working directory (see the sketch below)
- Deduplicate `__new__:` topics: same name across batches → keep as-is (natural merge); similar but differently-named proposals → flag to user for resolution
- Verify all insight_ids were assigned, and check the topic distribution is reasonable
- Stop and alert user if: many new topics proposed, odd distribution, or missing assignments
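A topic-assignment batch might look like this (hedged sketch; the `__new__:` prefix convention comes from this document, the remaining fields are assumed):

```yaml
# Hypothetical shape of tmp/mining_{identifier}/topics/batch_1.yaml.
assignments:
  - insight_id: ins_0042
    topic: error_handling              # existing library topic
  - insight_id: ins_0043
    topic: "__new__:retry_policies"    # proposed new topic, subject to dedup
```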
### Step 7: Dispatch Synthesis

```
python3 scripts/dispatch_synthesis.py {identifier}
```

Applies topic assignments and outputs synthesis Task prompts (one per topic).
After dispatch, validate:
- Verify `insights.yaml` was updated with topic assignments
- Check all validated insights have topics
- Review new topic files were created (for `__new__:` topics)
- Stop and alert user if: assignments missing, too many new topics, or odd groupings
### Step 8: Launch Synthesis Subagents

Launch synthesis Task prompts in parallel using the Task tool (one per topic).

Output: Updates `library/{topic}.yaml` directly (see the sketch after the checklist below).
After subagents complete, validate:
- Check each topic's library file was updated
- Verify YAML format is correct
- Review subagent summaries (preserved/updated/added counts)
- Spot-check 1-2 updated practices for quality
- Stop and alert user if: files weren't updated, format errors, or suspicious changes
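A library topic file might look roughly like this (hypothetical sketch; the real structure is owned by the synthesis prompts):

```yaml
# Hypothetical shape of library/error_handling.yaml; structure is assumed.
practices:
  - title: "Set explicit timeouts on outbound HTTP calls"
    guidance: "Pass an explicit timeout on every outbound call; do not rely on client defaults."
    source_insights: [ins_0042]        # provenance back to insights.yaml
```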
### Step 9: Verify Synthesis Quality
Check that:
- Existing practices were preserved appropriately
- New practices are well-written and actionable
- One-off patterns were filtered (not everything became a practice)
- Code examples are correct and follow codebase conventions
Stop and alert user if: practices were deleted without replacement, excessive additions, or empty library files.
### Step 10: Aggregate Synthesis

```
python3 scripts/aggregate_synthesis.py {identifier}
```

Marks all validated insights with topics as `synthesized`.
### Step 11: Build

```
python3 scripts/build_sections.py
```

Generates markdown files for the review skill.

Output: the configured `sections_output_dir`
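A minimal `config.yaml` sketch, assuming a layout (only `sections_output_dir` and the existence of scope-specific rules files are documented here; the paths and nesting are invented for illustration):

```yaml
# Hypothetical excerpt of config.yaml. Only sections_output_dir and the notion of
# scope-specific rules files appear in this document; everything else is assumed.
sections_output_dir: ../review/sections
scopes:
  backend:
    bugbot_rules: .cursor/backend/BUGBOT.md
```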
### Step 12: Verify

```
python3 scripts/mine.py status
python3 scripts/mine.py pending
```
Confirm:
- `status` shows insights as `synthesized`
- `pending` shows no remaining work
### Step 13: Build Review Rules

```
python3 scripts/build_bugbot.py
```

Produces Task prompts for generating bugbot rules from the library. Launch the Task prompts (one per scope). Each subagent reads the existing BUGBOT.md and library practices, then merges incrementally — adding rules for new practices, removing rules for deleted practices, and preserving unchanged rules verbatim.
Sections use `## {topic}` headings (matching library filenames) with `**{practice_title}**` rule keys. Related practices are synthesized into fewer condensed rules.

Targets: scope-specific rules files from `config.yaml`
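For illustration, a generated rules section might look like this (layout per the format described above; the rule text itself is hypothetical):

```markdown
## error_handling

**Set explicit timeouts on outbound HTTP calls**
Flag any outbound HTTP request constructed without an explicit timeout argument.
```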
After subagents complete, verify:
- Diff is minimal — only new/removed/updated rules, not full rewrites
- New rules are mechanical and actionable (not vague design guidance)
- No duplication with the root `.cursor/BUGBOT.md` (manually maintained cross-cutting rules)
## Status Commands

```
python3 scripts/mine.py status      # Overview: threads, insights, library
python3 scripts/mine.py pending     # What needs work at each stage
python3 scripts/mine.py for-topic X # All insights for topic X
```
## Data Locations

- Threads: `code_insights/threads.yaml`
- Insights: `code_insights/insights.yaml`
- Library: `code_insights/library/*.yaml`
- Working dir: `tmp/mining_{identifier}/`
## Architecture

```
User: /mine-best-practices --since 2024-01-01
  |
  v
mine.py --> Batch threads, output extraction prompts
  |
  v
Extraction subagents (parallel) --> batch_n.yaml
  |
  v
aggregate_extraction.py --> insights.yaml + validation prompts
  |
  v
Validation subagents (parallel) --> batch_n.yaml
  |
  v
aggregate_validation.py --> insights.yaml + topic prompt
  |
  v
Topic assignment subagent --> topics.yaml
  |
  v
dispatch_synthesis.py --> synthesis prompts (per topic)
  |
  v
Synthesis subagents (parallel) --> library/{topic}.yaml
  |
  v
[VERIFY: Check for anomalies]
  |
  v
aggregate_synthesis.py --> insights.yaml (status: synthesized)
  |
  v
build_sections.py --> sections/*.md
  |
  v
build_bugbot.py --> bugbot rules (via subagent)
```
## Notes

- Extraction filters out already-processed thread_ids
- Validation checks patterns against the current codebase
- Synthesis prioritizes recurring patterns over one-offs
- Library practices derive from `insights.yaml` (full provenance)