Skills リサーチ

Skills Research

ナビゲーション

Skillsとは？

Skills（SKILL.md）は、AIエージェント（Claude Code、Cursor、Codexなど）に特定の能力を追加するための設定ファイルです。

詳しく見る →

リンク

Classifier-Style Agents — AI Skill Search

検索に戻る

Classifier-Style Agents

https://github.com/dcramer/peated

★90更新: 2週間前Claude Code🔧 開発ツールコードレビューコードレビュー自動化テストドキュメント Git・PR デバッグEnglish

Use this reference for agents that extract structure, rank candidates, classify actions, route work, moderate content, or gate automation.

Classifier-Style Agents

Use this reference for agents that extract structure, rank candidates, classify actions, route work, moderate content, or gate automation.

Review Sequence

Load the domain contract before reviewing prompt wording.
Separate deterministic stages from model-driven stages.
Check retrieval or candidate quality before blaming the model.
Check thresholds, post-model validation, and human-review boundaries.
Require slice-based evals before recommending larger architecture changes.

If the system has no written schema, taxonomy, or action set, flag that gap first.

Common Failure Types

false positive final action
false negative or missed match
wrong abstain or over-escalation
bad candidate set or missing evidence
schema drift or normalization mismatch
overconfident outputs on weak evidence
threshold regression after prompt changes
cost or latency regressions from unnecessary tool use

Design Rules

Keep the action set explicit and mutually exclusive.
Add an abstain, no_match, or manual-review path when evidence can be weak.
Keep candidate sets structured and comparable on decisive fields.
Bias the runtime against costly false positives.
Make confidence meaningful only if it drives thresholds or downstream policy.
Validate model outputs after generation, not only in the prompt.

Bottleneck Clues

Symptom	Likely bottleneck
The model chooses the wrong item from a good candidate set	Decision policy, prompt clarity, tool descriptions, or calibration
The right item is not present when the model decides	Retrieval, normalization, or candidate generation
Model output looks good but persisted state is wrong	Output schema, validation, or integration bugs
Confidence is high on weak evidence	Thresholding, confidence semantics, or missing abstain rules
Queue load spikes after an "accuracy" change	Threshold regression, auto-action policy, or precision/recall imbalance

Repo Anchor: Peated Bottle Matcher

When the task is about the current bottle matcher or label extractor, read:

docs/development/schema-conventions.md
apps/server/src/agents/whisky/guidance.ts
apps/server/src/agents/priceMatch/classifyStorePriceMatch.ts
apps/server/src/lib/priceMatchingProposals.ts
apps/server/src/schemas/priceMatches.ts
apps/server/src/lib/priceMatching.test.ts
apps/server/src/schemas/priceMatches.test.ts

Then check:

extraction conservatism: prefer null or [] over guessing
decisive identity fields: producer, distillery, expression, series, edition, age, cask flags, ABV, and years
candidate generation before web search
action set boundaries: match_existing, correction, create_new, no_match
confidence normalization and automation thresholds
server-side sanitization of ids and proposed entities
non-whisky rejection and human-review boundaries

Eval Minimum

Require:

confusion-style breakdown by action or class
hard-slice examples for decisive error modes
trace or tool-call review
cost, latency, and tool-usage metrics
before vs after comparison for proposed changes

GitHub で開く Raw を見るマーケットで出品する →

関連スキル(🔧 開発ツール)

blacksmith-testbox

Run Blacksmith Testbox for CI-parity checks, secrets, hosted services, migration

openclaw-ghsa-maintainer

Inspect, patch, validate, publish, or confirm OpenClaw GHSA security advisories

openclaw-parallels-smoke

Run, rerun, debug, or interpret OpenClaw Parallels install, onboarding, gateway

openclaw-pr-maintainer

Review, triage, close, label, comment on, or land OpenClaw PRs/issues with maint

openclaw-qa-testing

Run, watch, debug, extend, or explain OpenClaw qa-lab and qa-channel scenarios,

開発ツールのスキルをもっと見る →