Generate Trigger Tests

Generate comprehensive trigger tests for agent skills to validate proper activation behavior. Creates test cases for all 4 trigger types.

Output Format

Output ONLY valid YAML with no markdown fences, no explanations, no commentary.

The YAML must follow this exact structure:

skill: <skill-name>

test_cases:
  # === EXPLICIT TESTS ===
  # Direct invocation with $ prefix - should always trigger
  - id: explicit-1
    name: "Descriptive name for this test"
    type: explicit
    prompt: "$<skill-name> <action>"
    expected: trigger

  # === IMPLICIT TESTS ===
  # Describes the need without naming the skill
  - id: implicit-1
    name: "Descriptive name for this test"
    type: implicit
    prompt: "<scenario description>"
    expected: trigger

  # === CONTEXTUAL TESTS ===
  # Realistic prompts with domain context
  - id: contextual-1
    name: "Descriptive name for this test"
    type: contextual
    prompt: "<realistic noisy prompt>"
    expected: trigger

  # === NEGATIVE TESTS ===
  # Should NOT trigger - adjacent but distinct requests
  - id: negative-1
    name: "Descriptive name for this test"
    type: negative
    prompt: "<related but different request>"
    expected: no_trigger

Test Case Fields

Field	Required	Description
`id`	Yes	Unique identifier following pattern `<type>-<number>` (e.g., explicit-1, negative-2)
`name`	Yes	Short human-readable label describing WHAT the test validates
`type`	Yes	One of: `explicit`, `implicit`, `contextual`, `negative`
`prompt`	Yes	The test prompt to send
`expected`	Yes	Either `trigger` or `no_trigger`

Trigger Type Definitions

Explicit Tests (generate 3)

Direct skill invocation using the $<skill-name> prefix.

Every prompt MUST start with $<skill-name>
Vary the commands/actions to cover different skill capabilities
If the skill content mentions slash commands (e.g., /review, /pr) or alternative invocation methods, include at least one explicit test for each
expected: trigger

Example for a "create-report" skill:

- id: explicit-1
  name: "Generate sales summary report"
  prompt: "$create-report generate a sales summary"
- id: explicit-2
  name: "Generate Q4 metrics report"
  prompt: "$create-report for Q4 metrics"

Implicit Tests (generate 3)

Describe the user's NEED without naming the skill. Tests whether the skill's name and description alone enable autonomous selection by an agent.

MUST NOT use the $<skill-name> prefix
MUST NOT mention the skill name
MUST NOT mention the specific technology, library, or framework by name (e.g., for an "ant-design-skill", do NOT say "Ant Design" or "antd" — describe the UI pattern instead)
Describe what the user wants to accomplish in natural language
Focus on the USER'S PROBLEM or goal, not the tool that solves it

GOOD implicit examples:

- id: implicit-1
  name: "Generate quarterly sales summary"
  prompt: "I need to generate a summary document of our quarterly sales data"
- id: implicit-2
  name: "Create project metrics document"
  prompt: "Can you create a summary document of the project metrics?"

BAD implicit examples (DO NOT generate triggers like these):

"Create a report using the reporting framework" — names the technology
"I need report generation with charts, tables, and PDF export" — paraphrases the SKILL.md bullet points instead of describing a real user need

Contextual Tests (generate 3)

The user is in the middle of a real task and the skill need emerges from the SITUATION. The prompt includes project details, constraints, and surrounding context with the core request embedded in a longer narrative.

Must include realistic project context, constraints, or workflow details
The skill need should be INFERRED from the situation, not explicitly stated
DO NOT just add extra words around an explicit or implicit trigger
DO NOT embed literal tool commands, CLI syntax, or API calls in the prompt
Think: "What would a real user say mid-project when they need this skill?"
expected: trigger

GOOD contextual example:

- id: contextual-1
  name: "Board meeting Q4 performance report"
  prompt: "I'm preparing for the board meeting next week. We need a comprehensive document showing our Q4 performance. Can you pull together the sales data and create something presentable?"

BAD contextual examples (DO NOT generate triggers like these):

"I'm working on the project. $create-report for Q4 sales data. Thanks." — just an explicit trigger with noise around it
"Run create-report --format=pdf --data=sales.csv for the Q4 summary" — embeds literal CLI commands

Negative Tests (generate 4)

Requests that are adjacent to the skill's domain but should NOT activate it. Essential for catching false positives and over-eager triggering.

At least 2 MUST be near-miss boundary tests from the SAME domain (adjacent but distinct actions)
Avoid obvious non-matches from completely unrelated domains
Test similar actions on different objects, or different actions on similar objects
expected: no_trigger

GOOD negative examples for a "create-report" skill:

- id: negative-1
  name: "Reading PDF (not creating)"
  prompt: "How do I read a PDF report?"
- id: negative-2
  name: "Deleting reports (not creating)"
  prompt: "Delete the old quarterly reports"
- id: negative-3
  name: "Asking about tools (informational)"
  prompt: "What's the best reporting tool?"

BAD negative example (DO NOT generate triggers like these):

"Set up a Kubernetes cluster" — completely unrelated domain, not a useful boundary test

Complete Example

For a skill named write-commit-message:

skill: write-commit-message

test_cases:
  # === EXPLICIT TESTS ===
  - id: explicit-1
    name: "Direct invocation for recent changes"
    type: explicit
    prompt: "$write-commit-message for the changes I just made"
    expected: trigger

  - id: explicit-2
    name: "Direct invocation with format preference"
    type: explicit
    prompt: "$write-commit-message with conventional format"
    expected: trigger

  - id: explicit-3
    name: "Direct invocation for staged files"
    type: explicit
    prompt: "$write-commit-message for the staged files"
    expected: trigger

  # === IMPLICIT TESTS ===
  - id: implicit-1
    name: "Write commit message for staged changes"
    type: implicit
    prompt: "I need to write a commit message for my staged changes"
    expected: trigger

  - id: implicit-2
    name: "Draft commit message with best practices"
    type: implicit
    prompt: "Help me draft a good commit message following best practices"
    expected: trigger

  - id: implicit-3
    name: "Describe changes for version control"
    type: implicit
    prompt: "I want to describe my code changes before saving them to version control"
    expected: trigger

  # === CONTEXTUAL TESTS ===
  - id: contextual-1
    name: "Auth module refactor with JWT details"
    type: contextual
    prompt: "I just finished refactoring the authentication module. The changes include updating the JWT validation and adding refresh token support. Can you help me write a proper commit message for this?"
    expected: trigger

  - id: contextual-2
    name: "PR with login bug fix"
    type: contextual
    prompt: "Working on a PR for the new feature. Need to commit these files but not sure how to phrase the message. It's a bug fix for the login flow."
    expected: trigger

  - id: contextual-3
    name: "Multi-file refactor needing description"
    type: contextual
    prompt: "I've been cleaning up the codebase all morning — renamed some variables, extracted a helper function, and fixed a typo in the README. How should I describe all of this?"
    expected: trigger

  # === NEGATIVE TESTS ===
  - id: negative-1
    name: "Ask how to run git commit"
    type: negative
    prompt: "How do I run git commit?"
    expected: no_trigger

  - id: negative-2
    name: "View git log history"
    type: negative
    prompt: "Show me the git log for this repository"
    expected: no_trigger

  - id: negative-3
    name: "Undo last commit"
    type: negative
    prompt: "Undo my last commit"
    expected: no_trigger

  - id: negative-4
    name: "Ask about conventional commit types"
    type: negative
    prompt: "What are the conventional commit types?"
    expected: no_trigger

Quality Checklist

Before outputting, verify:

Implicit prompts do NOT mention the skill name, $ prefix, OR technology/library/framework name
Contextual prompts describe a SITUATION, not just a request with extra words around it
At least 2 negatives are near-miss boundary tests from the same domain
All prompts sound like something a real user would actually say
The 4 categories are clearly differentiated — no two tests from different categories should be interchangeable

ナビゲーション

Skillsとは？

リンク

Generate Trigger Tests

Generate Trigger Tests

Output Format

Test Case Fields

Trigger Type Definitions

Explicit Tests (generate 3)

Implicit Tests (generate 3)

Contextual Tests (generate 3)

Negative Tests (generate 4)

Complete Example

Quality Checklist

関連スキル(🔧 開発ツール)