Comprehensive guide to data quality validation, testing frameworks, anomaly detection, and data observability for production data pipelines
Skills (SKILL.md) are configuration files that add specific capabilities to AI agents (Claude Code, Cursor, Codex, etc.).
Data discovery and analysis specialist focused on extracting actionable insights from complex datasets, identifying patterns and anomalies, and transforming raw data into strategic intelligence. Excels at multi-source data integration, advanced analytics, and data-driven decision support.
Expert data researcher specializing in discovering, collecting, and analyzing diverse data sources. Masters data mining, statistical analysis, and pattern recognition with focus on extracting meaningful insights from complex datasets to support evidence-based decisions.
Expert in statistical analysis, predictive modeling, machine learning, and data storytelling to drive business insights.
Create database seed scripts with realistic test data for development and testing. Use when setting up development environment or creating demo data.
data-visualizer
Expert for developing Streamlit data apps for Keboola deployment. Activates when building, modifying, or debugging Keboola data apps, Streamlit dashboards, adding filters, creating pages, or fixing data app issues. Validates data structures using Keboola MCP before writing code, tests implementations with Playwright browser automation, and follows SQL-first architecture patterns.
Database architecture and design specialist. Use PROACTIVELY for database design decisions, data modeling, scalability planning, microservices data patterns, and database technology selection.
Back up the database before tests, migrations, or other database operations
Implement backup and restore strategies for disaster recovery. Use when creating backup plans, testing restore procedures, or setting up automated backups.
Configures and manages database connection pools, with a focus on SQLAlchemy. Use this skill for tasks involving connection pool configuration, sizing, lifecycle management, health checks, and monitoring. It provides specialized guidance for both traditional server-based databases and serverless databases like Neon and PlanetScale.
Specialized database operations for Chuukese language data including dictionary management, phrase collections, translation pairs, and linguistic metadata. Use when working with Chuukese language databases, managing translation data, or performing database operations on linguistic datasets.
Comprehensive database management workflow that orchestrates database architecture, schema design, performance optimization, and data governance. Handles everything from database design and implementation to performance tuning, backup strategies, and data migration.
Database schema migration patterns for Aurora MySQL including reconciliation migrations, idempotent operations, and MySQL-specific gotchas.
Expert database optimizer specializing in modern performance
Resets local development database by deleting all data and restarting API container to trigger auto-seeding. SINGLE SOURCE OF TRUTH for dev database reset automation.
Test migrations, integrity, and query performance.
Database schema validation, data integrity testing, migration testing, transaction isolation, and query performance. Use when testing data persistence, ensuring referential integrity, or validating database migrations.
Use any time database-related activity is required.
Language-agnostic database best practices covering migrations, schema design, ORM patterns, query optimization, and testing strategies. Activate when working with database files, migrations, schema changes, SQL, ORM code, database tests, or when user mentions migrations, schema design, SQL optimization, NoSQL, database patterns, or connection pooling.
Databricks development guidance including Python SDK, Databricks Connect, CLI, and REST API. Use when working with databricks-sdk, databricks-connect, or Databricks APIs.
Kailash DataFlow - zero-config database framework with automatic model-to-node generation. Use when asking about 'database operations', 'DataFlow', 'database models', 'CRUD operations', 'bulk operations', 'database queries', 'database migrations', 'multi-tenancy', 'multi-instance', 'database transactions', 'PostgreSQL', 'MySQL', 'SQLite', 'MongoDB', 'pgvector', 'vector search', 'document database', 'RAG', 'semantic search', 'existing database', 'database performance', 'database deployment', 'database testing', or 'TDD with databases'. DataFlow is NOT an ORM - it generates 11 workflow nodes per SQL model, 8 nodes for MongoDB, and 3 nodes for vector operations.
Use when developing BigQuery Dataform transformations, SQLX files, source declarations, or troubleshooting pipelines - enforces TDD workflow (tests first), ALWAYS use ${ref()} never hardcoded table paths, comprehensive columns:{} documentation, safety practices (--schema-suffix dev, --dry-run), proper ref() syntax, .sqlx for new declarations, no schema config in operations/tests, and architecture patterns that prevent technical debt under time pressure
Extracts key specifications from component datasheet PDFs for maker projects. Use when user shares a datasheet PDF URL, asks about component specs, needs pin assignments, I2C addresses, timing requirements, or register maps. Downloads and parses PDF to extract essentials. Complements datasheet-parser for quick lookups.
DAW-specific quirks, known issues, and workarounds for Logic Pro, Ableton Live, Pro Tools, Cubase, Reaper, FL Studio, Bitwig with format-specific requirements (AU/VST3/AAX). Use when troubleshooting DAW compatibility, fixing host-specific bugs, implementing DAW workarounds, passing auval validation, or debugging automation issues.
This skill should be used when seeding databases with realistic fake data for development, testing, or staging environments. Supports PostgreSQL, MySQL, SQLite, MongoDB with ORM-based seeding (SQLAlchemy, Django, Prisma) and Faker library for generating realistic test data. Use when the user needs to populate databases with sample data, create test fixtures, or set up development/staging environments with realistic data.
SQLite database management with Prisma ORM, type-safe queries, and Railway deployment with Litestream backup. This skill should be used when creating database schemas, writing migrations, managing SQLite on Railway volumes, or troubleshooting database issues.
DBOS durable execution patterns and CRITICAL constraints for ChainGraph executor. Use when working on workflows, steps, execution, or any DBOS-related code. Contains MUST-FOLLOW constraints about what can be called from workflows vs steps. Triggers: dbos, workflow, step, durable, execution, startWorkflow, writeStream, recv, send, runStep, atomic, checkpoint, WorkflowQueue, queue, cancelWorkflow, Promise.allSettled. (project)
Complete guide for dbt data transformation including models, tests, documentation, incremental builds, macros, packages, and production workflows
PROACTIVE skill - STOP and invoke BEFORE writing dbt SQL. Validates models against coding conventions for staging, integration, and warehouse layers. Covers naming, SQL structure, field conventions, testing, and documentation. CRITICAL - When about to write .sql files in models/, invoke this skill first, write second. Supports project-specific convention overrides and sqlfluff integration.
Create dbt models following FF Analytics Kimball patterns and 2×2 stat model. This skill should be used when creating staging models, core facts/dimensions, or analytical marts. Guides through model creation with proper grain, tests, External Parquet configuration, and per-model YAML documentation using dbt 1.10+ syntax.
Comprehensive guide to dbt (data build tool) patterns, modeling best practices, testing strategies, and production workflows for modern data transformation
Transform AI agents into experts on dbt and Snowflake performance optimization, providing guidance
Provides expert-level assistance with dbt Semantic Layer, MetricFlow, semantic models, metrics, dimensions, entities, measures, and BI tool integrations. Use this skill when building semantic models, creating metrics (simple, ratio, cumulative, derived, conversion), debugging validation errors, or integrating with BI tools. Extracted from official dbt documentation and optimized for data practitioners.
ALWAYS USE when working with dbt models, SQL transformations, tests, snapshots, or macros. Use IMMEDIATELY when editing dbt_project.yml, profiles.yml, or creating SQL models. MUST be loaded before any transform-layer work. Enforces dbt owns SQL principle - never parse, validate, or transform SQL in Python.
Expert in D-Bus IPC (Inter-Process Communication) on Linux systems. Specializes in secure service communication, method calls, signal handling, and system integration. HIGH-RISK skill due to system service access and privileged operations.
Strategic and Tactical expertise in Gravito DDD. Trigger this for complex domains requiring Bounded Contexts, Aggregates, and Event-Driven architecture.
Create exhaustive tests for DDD bounded contexts following a TDD (Test-Driven Development) approach with strict coverage standards.
Conduct authorized denial of service testing to assess network resilience and configure intrusion detection systems (IDS) to detect and alert on various DoS attack patterns. This skill covers volume-b
End-to-end associate workflow with time-boxed gates: thesis -> sourcing -> meetings -> diligence -> memo, ending with either IC-ready memo or explicit kill decision. Use when you need to run the full pipeline for a sector or a specific deal.
Wind/Wall/Door multi-perspective debate orchestration using debate-hall-mcp tools. Use when facilitating structured debates, architectural decisions, or multi-perspective analysis.
Start structured red vs. blue team debates via subagents. Use when exploring a topic from multiple adversarial perspectives.
Structured multi-perspective debate for important architectural decisions and complex trade-offs
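Investigates GitHub Actions workflow run errors, identifies the root cause, and proposes a solution. Triggered by phrases such as "Actions error", "workflow failure", "CI failed", "build failure", "test failure", "check Actions", or "look at the CI error". Analyzes the logs of failed jobs and presents concrete fixes.
Diagnoses problems with plot scales (such as logarithmic axes), ticks, and display ranges, and corrects them to the intended appearance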
Configures and builds YARS with debug symbols for debugging with valgrind or gdb
Use when users need to debug, modify, or extend the code-forge application's CLI commands, argument parsing, or CLI behavior. This includes adding new commands, fixing CLI bugs, updating command options, or troubleshooting CLI-related issues.
Debug web app by capturing and analyzing console errors using Chrome DevTools MCP
Universal PDCA debugging framework for systematic hypothesis verification. Use when debugging issues that require structured investigation, observing runtime behavior, or verifying fixes through iterative testing.
Debug FFmpeg integration and video/audio processing issues. Use when the user encounters FFmpeg errors, audio extraction problems, codec issues, or video processing failures.