Use when extracting structured data from medical research PDFs, parsing study characteristics, patient demographics, outcomes, and results. Invoke for systematic review data collection from papers.
Skills (SKILL.md) are configuration files that add specific capabilities to AI agents such as Claude Code, Cursor, and Codex.
Rooms as pipeline nodes, exits as edges, objects as messages
Professional skill for Claude Code for accessing, downloading, and analyzing French open data via data.gouv.fr. Includes a complete Python library, code examples, and detailed documentation of the most commonly used datasets.
Data governance strategy, quality validation rules, and data dictionary management for vehicle insurance platform. Use when defining data quality standards, implementing validation rules, managing field mappings, resolving data conflicts, or establishing data governance processes. Covers data cleaning standards, quality metrics, and mapping management.
Use when defining data contracts, consent policies, and monitoring for
Procedures and playbooks for responding to data quality incidents, data loss, corruption, and pipeline failures.
Build new data ingestion providers following the FF Analytics registry pattern. This skill should be used when adding new data sources (APIs, files, databases) to the data pipeline. Guides through creating provider packages, registry mappings, loader functions, storage integration, primary key tests, and sampling tools following established patterns.
Provides architectural guidance for data lake design including partitioning strategies, storage layout, schema design, and lakehouse patterns. Activates when users discuss data lake architecture, partitioning, or large-scale data organization.
Organize object storage with lifecycle policies.
Data Lake architecture and management including medallion architecture (bronze/silver/gold zones), data catalog with AWS Glue, partitioning strategies, schema evolution, data quality, governance, cost optimization, S3 lifecycle policies, data retention, compliance, query optimization with Athena, data formats (Parquet, ORC, Avro), incremental processing, CDC patterns, and production best practices for scalable data lakes.
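As a sketch of the lifecycle-policy side of this, the boto3 call below transitions a bronze-zone prefix to Glacier after 90 days and expires it after a year; the bucket name, prefixes, and retention windows are illustrative assumptions, not part of the skill.

```python
# Minimal sketch of an S3 lifecycle rule for a medallion-style lake.
# Bucket name, prefixes, and day counts are hypothetical.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-data-lake",          # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "bronze-to-glacier",
                "Filter": {"Prefix": "bronze/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```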
Two-layer architecture design with a Bronze Layer (LLM extraction log layer) and a Gold Layer (confirmed data layer). Keeps a history of LLM extraction results and protects human corrections from being overwritten. Provides guidance on implementing extraction processing, using ExtractionLog, and handling the is_manually_verified flag.
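A minimal sketch of that two-layer shape, assuming plain dataclasses; ExtractionLog and is_manually_verified come from the description above, while every other name and field is an illustrative assumption.

```python
# Hypothetical shape of the bronze/gold pattern described above.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ExtractionLog:            # Bronze layer: append-only LLM extraction history
    record_id: str
    extracted_payload: dict
    model_name: str
    extracted_at: datetime = field(default_factory=datetime.utcnow)

@dataclass
class GoldRecord:               # Gold layer: confirmed data
    record_id: str
    payload: dict
    is_manually_verified: bool = False

def apply_extraction(gold: GoldRecord, log: ExtractionLog) -> GoldRecord:
    # Never overwrite a human-corrected record with a fresh LLM extraction.
    if gold.is_manually_verified:
        return gold
    return GoldRecord(record_id=gold.record_id, payload=log.extracted_payload)
```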
Mapping the flow of data from source to destination for transparency, impact analysis, and troubleshooting.
Service-scoped data orchestration for TMNL. Invoke when implementing search, data streams, kernel systems, or Effect-based DAQ. Covers hybrid dispatch (fibers + workers), Atom-as-State pattern, and progressive streaming.
Data mapping patterns for transforming API responses to internal types
Expert-level data mesh architecture, domain-oriented ownership, data products, federated governance, and self-serve platforms
Metabase REST API automation and troubleshooting: authenticate (API key preferred, session fallback), export/upsert questions (cards) and dashboards, standardize visualization_settings, and run/export results.
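A minimal sketch of the API-key path against a standard Metabase instance; the base URL, key, and card id below are placeholders, and /api/card/:id/query/json is assumed to be the JSON result-export route.

```python
# Sketch of API-key auth plus listing and running cards; values are placeholders.
import requests

BASE = "https://metabase.example.com"           # hypothetical instance
HEADERS = {"x-api-key": "mb_api_key_..."}       # API key preferred over session auth

# List saved questions (cards)
cards = requests.get(f"{BASE}/api/card", headers=HEADERS, timeout=30).json()

# Run one card and export its results as JSON rows
rows = requests.post(
    f"{BASE}/api/card/42/query/json", headers=HEADERS, timeout=120
).json()
```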
Plans and executes data migrations between systems, databases, and formats
You are a database migration expert specializing in safe schema changes and data migrations. Your goal is to ensure migrations are safe, reversible, and won't corrupt production data.
Plan and execute database migrations, data transformations, and system migrations safely with rollback strategies and data integrity validation. Use when migrating databases, transforming data schemas, moving between database systems, implementing versioned migrations, handling data transformations, ensuring data integrity, or planning zero-downtime migrations.
Create safe, reversible database migration scripts with rollback capabilities, data validation, and zero-downtime deployments. Use when changing database schemas, migrating data between systems, or performing large-scale data transformations.
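As a sketch of what "reversible with rollback" can look like in practice, assuming Alembic: every upgrade() has a matching downgrade(), and the new column is added as nullable so it can ship without downtime. Table and column names are illustrative.

```python
# Minimal Alembic-style reversible migration; names are hypothetical.
from alembic import op
import sqlalchemy as sa

revision = "a1b2c3d4"
down_revision = None

def upgrade():
    # Additive, nullable change: safe while old application code still runs.
    op.add_column("accounts", sa.Column("display_name", sa.String(255), nullable=True))

def downgrade():
    # Exact inverse of upgrade(), so the migration rolls back cleanly.
    op.drop_column("accounts", "display_name")
```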
Design data models with Pydantic schemas, comprehensive validation rules,
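A minimal Pydantic v2 sketch of such a model; the field names and constraints are assumptions for illustration.

```python
# Sketch of a validated data model with field constraints and a custom validator.
from pydantic import BaseModel, Field, field_validator

class PolicyRecord(BaseModel):
    policy_id: str = Field(min_length=1)
    premium: float = Field(gt=0)
    region: str

    @field_validator("region")
    @classmethod
    def normalize_region(cls, v: str) -> str:
        # Reject blank regions and normalize casing at the model boundary.
        if not v.strip():
            raise ValueError("region must not be blank")
        return v.strip().upper()

record = PolicyRecord(policy_id="P-001", premium=129.50, region=" eu-west ")
```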
Data modeling with Entity-Relationship Diagrams (ERDs), data dictionaries, and conceptual/logical/physical models. Documents data structures, relationships, and attributes.
Collects excavation survey materials (papers, reports, nearby archaeological sites) and normalizes their metadata.
A workflow that improves Jupyter Notebook code quality, documentation, and execution stability through standardized procedures.
Coordinates data pipeline tasks (ETL, analytics, feature engineering). Use when implementing data ingestion, transformations, quality checks, or analytics. Applies data-quality-standard.md (95% minimum).
Implements data persistence systems including DataStore patterns, session locking, data migration, error handling, and backup systems. Use when saving player progress, inventory, settings, or any persistent data.
Build orchestration pipelines with idempotency.
Expert data engineer for ETL/ELT pipelines, streaming, data warehousing. Activate on: data pipeline, ETL, ELT, data warehouse, Spark, Kafka, Airflow, dbt, data modeling, star schema, streaming data, batch processing, data quality. NOT for: API design (use api-architect), ML training (use ML skills), dashboards (use design skills).
Develop and manage data ingestion, processing, and transformation pipelines for pilot projects. Use when automating ETL workflows, integrating new data sources, or building canonical datasets to support downstream analytics.
Design and troubleshoot robust data pipelines with comprehensive quality validation, error handling, and monitoring capabilities for bioinformatics and data processing workflows
Monitor and troubleshoot dual-pipeline data collection systems on GCP. This skill should be used when checking pipeline health, viewing logs, diagnosing failures, or monitoring long-running operations for data collection workflows. Supports Cloud Run Jobs (batch pipelines) and VM systemd services (real-time streams).
Process data files through transformation pipelines with validation, cleaning, and export. Use for CSV/Excel/JSON data processing, encoding handling, batch operations, and data transformation workflows.
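A minimal sketch of the load, clean, and export flow, assuming pandas; the file names, encoding, and cleaning rules are illustrative.

```python
# Sketch of a CSV -> clean -> JSON/Excel pipeline; paths and rules are hypothetical.
import pandas as pd

df = pd.read_csv("input.csv", encoding="utf-8-sig")   # handles BOM-prefixed exports

# Basic cleaning: trim whitespace in string columns, drop exact duplicates
for col in df.select_dtypes(include="object").columns:
    df[col] = df[col].str.strip()
df = df.drop_duplicates()

df.to_json("output.json", orient="records", force_ascii=False)
df.to_excel("output.xlsx", index=False)
```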
Orchestrate marketing data collection, transformation, and reporting workflows. Use when relevant to the task.
Load and preprocess imaging mass cytometry (IMC) and MIBI data. Covers MCD/TIFF handling, hot pixel removal, and image normalization. Use when starting IMC analysis from raw MCD files or preparing images for segmentation.
Detect PII, mask data, and manage consent and encryption.
Explains Polibase's data processing workflows and pipelines. Activates when you need to understand the processing flows, dependencies, and execution order for meeting-minutes processing, web scraping, politician data collection, speaker matching, and similar tasks.
Process JSON with jq and YAML/TOML with yq. Filter, transform, query structured data efficiently. Triggers on: parse JSON, extract from YAML, query config, Docker Compose, K8s manifests, GitHub Actions workflows, package.json, filter data.
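One way to drive jq and yq from a script is to shell out to them; the sketch below assumes both tools are on PATH, and the filters and file names are illustrative.

```python
# Sketch of invoking jq and yq; files and filters are hypothetical examples.
import subprocess

# jq: list top-level dependency names from package.json
deps = subprocess.run(
    ["jq", "-r", ".dependencies | keys[]", "package.json"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

# yq (mikefarah): list service names from a Docker Compose file
services = subprocess.run(
    ["yq", ".services | keys | .[]", "docker-compose.yml"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()
```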
Data product design patterns with contracts, SLAs, and governance for building self-serve data platforms using Data Mesh principles.
Profile datasets to understand schema, quality, and characteristics. Use when analyzing data files (CSV, JSON, Parquet), discovering dataset properties, assessing data quality, or when user mentions data profiling, schema detection, data analysis, or quality metrics. Provides basic and intermediate profiling including distributions, uniqueness, and pattern detection.
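A minimal profiling sketch with pandas, covering schema, null rates, uniqueness, and basic distributions; the input path is an assumption.

```python
# Sketch of basic dataset profiling; the file path is hypothetical.
import pandas as pd

df = pd.read_parquet("dataset.parquet")

profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_pct": df.isna().mean().round(3),
    "unique": df.nunique(),
})
print(profile)
print(df.describe(include="all").T)   # per-column distribution summary
```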
Assess data quality with checks for missing values, duplicates, type issues, and inconsistencies. Use for data validation, ETL pipelines, or dataset documentation.
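A minimal sketch of assertion-style checks that fail fast before analysis; the column names and rules are illustrative assumptions.

```python
# Sketch of fail-fast data quality checks; columns and thresholds are hypothetical.
import pandas as pd

def check_quality(df: pd.DataFrame) -> list[str]:
    issues = []
    if df["id"].isna().any():
        issues.append("missing values in id")
    if df["id"].duplicated().any():
        issues.append("duplicate ids")
    if not pd.api.types.is_numeric_dtype(df["amount"]):
        issues.append("amount is not numeric")
    elif (df["amount"] < 0).any():
        issues.append("negative amounts")
    return issues

issues = check_quality(pd.read_csv("orders.csv"))
if issues:
    raise ValueError(f"data quality check failed: {issues}")
```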
Comprehensive guide to data quality validation, testing frameworks, anomaly detection, and data observability for production data pipelines
Enforce data quality rules and validations on pilot data streams and repositories. Use when checking for missing values, schema compliance, consistency issues, or anomalies before analysis and reporting.
Implement validation, profiling, and anomaly detection.
Generate comprehensive dbt test suites following FF Analytics data quality standards and dbt 1.10+ syntax. This skill should be used when creating tests for new dbt models, adding tests to existing models, standardizing test coverage, or implementing data quality gates. Covers grain uniqueness, FK relationships, enum validation, and freshness tests.
Systematic framework for catching data quality issues, query errors, metric calculation problems, and inconsistencies before they affect analysis results.
Write and verify SQL queries with BigQuery. Use when executing bq commands, writing SQL queries, or including query results in documents.
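The skill centers on the bq CLI; as one equivalent route, the sketch below runs a query through the BigQuery Python client and renders the result for pasting into a document. The project, dataset, and table names are illustrative.

```python
# Sketch of running a BigQuery query and formatting the result; names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")
sql = """
    SELECT status, COUNT(*) AS n
    FROM `example-project.analytics.events`
    GROUP BY status
    ORDER BY n DESC
"""
df = client.query(sql).result().to_dataframe()
print(df.to_markdown(index=False))   # easy to include in a report
```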
Set up database replication for high availability and disaster recovery. Use when configuring master-slave replication, multi-master setups, or replication monitoring.
Data discovery and analysis specialist focused on extracting actionable insights from complex datasets, identifying patterns and anomalies, and transforming raw data into strategic intelligence. Excels at multi-source data integration, advanced analytics, and data-driven decision support.