name: data-analysis description: End-to-end R data analysis workflow from exploration through regression to publication-ready tables and figures. disable-model-invocation: true argument-hint: "[dataset path or description of analysis goal]" allowed-tools: ["Read", "Grep", "Glob", "Write", "Edit", "Bash", "Task"]
Data Analysis
Run an end-to-end data analysis in R through 5 phases.
Phase 1: Setup and Data Loading
- Read R code conventions from
.claude/rules/r-code-conventions.md - Create script with proper header (title, author, purpose, inputs, outputs)
- Load packages with
library()(neverrequire()) - Set
set.seed()if stochastic elements are involved - Load data using relative paths
Phase 2: Exploratory Data Analysis
- Summary statistics (mean, sd, min, max, N)
- Missingness patterns
- Distributions of key variables
- Bivariate relationships
- Time patterns (if panel data)
- Group comparisons
- Save diagnostics to
output/diagnostics/
Phase 3: Main Analysis
- Use
fixestfor panel data with fixed effects (see/econometrics-rfor detailed patterns) - Use
lm/glmfor cross-section - Cluster standard errors appropriately
- Run multiple specifications (simple to complex)
- Report both statistical and economic significance
- For causal inference (IV, DiD, RDD), see
/causal-inference-rfor diagnostics and modern estimators - For spatial econometrics, see
/econometrics-rforspdep/spatialregpatterns
Phase 4: Publication-Ready Output
- Tables:
modelsummary(preferred) orstargazer, export as.texand.html - Figures:
ggplot2with project theme, explicit dimensions, white background, 300 DPI - Export figures as
.pdfand.png
Phase 5: Save and Review
saveRDS()for all computed objects- Run r-reviewer agent on generated script
- Address Critical/High issues
Script Structure Template
# ==============================================================================
# Title: [Analysis Name]
# Author: [Name]
# Date: [YYYY-MM-DD]
# Purpose: [What this script does]
# Inputs: [Data files]
# Outputs: [Tables, figures, RDS files]
# ==============================================================================
# 0. Setup ----
library(tidyverse)
library(fixest)
library(modelsummary)
# 1. Data Loading ----
# 2. Exploratory Analysis ----
# 3. Main Analysis ----
# 4. Tables and Figures ----
# 5. Export ----
All scripts saved to scripts/R/, all outputs to output/.