name: stata-accounting-research description: | STATA code pattern library for empirical archival accounting research. Provides tested syntax from 126 peer-reviewed JAR (Journal of Accounting Research) replication files (2017-2025). Use when the user asks procedural questions like "How do I implement [method]?" or "Show me code for [technique]" — including: entropy balancing, propensity score matching (PSM), difference-in-differences (DiD), regression discontinuity (RDD), instrumental variables (IV), event studies (CAR/BHAR), survival analysis, Fama-MacBeth regressions, bootstrap, quantile regression, reghdfe/xtreg/areg, clustering standard errors, fixed effects, esttab/outreg2 table formatting, winsorization, leads/lags. Users can specify their variables (e.g., treatment, outcomes, controls) and receive adapted syntax. NOTE: This skill provides code patterns from published papers, not research design advice.
Scope and Limitations
This skill is a code pattern library, not a methodological advisor.
| Can Do | Cannot Do |
|---|---|
| Show how published papers implemented methods | Explain when to use one method over another |
| Provide tested STATA syntax | Advise on identification strategy |
| Indicate which robustness tests accompany analyses | Discuss research design trade-offs |
| Cite source papers for code patterns | Recommend optimal research design |
When users ask methodology questions (e.g., "Should I use entropy balancing or PSM?", "How do I address endogeneity?", "Is my identification strategy valid?"):
- Acknowledge the limitation: "This skill provides code patterns from published papers, not research design guidance."
- Show how different papers approached similar problems (code examples)
- Suggest consulting methodology references: Breuer & deHaan (2024) for fixed effects, Angrist & Pischke for causal inference, or the user's methodologist/advisor
- Offer to show multiple implementations so the user can see variation in approaches
Workflow
Use references/REFERENCES.md as the primary index, then read targeted .do files.
Stage 1: Index Search
Search references/REFERENCES.md to identify relevant papers. The index contains structured metadata:
- Primary Method: STATA commands used (reghdfe, psmatch2, stcox, etc.)
- Identification Strategy: DiD, PSM, IV, RDD, Event Study, etc.
- Robustness/Special Features: Winsorization levels, clustering specs, placebo tests, etc.
Example queries on REFERENCES.md:
- "entropy balancing" → finds JAR_60_alv, JAR_60_bl, JAR_61_ds, JAR_62_5_llz, JAR_63_2_npstv
- "stacked DiD" → finds JAR_61_ds, JAR_62_5_aov, JAR_62_5_gibbons
- "Cox hazard" → finds JAR_59_ctv, JAR_62_2_xyz
Stage 2: Code Extraction
Read only the identified .do files to extract actual syntax. This reduces context usage and improves accuracy.
Stage 3: Adaptation and Citation
- Adapt patterns to the user's variable names and research context
- Cite source: "Based on [Authors] ([Year]), JAR Volume"
Fallback: Direct Grep Patterns
For very specific syntax queries (e.g., "how does absorb() handle singletons?"), grep .do files directly:
| Task | Grep Pattern |
|---|---|
| Panel regressions | reghdfe|xtreg|areg |
| Fixed effects | absorb\(|i\.year|i\.firm |
| Clustering | cluster\(|vce\(cluster |
| Matching/PSM | psmatch2|teffects|cem|ebalance|pscore |
| IV regression | xtivreg|ivregress|ivreg2 |
| DiD | post.*treat|treat.*post|parallel.*trend |
| RDD | rdrobust|rddensity |
| Event studies | CAR|BHAR|abnormal.*return |
| Survival | stcox|streg|stset |
| Fama-MacBeth | fama.?macbeth|newey.*west |
| Bootstrap | bootstrap|bsample |
| Quantile regression | qreg|sqreg|bsqreg |
| Table output | esttab|outreg2|eststo |
| Winsorization | winsor|winsor2 |
Corpus Overview
126 STATA .do files from JAR Volumes 55-63 (2017-2025). See references/REFERENCES.md for complete catalog with paper titles and authors.
File Naming Convention
- V55-61:
JAR_{volume}_{shortcode}.do - V62-63:
JAR_{volume}_{issue}_{shortcode}_{authors}.do
Volume Coverage
| Volume | Year | Papers |
|---|---|---|
| 55 | 2017 | 9 |
| 56 | 2018 | 12 |
| 57 | 2019 | 9 |
| 58 | 2020 | 13 |
| 59 | 2021 | 4 |
| 60 | 2022 | 22 |
| 61 | 2023 | 22 |
| 62 | 2024 | 25 |
| 63 | 2025 | 10 |
Standard Patterns
Clustering and Fixed Effects
* Firm and year FE with firm-clustered SEs (most common)
reghdfe depvar indepvar controls, absorb(firm year) cluster(firm)
* Industry-year FE
reghdfe depvar indepvar controls, absorb(ind_year) cluster(firm)
Output Conventions
eststo clear
eststo: reghdfe depvar indepvar controls, absorb(firm year) cluster(firm)
esttab using "table.tex", replace star(* 0.10 ** 0.05 *** 0.01) se
Winsorization
winsor2 varlist, cuts(1 99) replace