---
name: pyfixest
description: >-
  Fast high-dimensional fixed effects: OLS, Poisson, IV with multi-way FE;
  DiD (TWFE, did2s, Sun-Abraham); clustered SEs; etable/coefplot/iplot.
  Use for FE regressions or DiD. For panel RE/between use linearmodels;
  for GLM without FE use statsmodels.
metadata:
  audience: research-coders
  domain: python-library
  library-version: "0.40.0"
  skill-last-updated: "2026-03-27"
---
pyfixest Skill
pyfixest: fast high-dimensional fixed effects estimation for Python. Covers OLS, Poisson, and IV regression with multi-way fixed effects; difference-in-differences estimators (TWFE, did2s, lpdid, Sun-Abraham); clustered standard errors; wild bootstrap; and publication output (etable regression tables, coefplot, iplot event study plots). Use when running fixed effects regressions, difference-in-differences designs, Poisson count models with FE, or producing publication-ready regression tables. For panel random/between effects, use linearmodels; for GLM/time series without FE, use statsmodels.
Comprehensive skill for fixed effects regression, instrumental variables, and difference-in-differences estimation with pyfixest. Use the decision trees below to find the right guidance, then load the detailed reference files.
What is pyfixest?
pyfixest is a Python implementation of the R fixest package (Berge, Butts, & McDermott, 2026):
- Fast: Multi-way FE demeaning via alternating projections with numba/JAX/GPU backends
- Concise formula syntax: Fixed effects after `|`, IV after second `|`, multiple estimation via `sw()`/`csw()`
- Modern DiD: Built-in did2s, local projections DiD (lpdid), and Sun-Abraham saturated estimator
- Flexible inference: Switch SE types post-estimation; wild bootstrap, randomization inference, CCV
- Publication output: `etable()` for regression tables, `coefplot()` and `iplot()` for coefficient visualization
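To make the "alternating projections" idea in the first bullet concrete, here is a minimal NumPy sketch that demeans a variable by two fixed effects. It illustrates the general technique only and is not pyfixest's implementation; the function name `demean_two_way` and its parameters are hypothetical.

```python
import numpy as np


def demean_two_way(x, fe1, fe2, tol=1e-10, max_iter=1000):
    """Remove two sets of group means from x by alternating projections.

    Hypothetical helper for illustration; not part of pyfixest.
    """
    x = np.asarray(x, dtype=float).copy()
    # Encode each fixed effect as consecutive integer codes 0..G-1.
    codes = [np.unique(fe, return_inverse=True)[1] for fe in (fe1, fe2)]
    for _ in range(max_iter):
        x_old = x.copy()
        for c in codes:
            # Project out the group means of the current fixed effect.
            group_means = np.bincount(c, weights=x) / np.bincount(c)
            x -= group_means[c]
        # Stop once a full sweep no longer changes x meaningfully.
        if np.max(np.abs(x - x_old)) < tol:
            break
    return x


rng = np.random.default_rng(0)
fe1 = rng.integers(0, 5, size=200)
fe2 = rng.integers(0, 4, size=200)
x = rng.normal(size=200) + fe1 + 0.5 * fe2
x_tilde = demean_two_way(x, fe1, fe2)
```

After convergence, `x_tilde` has (numerically) zero mean within every level of both fixed effects, which is what lets the slope coefficients be estimated without fitting one dummy per group.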
Version Notes
This skill targets pyfixest 0.40.0, the major release aligning with R fixest 0.13. Breaking changes from earlier versions:
- Default standard errors changed from "cluster by first FE" to `"iid"`: old code silently produces different SEs
- `ssc()` arguments renamed: `adj` → `k_adj`, `fixef_k` → `k_fixef`, `cluster_adj` → `G_adj`, `cluster_df` → `G_df`
- `fixef_rm` default changed from `"none"` to `"singleton"`: singletons are now dropped by default
- Multicollinearity tolerance changed from 1e-10 to 1e-09
How to Use This Skill
Reference File Structure
Each topic in ./references/ contains focused documentation:
| File | Purpose | When to Read |
|---|---|---|
| `quickstart.md` | Installation, first regression, formula syntax | Starting with pyfixest |
| `fixed-effects.md` | Multi-way FE, SE types, clustering, wild bootstrap | FE models and inference |
| `instrumental-variables.md` | IV syntax, first stage, weak instruments | IV/2SLS estimation |
| `difference-in-differences.md` | TWFE, did2s, lpdid, Sun-Abraham, event studies | DiD designs |
| `tables-and-plots.md` | etable, coefplot, iplot, dtable | Reporting results |
| `advanced-inference.md` | Wild bootstrap, randomization inference, MHT corrections, Gelbach | Advanced statistical inference |
| `integration.md` | Multiple estimation, Poisson, GLM, marginaleffects, online learning | Advanced features |
| `gotchas.md` | Common errors, v0.40 breaking changes, fixest vs pyfixest | Debugging issues |
Reading Order
- New to pyfixest? Start with `quickstart.md`, then `fixed-effects.md`
- Running DiD? Read `quickstart.md`, then `difference-in-differences.md`
- Need IV? Read `quickstart.md`, then `instrumental-variables.md`
- Making tables? Check `tables-and-plots.md`
- Coming from R fixest? Read `quickstart.md`, then `gotchas.md`
Related Skills
| Skill | Relationship |
|---|---|
| `data-scientist` | Methodology guidance: load for the "why and when" behind methods |
| `statsmodels` | Complement for non-FE models: GLM, time series, diagnostics |
| `linearmodels` | Random effects, GMM, system estimation when pyfixest's FE-only approach is insufficient |
| `svy` | Survey-weighted regression with complex survey designs. pyfixest's clustered SEs account for within-group correlation but do NOT handle full survey design features (stratification, unequal probability weights, FPC). If your data comes from a complex probability survey, use `svy` for design-based inference |
| `polars` | Data preparation before estimation (convert to pandas before passing to pyfixest) |
| `plotnine` | Custom visualization beyond pyfixest's built-in plots |
Quick Decision Trees
"I need to run a regression"
What kind of regression?
├─ OLS with fixed effects → ./references/quickstart.md
├─ OLS without fixed effects → ./references/quickstart.md
├─ IV / 2SLS → ./references/instrumental-variables.md
├─ Poisson (count data) → ./references/integration.md
├─ Logit / Probit → ./references/integration.md
├─ Quantile regression → ./references/integration.md
└─ Multiple models at once → ./references/integration.md
"I need difference-in-differences"
DiD design?
├─ Simple 2x2 DiD (one treatment date) → ./references/difference-in-differences.md
├─ Staggered treatment timing → ./references/difference-in-differences.md
│ ├─ did2s (Gardner imputation) → ./references/difference-in-differences.md
│ ├─ Local projections DiD → ./references/difference-in-differences.md
│ └─ Sun-Abraham saturated → ./references/difference-in-differences.md
├─ Event study plot → ./references/difference-in-differences.md
├─ Visualize treatment patterns → ./references/difference-in-differences.md
└─ Parallel trends assessment → ./references/difference-in-differences.md
"I need to choose standard errors"
What inference?
├─ Heteroskedasticity-robust (HC1) → ./references/fixed-effects.md
├─ Clustered (one-way / two-way) → ./references/fixed-effects.md
├─ Few clusters (<20) → ./references/advanced-inference.md
│ └─ Wild cluster bootstrap → ./references/advanced-inference.md
├─ HAC / Newey-West → ./references/fixed-effects.md
├─ Randomization inference → ./references/advanced-inference.md
├─ Multiple hypothesis testing → ./references/advanced-inference.md
└─ Causal cluster variance (CCV) → ./references/advanced-inference.md
"I need to present results"
Presenting results?
├─ Regression table (multiple models) → ./references/tables-and-plots.md
├─ Coefficient plot → ./references/tables-and-plots.md
├─ Event study plot → ./references/tables-and-plots.md
├─ Descriptive statistics table → ./references/tables-and-plots.md
└─ LaTeX output → ./references/tables-and-plots.md
"Something isn't working"
Having issues?
├─ Different results from old code → ./references/gotchas.md
├─ feglm with fixed effects error → ./references/gotchas.md
├─ numba installation problems → ./references/gotchas.md
├─ CRV3 memory issues → ./references/gotchas.md
├─ Poisson convergence → ./references/gotchas.md
├─ Formula parsing errors → ./references/gotchas.md
├─ R fixest vs pyfixest differences → ./references/gotchas.md
└─ Singleton warnings → ./references/gotchas.md
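When the symptom is "different results from old code", the renamed `ssc()` arguments listed under Version Notes are one thing to check. The sketch below translates legacy keyword names to the 0.40 names before they are passed on. The helper `translate_ssc_kwargs` is hypothetical (not part of pyfixest), and the argument values are illustrative only.

```python
# Old -> new ssc() keyword names, as listed in the Version Notes above.
SSC_RENAMES = {
    "adj": "k_adj",
    "fixef_k": "k_fixef",
    "cluster_adj": "G_adj",
    "cluster_df": "G_df",
}


def translate_ssc_kwargs(old_kwargs):
    """Return a copy of old_kwargs with pre-0.40 ssc() names renamed.

    Hypothetical helper; names that are not in SSC_RENAMES pass through.
    """
    return {SSC_RENAMES.get(key, key): value for key, value in old_kwargs.items()}


# Illustrative legacy settings found in an older script.
legacy = {"adj": True, "fixef_k": "none", "cluster_adj": True, "cluster_df": "min"}
updated = translate_ssc_kwargs(legacy)
print(updated)
```

The resulting dictionary can then be unpacked into `pf.ssc(**updated)`, assuming the installed version uses the renamed arguments described in this skill.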
File-First Execution in Research Workflows
Important: In data research pipelines (see CLAUDE.md), pyfixest regressions are executed through script files, not interactively. This ensures auditability and reproducibility.
The pattern:
- Write regression code to `scripts/stage8_analysis/{step}_{task-name}.py`
- Execute via Bash through the wrapper script that captures output automatically
- Validation results get automatically embedded in scripts as comments
- If failed, create versioned copy for fixes
Closely read agent_reference/SCRIPT_EXECUTION_REFERENCE.md for the mandatory file-first execution protocol covering complete code file writing, output capture, and file versioning rules. All regression scripts must follow the Inline Audit Trail (IAT) standard — see agent_reference/INLINE_AUDIT_TRAIL.md. For regression code, document model specification choices (why this estimator, why this clustering level, what identifying assumptions) with # INTENT:, # REASONING:, and # ASSUMES: comments.
See:
- `agent_reference/WORKFLOW_PHASE4_ANALYSIS.md`: Stage 8 (Analysis & Visualization)
- `agent_reference/INLINE_AUDIT_TRAIL.md`: IAT documentation standard
The examples below show pyfixest syntax. In research workflows, wrap them in scripts following the file-first pattern.
Quick Reference
Essential Import
import pyfixest as pf
Core Estimation Functions
| Function | Purpose |
|---|---|
| `pf.feols("Y ~ X \| fe", data=df)` | OLS with fixed effects |
| `pf.fepois("Y ~ X \| fe", data=df)` | Poisson with fixed effects |
| `pf.feols("Y ~ X2 \| fe \| X1 ~ Z1", data=df)` | IV / 2SLS |
| `pf.did2s(data, yname, first_stage, second_stage, treatment, cluster)` | Gardner (2022) DiD |
| `pf.event_study(data, yname, idname, tname, gname, estimator)` | Unified event study |
| `pf.lpdid(data, yname, idname, tname, gname)` | Local projections DiD |
Formula Syntax Quick Reference
| Pattern | Meaning | Example |
|---|---|---|
| `Y ~ X1 + X2` | No FE | `"wage ~ educ + exper"` |
| `Y ~ X \| fe1 + fe2` | With FE | `"wage ~ educ \| state + year"` |
| `Y ~ X \| fe \| endog ~ inst` | FE + IV | `"wage ~ exper \| state \| educ ~ college_prox"` |
| `i(factor, ref=val)` | Categorical with ref | `"Y ~ i(year, ref=2000) \| state"` |
| `sw(X1, X2)` | Stepwise alternatives | `"Y ~ sw(educ, exper) \| state"` |
| `csw0(X1, X2)` | Cumulative stepwise | `"Y ~ csw0(educ, exper) \| state"` |
| `Y1 + Y2 ~ X` | Multiple outcomes | `"wage + hours ~ educ \| state"` |
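The three-part layout in the table (regressors, then fixed effects, then the IV block) can be assembled programmatically. The helper below is a hypothetical convenience function, not a pyfixest API; it only builds the formula string, which would then be passed to `pf.feols` or `pf.fepois`.

```python
def build_formula(dep, exog, fe=(), endog=None, instruments=()):
    """Assemble a fixest-style formula string from its parts.

    Hypothetical helper for illustration; not part of pyfixest.
    """
    # Part 1: outcome and exogenous regressors (intercept only if none given).
    parts = [f"{dep} ~ {' + '.join(exog) if exog else '1'}"]
    # Part 2: fixed effects, separated from part 1 by "|".
    if fe:
        parts.append(" + ".join(fe))
    # Part 3: endogenous variable and its instruments.
    if endog is not None:
        if not instruments:
            raise ValueError("an endogenous variable needs at least one instrument")
        parts.append(f"{endog} ~ {' + '.join(instruments)}")
    return " | ".join(parts)


no_fe = build_formula("wage", ["educ", "exper"])
with_fe = build_formula("wage", ["educ"], fe=["state", "year"])
fe_iv = build_formula(
    "wage", ["exper"], fe=["state"], endog="educ", instruments=["college_prox"]
)
print(no_fe, with_fe, fe_iv, sep="\n")
```

The three strings reproduce the first three example formulas from the table, so the helper can be checked against it directly.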
Post-Estimation Essentials
fit = pf.feols("Y ~ X1 + X2 | fe", data=df)
fit.summary() # Print results
fit.tidy() # DataFrame of coefficients
fit.vcov("hetero") # Re-estimate with robust SEs (requires arg)
fit.vcov({"CRV1": "state"}) # Re-estimate with clustered SEs
fit.coef() # Coefficient values
fit.se() # Standard errors
fit.confint() # Confidence intervals
fit.predict() # Fitted values
fit.resid() # Residuals
fit.fixef() # Dict of FE name → numpy array (not a DataFrame)
Reporting
pf.etable([fit1, fit2, fit3]) # Regression table
pf.coefplot([fit1, fit2]) # Coefficient plot
pf.iplot(fit) # Event study / interaction plot
pf.panelview(data, unit, time, treat) # Treatment pattern visualization
Topic Index
| Topic | Reference File |
|---|---|
| Installation | ./references/quickstart.md |
| First regression | ./references/quickstart.md |
| Formula syntax | ./references/quickstart.md |
| SE comparison table | ./references/quickstart.md |
| Multi-way fixed effects | ./references/fixed-effects.md |
| Standard error types | ./references/fixed-effects.md |
| Clustered SEs | ./references/fixed-effects.md |
| HAC / Newey-West | ./references/fixed-effects.md |
| Backend options | ./references/fixed-effects.md |
| IV formula syntax | ./references/instrumental-variables.md |
| First-stage diagnostics | ./references/instrumental-variables.md |
| Weak instrument tests | ./references/instrumental-variables.md |
| TWFE | ./references/difference-in-differences.md |
| did2s | ./references/difference-in-differences.md |
| Local projections DiD | ./references/difference-in-differences.md |
| Sun-Abraham | ./references/difference-in-differences.md |
| Event study plots | ./references/difference-in-differences.md |
| Parallel trends | ./references/difference-in-differences.md |
| panelview | ./references/difference-in-differences.md |
| etable | ./references/tables-and-plots.md |
| coefplot | ./references/tables-and-plots.md |
| iplot | ./references/tables-and-plots.md |
| dtable | ./references/tables-and-plots.md |
| Wild cluster bootstrap | ./references/advanced-inference.md |
| Randomization inference | ./references/advanced-inference.md |
| Multiple testing corrections | ./references/advanced-inference.md |
| Gelbach decomposition | ./references/advanced-inference.md |
| CCV | ./references/advanced-inference.md |
| Multiple estimation | ./references/integration.md |
| Poisson regression | ./references/integration.md |
| GLM (logit/probit) | ./references/integration.md |
| Quantile regression | ./references/integration.md |
| marginaleffects | ./references/integration.md |
| Online learning | ./references/integration.md |
| Performance tuning | ./references/integration.md |
| Polars DataFrame input | ./references/gotchas.md |
| Polars-to-pandas conversion | ./references/quickstart.md |
| DiD clustering level | ./references/difference-in-differences.md |
| v0.40 breaking changes | ./references/gotchas.md |
| feglm FE limitation | ./references/gotchas.md |
| numba issues | ./references/gotchas.md |
| Formula parsing | ./references/gotchas.md |
| R fixest differences | ./references/gotchas.md |
Citation
When this library is used as a primary analytical tool, include in the report's Software & Tools references:
Berge, L., Butts, K., & McDermott, G. (2026). pyfixest: Fast high-dimensional fixed effects estimation [Computer software]. Based on fixest (R).
Cite when: pyfixest is used for regression estimation (OLS, Poisson, IV) or difference-in-differences analysis.
Do not cite when: pyfixest is only imported and no estimation is performed.
For method-specific citations (e.g., individual DiD estimators or inference techniques),
consult the reference files in this skill and agent_reference/CITATION_REFERENCE.md.