⚡ TL;DR
A short, sharp practitioner warning: the default event-study plots produced by software for three popular post-2020 DiD methods (de Chaisemartin-D'Haultfoeuille, Callaway-Sant'Anna, Borusyak-Jaravel-Spiess) do NOT match traditional TWFE event-study plots, even with non-staggered treatment timing. The new methods construct pre-treatment coefficients asymmetrically from post-treatment coefficients, so their plots can show a kink or jump at the treatment date where the TWFE plot shows a straight line. As a result, visual heuristics for spotting parallel-trends violations that were developed for TWFE event-studies should not be carried over directly to the new plots.
🧩 Setup & motivation
For 30+ years, applied econometrics has built visual intuition around the TWFE event-study plot: pre-treatment coefficients close to zero are "good", a kink at \(t = 0\) indicates a treatment effect, and a smooth line running through the pre- and post-treatment coefficients indicates a pre-existing trend rather than an effect. These heuristics carry real weight in how referees and readers judge a paper.
Roth points out that several of the new heterogeneity-robust estimators (Callaway-Sant'Anna, CS; de Chaisemartin-D'Haultfoeuille, dCDH; Borusyak-Jaravel-Spiess, BJS) produce default event-study plots that look different from the TWFE plot even when the underlying data are the same and treatment timing is not staggered. Specifically, they construct the pre-treatment coefficients from a different comparison (for example, a different base period) than the post-treatment coefficients. The result: kinks and jumps that don't exist in the TWFE plot, or absent kinks where TWFE shows them.
Main results
The asymmetric construction
In TWFE event-studies, pre-treatment coefficients and post-treatment coefficients are estimated symmetrically: both are deviations from a reference period (usually \(k = -1\)) using the same control comparisons.
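To make the symmetry concrete, here is the standard dynamic TWFE specification in generic notation. The symbols \(D_i\) (treated-group indicator), \(g\) (first treated period), \(\alpha_i\), and \(\lambda_t\) are labels chosen for this note, not notation taken from the paper:

\[
Y_{it} = \alpha_i + \lambda_t + \sum_{k \neq -1} \beta_k \, D_i \, \mathbf{1}[t - g = k] + \varepsilon_{it}
\]

Assuming a balanced panel with one treated cohort and a never-treated comparison group, each OLS coefficient reduces to a difference-in-differences against the same reference period:

\[
\hat{\beta}_k = \left(\bar{Y}_{1,\,g+k} - \bar{Y}_{1,\,g-1}\right) - \left(\bar{Y}_{0,\,g+k} - \bar{Y}_{0,\,g-1}\right)
\]

where \(\bar{Y}_{d,t}\) is the mean outcome of group \(d\) in period \(t\). The formula is identical for \(k < 0\) and \(k \geq 0\), which is what "symmetric" means here.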
In the new methods:
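- Callaway-Sant'Anna: under the default varying base period in the `did` package, each pre-period coefficient compares a period with the one just before it (a short difference), while each post-period coefficient compares a period with the last pre-treatment period (a long difference). The comparison group (never-treated or not-yet-treated units) is the same on both sides; what differs is the base period.
- de Chaisemartin-D'Haultfoeuille: in the software defaults Roth examines, the pre-treatment placebo estimates are likewise built from comparisons of consecutive periods, while the post-treatment dynamic effects are measured relative to the last pre-treatment period. (Sun-Abraham's interaction-weighted estimator is not among the affected methods: with non-staggered timing it coincides with the dynamic TWFE specification.)
- Borusyak-Jaravel-Spiess: imputes the untreated counterfactual from unit and time fixed effects fitted on untreated observations only. Post-period coefficients are observed-minus-imputed differences, which benchmark against all pre-treatment periods; pre-period coefficients come from a separate regression on untreated observations that tests leads against earlier pre-treatment periods.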
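A stylized sketch of why mixing short and long differences changes the picture. It is not any package's implementation: it assumes one treated cohort, a never-treated comparison group, noise-free group means, a linear differential trend (slope 0.5 per period), and a treatment effect of exactly zero. All variable names are illustrative.

```python
import numpy as np

# Stylized, noise-free illustration; not any package's implementation.
# Event times -4..3 for one treated cohort and one never-treated group.
event_time = np.arange(-4, 4)

# Assumed treated-minus-control gap in mean outcomes: a pure linear
# differential trend (slope 0.5 per period) and a zero treatment effect.
slope = 0.5
gap = slope * event_time

# Symmetric construction (TWFE-style): every coefficient is a long
# difference relative to the same reference period, event time -1.
ref = gap[event_time == -1][0]
symmetric = gap - ref

# Asymmetric construction: pre-period coefficients are short differences
# (period k versus k-1); post-period coefficients stay long differences.
asymmetric = np.full(event_time.shape, np.nan)
asymmetric[1:] = gap[1:] - gap[:-1]      # earliest period has no predecessor
post = event_time >= 0
asymmetric[post] = symmetric[post]

for k, s, a in zip(event_time, symmetric, asymmetric):
    print(f"k={k:+d}  symmetric={s:+.2f}  asymmetric={a:+.2f}")
```

In this toy setup the symmetric series lies on a straight line through zero at \(k = -1\), so the trend violation is obvious. The asymmetric series sits flat at 0.5 through \(k = 0\) and then climbs, which a reader trained on TWFE plots could mistake for an effect that emerges after treatment even though the true effect is zero.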
Practical consequence
The same underlying data, when plotted with TWFE versus the default output of the newer estimators, can show different shapes, and the "kink at treatment" that practitioners interpret as the treatment effect can be an artifact of how the plot is constructed rather than economics.
🛠️ Implications for practice
- Show both the TWFE plot AND the heterogeneity-robust plot. If they look different, explain why.
- Do not apply TWFE visual heuristics to the default plots from the newer estimators. Their pre- and post-treatment coefficients are different objects.
- When citing pre-trend evidence, state explicitly which estimator's plot you are reading and what its construction implies.
- The recommended practice in BCCGS (2026, JEL) is to plot the estimator's own pre-trend diagnostic, not to re-use TWFE intuition.
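One way to act on the "show both plots" advice is to compute a symmetric benchmark series by hand and overlay it on the estimator's default output. The sketch below assumes a balanced panel, a single treated cohort, and a never-treated comparison group; the function name `symmetric_event_study` and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def symmetric_event_study(y, treated, first_treated_col, ref=-1):
    """Long-difference DiD coefficients, all relative to event time `ref`.

    y                 : (n_units, n_periods) balanced panel of outcomes
    treated           : boolean (n_units,) flag for the single treated cohort
    first_treated_col : column index of the first treated period
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    event_time = np.arange(y.shape[1]) - first_treated_col
    # Treated-minus-control gap in mean outcomes, period by period.
    gap = y[treated].mean(axis=0) - y[~treated].mean(axis=0)
    # Pre- and post-period coefficients share the same base period.
    ref_col = first_treated_col + ref
    return event_time, gap - gap[ref_col]

# Hypothetical toy panel: 4 units, 5 periods, treatment starts in column 3.
# Treated units (first two rows) gain 2.0 from then on; trends are parallel.
y = np.array([
    [1.0, 2.0, 3.0, 6.0, 7.0],
    [2.0, 3.0, 4.0, 7.0, 8.0],
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0, 5.0],
])
treated = np.array([True, True, False, False])
event_time, coefs = symmetric_event_study(y, treated, first_treated_col=3)
print(dict(zip(event_time.tolist(), coefs.tolist())))
```

Because every coefficient uses the same base period, this series can be read with the usual TWFE heuristics; any gap between it and a package's default plot then points to a difference in construction rather than in the data.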
🧭 Where this sits in the broader DiD literature
A practical follow-up to de Chaisemartin-D'Haultfoeuille (2020), Sun-Abraham (2021), Callaway-Sant'Anna (2021), and Borusyak-Jaravel-Spiess (2024). Should be read alongside Roth (2022), "Pretest with Caution", for the full picture on parallel-trends diagnostics. Cited in the BCCGS 2026 JEL guide.
📥 Read the paper
- Local PDF (987 KB): instant, no external request
- arXiv 2401.12309
- Springer