Efficient Difference-in-Differences and Event-Study Estimators

Xiaohong Chen (Yale) · Pedro H. C. Sant'Anna (Emory) · Haitian Xie (Peking U.)

⚡ TL;DR

Derives the semiparametric efficient influence function (EIF) in closed form for DiD and event-study causal parameters under heterogeneous treatment effects and standard parallel-trends assumptions. The resulting estimator attains the semiparametric efficiency bound, that is, the smallest asymptotic variance among regular estimators. This sets the efficiency benchmark for the post-2020 DiD landscape.

🧩 Setup & motivation

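The post-2020 DiD methods (Callaway-Sant'Anna [CS], Sun-Abraham [SA], Borusyak-Jaravel-Spiess [BJS], de Chaisemartin-D'Haultfoeuille [dC-dH], and Wooldridge's extended TWFE [ETWFE]) all target the same causal parameters under heterogeneity, but they differ in efficiency. Which is the most efficient? Until this paper, there was no definitive answer.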

Chen-Sant'Anna-Xie compute the semiparametric efficiency bound for short-panel DiD and ES estimands. The bound is a function of the unit-level heterogeneity, time-varying covariates, and assignment mechanism. Their EIF-based estimator achieves the bound; existing estimators do not (some come close).

📐 Main results

The efficient influence function

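Under standard parallel-trends assumptions and a short-panel structure, the EIF for the ATT at horizon \(k\) takes the form
\[\psi_k(O; \eta) = \frac{D_i}{p} \left[(Y_{i,t+k} - Y_{i,t-1}) - m_k(X_i, \eta)\right] - \frac{(1-D_i)\, e(X_i)}{p\,(1-e(X_i))} \left[(Y_{i,t+k} - Y_{i,t-1}) - m_k(X_i, \eta)\right]\]
where \(D_i\) is the treatment indicator, \(p = P(D_i = 1)\) is the share of treated units, \(m_k\) is the conditional expectation of the outcome difference given covariates (taken over comparison units, as in doubly robust DiD), \(e(X_i)\) is the propensity score, and \(\eta\) collects the nuisance functions.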
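To make the two terms concrete, here is a minimal sketch that evaluates the expression above for a single unit. The function name `did_score` and the numbers fed to it are hypothetical illustrations, not taken from the paper; `dy` stands for the outcome difference \(Y_{i,t+k} - Y_{i,t-1}\).

```python
def did_score(dy, d, e_x, m_x, p):
    """Evaluate the displayed score for one unit.

    dy  : outcome difference Y_{t+k} - Y_{t-1}
    d   : treatment indicator (0 or 1)
    e_x : propensity score e(X) for this unit
    m_x : conditional mean m_k(X) of the outcome difference
    p   : overall share of treated units
    """
    resid = dy - m_x
    treated_part = d / p * resid
    # Comparison units are reweighted by the odds e/(1-e) so that they
    # resemble the treated group in covariates.
    control_part = (1 - d) * e_x / (p * (1 - e_x)) * resid
    return treated_part - control_part


# Hypothetical values, chosen only so the arithmetic is easy to follow.
treated = did_score(dy=3.0, d=1, e_x=0.5, m_x=1.0, p=0.5)  # (1/0.5) * 2.0 = 4.0
control = did_score(dy=1.5, d=0, e_x=0.5, m_x=1.0, p=0.5)  # -(0.5/0.25) * 0.5 = -1.0
```

A treated unit contributes its residual scaled by \(1/p\); a comparison unit contributes its residual with a negative, odds-weighted sign, which is what corrects for errors in \(m_k\).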

Neyman orthogonality

The EIF is automatically Neyman orthogonal: the first-order effect of nuisance estimation error on the expected value of \(\psi_k\) is zero, so the remaining bias is second order. This means ML estimators of the nuisances (\(m_k\) and \(e\)) can be plugged in without contaminating root-\(n\) inference, provided they converge fast enough (typically faster than \(n^{-1/4}\)). This is the same logic as in Chernozhukov-Chetverikov-Demirer-Duflo-Hansen-Newey-Robins (2018) double machine learning.
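The orthogonality property can be checked numerically on a toy population. The sketch below is an illustration under stated assumptions, not the paper's derivation: it uses one binary covariate, made-up population values, treats \(p\) as known, and reads \(m_k\) as the comparison-group conditional mean. It computes the exact expected score under perturbed nuisances and compares it with the true ATT.

```python
# Hypothetical population with a single binary covariate X (all values made up).
Q = {0: 0.5, 1: 0.5}     # P(X = x)
E = {0: 0.3, 1: 0.6}     # true propensity score e(x)
M0 = {0: 1.0, 1: 2.0}    # E[dY | D = 0, X = x], the true m_k(x)
TAU = {0: 1.0, 1: 3.0}   # conditional treatment effect on the treated
P = sum(Q[x] * E[x] for x in Q)                 # share of treated units
ATT = sum(Q[x] * E[x] * TAU[x] for x in Q) / P  # target parameter


def expected_score(m, e):
    """Exact population mean of the score under working nuisances (m, e)."""
    total = 0.0
    for x in Q:
        treated = E[x] * (M0[x] + TAU[x] - m[x])
        control = (1 - E[x]) * e[x] / (1 - e[x]) * (M0[x] - m[x])
        total += Q[x] * (treated - control) / P
    return total


def bias(t, shift_m=True, shift_e=True):
    """Bias of the expected score when nuisances are off by an amount of order t."""
    m = {x: M0[x] + (t if shift_m else 0.0) for x in Q}
    e = {x: E[x] + (0.5 * t if shift_e else 0.0) for x in Q}
    return expected_score(m, e) - ATT


only_m = bias(0.1, shift_e=False)   # zero up to rounding: error in m alone is harmless
only_e = bias(0.1, shift_m=False)   # zero up to rounding: error in e alone is harmless
ratio = bias(0.1) / bias(0.05)      # roughly 4: halving both errors quarters the bias
```

In this toy setting the bias is a product of the two nuisance errors, so it shrinks quadratically. That is the mechanism that lets slower-than-root-\(n\) ML estimates still deliver root-\(n\) inference.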

The proposed estimator

Simple plug-in: estimate \(m_k\) and \(e\) with any ML method via cross-fitting, then compute the sample average of \(\hat\psi_k\). Achieves the semiparametric efficiency bound asymptotically.
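The cross-fitting recipe can be sketched as follows. This is a dependency-free illustration, not the authors' implementation: the simulated data, the function names, and the use of cell averages for a single binary covariate as "learners" are all assumptions made here; in practice any ML method would replace `fit_nuisances`. The standard error subtracts \((D_i/\hat p)\,\widehat{ATT}\) from each score to centre it, again an assumption consistent with the doubly robust form shown earlier.

```python
import random
from statistics import mean, stdev


def simulate(n, seed=0):
    """Hypothetical DGP: binary X, e(0)=0.3, e(1)=0.6, effect 1 + 2x on the treated."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)
        d = 1 if rng.random() < (0.3 if x == 0 else 0.6) else 0
        dy = 1.0 + x + d * (1.0 + 2.0 * x) + rng.gauss(0.0, 1.0)
        data.append((x, d, dy))
    return data


def fit_nuisances(train):
    """Cell-average stand-ins for the ML learners of e(x) and m_k(x)."""
    e_hat, m_hat = {}, {}
    for x in (0, 1):
        cell = [(d, dy) for (xi, d, dy) in train if xi == x]
        e_hat[x] = mean(d for d, _ in cell)               # P(D=1 | X=x)
        m_hat[x] = mean(dy for d, dy in cell if d == 0)   # E[dY | D=0, X=x]
    return e_hat, m_hat


def crossfit_att(data, n_folds=5):
    n = len(data)
    p_hat = mean(d for _, d, _ in data)
    scores = [0.0] * n
    for k in range(n_folds):
        # Nuisances are fit on all folds except k, then evaluated on fold k.
        train = [row for i, row in enumerate(data) if i % n_folds != k]
        e_hat, m_hat = fit_nuisances(train)
        for i in range(k, n, n_folds):
            x, d, dy = data[i]
            resid = dy - m_hat[x]
            odds = e_hat[x] / (1.0 - e_hat[x])
            scores[i] = (d * resid - (1 - d) * odds * resid) / p_hat
    att = mean(scores)
    # Centre the scores to get estimated influence values for the standard error.
    infl = [s - d / p_hat * att for s, (_, d, _) in zip(scores, data)]
    se = stdev(infl) / n ** 0.5
    return att, se


data = simulate(4000)
att_hat, se_hat = crossfit_att(data)
true_att = (0.5 * 0.3 * 1.0 + 0.5 * 0.6 * 3.0) / (0.5 * 0.3 + 0.5 * 0.6)
```

With this design the population ATT is `true_att` (about 2.33), and `att_hat` should land within a few standard errors of it. Folds are assigned by index modulo `n_folds`, which is adequate here only because the simulated rows are already in random order.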

Efficiency comparison

The paper compares its EIF estimator to BJS, CS with covariates, SA with covariates, and Wooldridge ETWFE. The EIF estimator achieves the bound; BJS is efficient under linearity but loses ground when nonlinear nuisances matter; CS with parametric covariate models is inefficient when the propensity score is far from \(1/2\).

🛠️ Implications for practice

For maximum statistical efficiency in a short-panel DiD with covariates, use the EIF estimator with cross-fitted ML for nuisances.
For larger panels, BJS is close to efficient and computationally simpler.
Software implementation forthcoming.

🧭 Where this sits in the broader DiD literature

Companion to Borusyak-Hull-Roth (2025) "Harvesting" — both chase efficiency from different angles. Builds on the double/debiased ML literature (Chernozhukov et al. 2018), Sant'Anna-Zhao (2020) doubly-robust DiD, and the semiparametric-efficiency tradition (Newey 1990, Van der Vaart 2000). Likely cited in BCCGS 2026 JEL as the modern efficiency benchmark.

📥 Read the paper

arXiv 2506.17729
Cowles
