Literature Readings

Better Understanding Triple Differences Estimators

Marcelo Ortiz-Villavicencio · Pedro H. C. Sant'Anna (Emory)

arXiv 2025 · Methodology · Triple Differences

TL;DR

A modern treatment of triple-differences (DDD) estimators, DiD's natural extension when a third dimension of variation is available. Shows that common DDD implementations are biased when identification requires conditioning on covariates, and that in staggered settings, pooling all not-yet-treated units as controls introduces additional bias. Proposes regression-adjustment, IPW, and doubly-robust DDD estimators that remain valid. Companion R package: triplediff.

Setup & motivation

Triple-differences (DDD) designs are widely used to relax the parallel-trends assumption of DiD. A typical setup observes units in two groups (treated industry vs control industry) × two regions (high-exposure state vs low-exposure state) × two periods. The DDD estimator compares the DiD in the treated industry with the DiD in the control industry.

The paper shows that the usual DDD implementations have hidden problems: (i) the "difference of two DiDs" approach is biased when identification requires covariate conditioning; (ii) the three-way fixed-effects regression is also biased in the same case; (iii) in staggered adoption settings, the common practice of pooling all not-yet-treated units introduces bias even without covariates.

Main results

The problem with common implementations

Standard DDD takes the form \(\hat\tau_{\text{DDD}} = (\bar Y_{1,T,A} - \bar Y_{1,C,A}) - (\bar Y_{0,T,A} - \bar Y_{0,C,A}) - [(\bar Y_{1,T,B} - \bar Y_{1,C,B}) - (\bar Y_{0,T,B} - \bar Y_{0,C,B})]\), where the subscripts index period (0 or 1) × group (T or C) × stratum (A or B). This identifies the DDD-ATT only under an unconditional parallel-trends assumption, which is generally not equivalent to the conditional assumption that covariate-adjusted versions implicitly rely on.
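As a worked illustration of the formula above, here is a minimal sketch that computes the DDD from eight cell means. The function name `ddd_from_cell_means` and the numbers are made up for illustration; keys follow the period × group × stratum indexing of the formula.

```python
def ddd_from_cell_means(ybar):
    """Difference of two DiDs from cell means.

    ybar maps (period, group, stratum) -> mean outcome, with
    period in {0, 1}, group in {"T", "C"}, stratum in {"A", "B"}.
    """
    def did(stratum):
        # Group gap in period 1 minus group gap in period 0, within one stratum.
        gap_1 = ybar[(1, "T", stratum)] - ybar[(1, "C", stratum)]
        gap_0 = ybar[(0, "T", stratum)] - ybar[(0, "C", stratum)]
        return gap_1 - gap_0

    # DDD = DiD in stratum A minus DiD in stratum B.
    return did("A") - did("B")


# Hypothetical cell means, chosen only to make the arithmetic easy to follow.
cell_means = {
    (0, "T", "A"): 10.0, (1, "T", "A"): 15.0,
    (0, "C", "A"): 8.0,  (1, "C", "A"): 10.0,
    (0, "T", "B"): 9.0,  (1, "T", "B"): 11.0,
    (0, "C", "B"): 7.0,  (1, "C", "B"): 8.0,
}

estimate = ddd_from_cell_means(cell_means)
print(estimate)  # DiD_A = 3.0, DiD_B = 1.0, so this prints 2.0
```

This is the unconditional estimator only: it uses no covariates, which is exactly the case where the difference of two DiDs is unproblematic.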

Three valid estimators

The paper proposes:

  • Regression-adjustment DDD: model the conditional expectation of the outcome difference and impute counterfactuals.
  • Inverse-probability-weighted DDD: reweight observations by the estimated propensity of being in each cell, given covariates.
  • Doubly-robust DDD: combines RA + IPW so the estimator remains consistent if either the outcome model or the propensity model is correctly specified.
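To make the regression-adjustment idea concrete, here is a minimal sketch. It assumes a single covariate, linear outcome models, and a two-period panel already reduced to outcome changes. All names (`fit_line`, `ra_ddd`, `naive_ddd`, `simulate`) and the simulated data are hypothetical; this is not the triplediff implementation. Cells are indexed by (s, q), with (1, 1) the treated cell. The sketch imputes the treated cell's untreated trend from the three comparison cells, averaged over the treated cell's own covariate values.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Simple OLS of y on one covariate x; returns (intercept, slope)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def ra_ddd(cells):
    """Regression-adjustment DDD; cells[(s, q)] = (x_list, dy_list)."""
    x_treated, dy_treated = cells[(1, 1)]

    def predicted_mean(cell):
        # Fit the outcome-change model in a comparison cell, then average
        # its predictions over the covariates of the treated cell.
        intercept, slope = fit_line(*cells[cell])
        return mean(intercept + slope * xi for xi in x_treated)

    counterfactual = (predicted_mean((1, 0)) + predicted_mean((0, 1))
                      - predicted_mean((0, 0)))
    return mean(dy_treated) - counterfactual


def naive_ddd(cells):
    """Difference of two DiDs on raw cell means, ignoring covariates."""
    m = {cell: mean(dy) for cell, (_, dy) in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


def simulate(n_per_cell=2000, effect=2.0, seed=0):
    """Toy data: untreated trends are (s + 2q) * x, so DDD parallel trends
    holds given x, but the covariate mean differs across cells."""
    rng = random.Random(seed)
    x_means = {(1, 1): 2.0, (1, 0): 0.0, (0, 1): 1.0, (0, 0): 0.0}
    cells = {}
    for (s, q), mu in x_means.items():
        x = [rng.gauss(mu, 1.0) for _ in range(n_per_cell)]
        dy = [(s + 2 * q) * xi + effect * s * q + rng.gauss(0.0, 1.0)
              for xi in x]
        cells[(s, q)] = (x, dy)
    return cells


cells = simulate()
ra_estimate = ra_ddd(cells)        # should land near the true effect of 2
naive_estimate = naive_ddd(cells)  # pulled away from 2 by covariate imbalance
print(round(ra_estimate, 2), round(naive_estimate, 2))
```

In this toy design the naive estimator is biased because the covariate distribution differs across cells, while the regression-adjusted version recovers the effect as long as the linear outcome models are correct. The doubly-robust estimator adds propensity-score weights on top of this, which is omitted here for brevity.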

Staggered DDD

In staggered settings, the paper shows that pooling all not-yet-treated units biases the DDD-ATT due to differential composition. Solution: cohort-by-cohort DDD estimation followed by aggregation, similar to the Callaway-Sant'Anna logic for DiD.
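The aggregation step can be sketched as follows, assuming cohort-specific DDD estimates are already in hand. Cohort-size weights are one common choice in the Callaway-Sant'Anna style and are an assumption here; the paper's own weighting may differ. The function name and all numbers are hypothetical.

```python
def aggregate_cohort_effects(estimates, sizes):
    """Cohort-size-weighted average of cohort-specific effect estimates.

    estimates: cohort label -> cohort-specific DDD estimate
    sizes:     cohort label -> number of treated units in that cohort
    """
    total = sum(sizes[g] for g in estimates)
    return sum(estimates[g] * sizes[g] / total for g in estimates)


# Hypothetical cohort-level estimates, keyed by first treatment year.
cohort_att = {2019: 1.0, 2020: 2.0, 2021: 4.0}
cohort_n = {2019: 100, 2020: 100, 2021: 200}

overall = aggregate_cohort_effects(cohort_att, cohort_n)
print(overall)  # (1.0*100 + 2.0*100 + 4.0*200) / 400 = 2.75
```

Estimating each cohort separately keeps every comparison clean; the weighting only decides how the cohort-level answers are summarized.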

Implications for practice

  • If you're running DDD with covariates, replace the "difference of two DiDs" with the doubly-robust estimator from this paper.
  • For staggered DDD, do cohort-specific estimation before aggregation.
  • The R package triplediff implements all three estimators with a single call.

Where this sits in the broader DiD literature

A direct extension of the Sant'Anna-Zhao (2020, J Econometrics) doubly-robust DiD framework to triple differences. Cited as the modern DDD reference in the BCCGS 2026 JEL guide. Related to Olden-Møen (2022) on triple-differences identification.

Read the paper

  • Local PDF (1.3 MB): instant, no external request
  • arXiv 2505.09942