(Empirical) Bayes Approaches to Parallel Trends

⚡ TL;DR

Proposes Bayes and empirical Bayes approaches for handling violations of parallel trends (PT). The researcher specifies a prior over both pre-treatment and post-treatment PT violations, updates it given the pre-period observations, and forms posterior means and credible sets for the treatment effect. In the empirical Bayes (EB) version, the prior is learned from pre-treatment data. A direct alternative to Rambachan-Roth honest bounds: it uses Bayesian shrinkage rather than minimax bounds.

🧩 Setup & motivation

Standard DiD inference treats parallel trends as a binary assumption: either it holds (and inference is valid) or it doesn't (and headline coefficients are biased). In practice, passing a pre-trend test is what gates publication. Rambachan-Roth (2023) replaces the binary gate with minimax sensitivity bounds parameterized by \(M\), the bound in a smoothness restriction on the violations.

Kwon-Roth propose a complementary approach: specify a prior distribution over PT violations and compute the Bayesian posterior. When the prior is well calibrated, the Bayes approach achieves lower expected loss than minimax bounds; the EB version learns the prior from pre-treatment data, so the researcher does not have to specify it.

📐 Main results

The Bayes model

The researcher specifies a joint prior on pre-period and post-period PT violations \((\delta_{\text{pre}}, \delta_{\text{post}})\), typically jointly normal with correlation parameter \(\rho\). Given an observed pre-trend estimate, Bayes' rule produces a posterior on \(\delta_{\text{post}}\), and the posterior mean of \(\delta_{\text{post}}\) is used to correct the ATT estimate for the expected violation. The posterior credible set for the ATT incorporates both sampling uncertainty and the remaining posterior uncertainty about \(\delta_{\text{post}}\).
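
To make the update concrete, here is a minimal sketch of the scalar case, not the paper's implementation. It assumes one pre-period and one post-period coefficient, a mean-zero bivariate normal prior on \((\delta_{\text{pre}}, \delta_{\text{post}})\) with standard deviations `s_pre`, `s_post` and correlation `rho`, independent normal sampling errors with known variances `v_pre`, `v_post`, and a flat prior on the ATT. The function name and all parameter names are hypothetical.

```python
import math


def posterior_att(beta_pre, beta_post, v_pre, v_post, s_pre, s_post, rho, z=1.96):
    """Posterior mean and credible interval for the ATT in a scalar normal model.

    Assumed model (illustrative only):
      beta_pre  = delta_pre + noise,          noise variance v_pre
      beta_post = att + delta_post + noise,   noise variance v_post
      (delta_pre, delta_post) ~ N(0, Sigma) with correlation rho
      flat prior on att, sampling errors independent across periods
    """
    prior_cov = rho * s_pre * s_post
    # Marginal variance of the observed pre-period coefficient.
    var_beta_pre = s_pre ** 2 + v_pre
    # Normal conditioning: posterior of delta_post given beta_pre.
    delta_post_mean = prior_cov / var_beta_pre * beta_pre
    delta_post_var = s_post ** 2 - prior_cov ** 2 / var_beta_pre
    # Correct the post-period coefficient by the expected violation.
    att_mean = beta_post - delta_post_mean
    # Sampling uncertainty plus remaining uncertainty about delta_post.
    att_sd = math.sqrt(v_post + delta_post_var)
    return att_mean, (att_mean - z * att_sd, att_mean + z * att_sd)


if __name__ == "__main__":
    mean, interval = posterior_att(
        beta_pre=0.5, beta_post=2.0, v_pre=0.25, v_post=0.25,
        s_pre=1.0, s_post=1.0, rho=0.8,
    )
    print(mean, interval)
```

With `rho = 0` the pre-trend carries no information about \(\delta_{\text{post}}\), so the point estimate is left uncorrected and the interval only widens by the prior variance of \(\delta_{\text{post}}\).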

The empirical Bayes refinement

Rather than specifying the prior, the EB version learns it from pre-period observations: the distribution of pre-period violations across event-time leads serves as the prior. It draws on methods from the EB literature (Tweedie's formula, James-Stein-style shrinkage) to construct optimal shrinkage estimators.
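
As a rough illustration of learning the prior from the leads, the sketch below uses a simple parametric EB step: a method-of-moments estimate of the prior variance followed by linear shrinkage. This is a stand-in technique in the James-Stein spirit, not the paper's estimator and not Tweedie's formula. It assumes the pre-period violations are i.i.d. mean-zero normal and that each lead's sampling variance is known and positive; the function names are hypothetical.

```python
def eb_prior_variance(beta_pre, v_pre):
    """Method-of-moments estimate of the prior variance of PT violations.

    Assumes beta_pre[k] = delta_k + noise_k with delta_k ~ N(0, sigma2) and
    noise variance v_pre[k], so E[beta_pre[k]**2] = sigma2 + v_pre[k].
    The estimate is truncated at zero.
    """
    if len(beta_pre) != len(v_pre) or not beta_pre:
        raise ValueError("need matching, non-empty coefficient and variance lists")
    if any(v <= 0 for v in v_pre):
        raise ValueError("sampling variances must be positive")
    excess = [b ** 2 - v for b, v in zip(beta_pre, v_pre)]
    return max(0.0, sum(excess) / len(excess))


def eb_shrink(beta_pre, v_pre):
    """Shrink each pre-period coefficient toward zero using the estimated prior."""
    sigma2 = eb_prior_variance(beta_pre, v_pre)
    # Posterior mean under the estimated prior: weight sigma2 / (sigma2 + v).
    return [sigma2 / (sigma2 + v) * b for b, v in zip(beta_pre, v_pre)]


if __name__ == "__main__":
    leads = [0.3, -0.1, 0.4, 0.2]          # hypothetical event-study leads
    variances = [0.01, 0.01, 0.01, 0.01]   # hypothetical squared standard errors
    print(eb_prior_variance(leads, variances))
    print(eb_shrink(leads, variances))
```

The estimated variance could then stand in for the prior scale in a Bayes update. Extending it to post-period violations needs an extra assumption linking the two, and the estimate is noisy when there are only a few leads.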

Comparison to Rambachan-Roth

Rambachan-Roth bounds are minimax: they guarantee coverage under the worst-case violation consistent with a smoothness restriction. Kwon-Roth Bayes/EB procedures are average-case: the estimators minimize expected loss under the prior, and the intervals are calibrated on average over it rather than at every possible violation. When the prior is right, EB is more efficient (tighter intervals); when it's wrong, EB can be biased. The two approaches are complementary, and many practitioners will report both.
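
The average-case versus worst-case distinction can be seen in a small calculation. The sketch below is a toy, not either paper's procedure: it takes a prior-based interval centered at the post-period coefficient with half-width `z * sqrt(v_post + s_post**2)` (a mean-zero normal prior on \(\delta_{\text{post}}\), no pre-period information) and computes its frequentist coverage when the true violation is fixed at `delta`. All names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def coverage_at_fixed_violation(delta, v_post, s_post, z=1.96):
    """Coverage of the prior-based interval when the true violation equals delta.

    The interval covers the ATT exactly when |delta + noise| <= half_width,
    where noise ~ N(0, v_post).
    """
    half_width = z * math.sqrt(v_post + s_post ** 2)
    se = math.sqrt(v_post)
    upper = norm_cdf((half_width - delta) / se)
    lower = norm_cdf((-half_width - delta) / se)
    return upper - lower


if __name__ == "__main__":
    for delta in (0.0, 1.0, 2.0, 3.0):
        print(delta, coverage_at_fixed_violation(delta, v_post=1.0, s_post=1.0))
```

Coverage is above nominal for violations near the prior mean and falls well below it for violations several prior standard deviations out. A worst-case guarantee rules this out over the violations allowed by the restriction, at the price of wider intervals.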

🛠️ Implications for practice

Use Kwon-Roth EB when you have many event-time leads to estimate the prior, and you're willing to accept average-case (not worst-case) coverage.
Use Rambachan-Roth when you want worst-case coverage or when you have few pre-periods.
Report both as robustness when stakes are high.
Companion R implementation available in the supplementary materials.

🧭 Where this sits in the broader DiD literature

Direct alternative to Rambachan-Roth (2023, REStud) on PT sensitivity. Builds on the EB tradition (Efron 2011, Tweedie's formula) and the meta-analysis literature on shrinkage estimation. Cited as a complementary tool in the BCCGS 2026 JEL practitioner's guide. Companion: Kwon-Roth's earlier work on shrinkage in DiD inference.

📥 Read the paper

AEA
arXiv
