In Defense of the Pre-Test: Valid Inference when Testing Violations of Parallel Trends for Difference-in-Differences

⚡ TL;DR

A recent response to Roth (2022), "Pretest with Caution." The paper argues that pre-tests should play an important role in DiD. It proposes a conditional extrapolation assumption under which the analyst must verify that pre-trend violations fall below an acceptable level before extrapolating to the post-treatment period. It also provides a theory of valid inference for testing parallel-trends violations that accounts for the post-test conditioning Roth (2022) warned about.

🧩 Setup & motivation

Roth (2022) showed two problems with pre-tests in DiD: (i) standard pre-trend F-tests are underpowered against the violations that actually matter for post-treatment bias; (ii) conditioning publication on passing the pre-test creates post-test bias in inference. The implication: pre-tests should not gate DiD applications.

Lu responds: yes, but pre-tests are still useful if you use them correctly. Specifically, if you define a "tolerable violation" magnitude \(M\) a priori, test against violations larger than \(M\), and correctly account for the conditional inference, then the pre-test plays a valid screening role.

📐 Main results

The conditional extrapolation framework

Lu formalizes the screening logic. The researcher specifies a tolerance \(M\) (in units of pre-trend slope), and the pre-test rejects when the pre-trend exceeds \(M\), rather than when it differs from zero. Under the maintained smoothness assumption, a pre-trend below \(M\) implies that the post-treatment bias is bounded by \(M\) times a known constant.
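The screening rule can be made concrete with a small sketch. This is one simple reading of "reject when the pre-trend exceeds \(M\)", not necessarily the test the paper constructs: it assumes a single estimated pre-trend slope with a normal sampling distribution, and the names `rejects_tolerance`, `post_bias_bound`, and the constant `c` are hypothetical and chosen for illustration only.

```python
from statistics import NormalDist


def rejects_tolerance(slope_hat: float, se: float, M: float, alpha: float = 0.05) -> bool:
    """Return True if the estimated pre-trend is judged to exceed the tolerance M.

    Assumption (not from the paper): a one-sided normal test of
    H0: |slope| <= M, with the critical value taken at the boundary |slope| = M.
    """
    z = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    return abs(slope_hat) > M + z * se


def post_bias_bound(M: float, c: float) -> float:
    """Bound on post-treatment bias: M times a known constant c.

    The constant c stands in for whatever the smoothness assumption implies;
    its value here is purely illustrative.
    """
    return c * M


if __name__ == "__main__":
    M = 0.05
    for slope_hat in (0.04, 0.08):
        if rejects_tolerance(slope_hat, se=0.01, M=M):
            print(f"slope {slope_hat}: tolerance rejected, do not extrapolate")
        else:
            print(f"slope {slope_hat}: passes screen, bias bound {post_bias_bound(M, c=3.0):.3f}")
```

The contrast with a conventional pre-test is the null hypothesis: the comparison point is \(M\), fixed before looking at the data, rather than zero.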

Valid post-test inference

The paper derives confidence intervals for the ATT that are valid conditional on passing the pre-test. The intervals are wider than naive intervals to account for the post-test conditioning, but narrower than the unconditional Rambachan-Roth bounds. The result is a coherent framework where pre-testing is statistically defensible.
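To see why the conditioning matters, a toy Monte Carlo can compare the nominal level of a naive interval with its actual coverage among draws that pass the screen. This is a stylized sketch and not the paper's procedure: it assumes one pre-period and one post-period coefficient, standard errors normalized to 1, a trend violation `delta` that carries over one-for-one into the post-period, and a hypothetical pass rule of the form "estimated pre-trend not significantly above \(M\)".

```python
import random
from statistics import NormalDist


def naive_coverage_given_pass(delta, M, rho=0.0, tau=0.0,
                              n_sims=20000, alpha=0.05, seed=0):
    """Coverage of the naive (1 - alpha) interval for tau among passing draws.

    delta: true pre-trend violation, assumed to equal the post-period bias.
    rho:   correlation between pre- and post-period estimation errors.
    All standard errors are normalized to 1 (an illustrative assumption).
    """
    rng = random.Random(seed)
    z_test = NormalDist().inv_cdf(1 - alpha)      # one-sided pre-test cutoff
    z_ci = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided CI cutoff
    passed = covered = 0
    for _ in range(n_sims):
        e_pre = rng.gauss(0.0, 1.0)
        e_post = rho * e_pre + (1 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        b_pre = delta + e_pre            # estimated pre-trend
        b_post = tau + delta + e_post    # biased post-period estimate
        if abs(b_pre) > M + z_test:      # screen rejects: draw is discarded
            continue
        passed += 1
        if abs(b_post - tau) <= z_ci:    # naive interval b_post +/- z_ci covers tau
            covered += 1
    return covered / passed if passed else float("nan")


if __name__ == "__main__":
    for delta in (0.0, 1.0, 2.0):
        cov = naive_coverage_given_pass(delta=delta, M=1.0, rho=0.5)
        print(f"delta={delta}: naive coverage among passing draws = {cov:.3f}")
```

With `delta=0` and `rho=0` the screen is independent of the post-period error, so conditional coverage should sit near the nominal level. Raising `delta` or `rho` shows how passing the screen and covering the truth become entangled, which is the gap that conditionally valid intervals are meant to close.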

Comparison to Rambachan-Roth

Rambachan-Roth (2023) bounds report a range of estimates consistent with all violations up to magnitude \(M\). Lu's framework conditions on having observed a violation below \(M\) in the pre-period, so the inference is sharper. The two approaches are complementary: RR is unconditional, Lu is conditional on the pre-test result.
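The unconditional logic can be illustrated with a crude worst-case interval. This is a simplified stand-in, not the Rambachan-Roth procedure itself: it assumes only that the absolute post-treatment bias is at most `max_bias` (for example \(M\) times a constant) and widens a normal interval by that amount. The function name and arguments are hypothetical.

```python
from statistics import NormalDist


def worst_case_interval(att_hat: float, se: float, max_bias: float, alpha: float = 0.05):
    """Conservative interval for the ATT when |bias| <= max_bias.

    If the estimator is normal around (ATT + bias) with |bias| <= max_bias,
    then att_hat +/- (max_bias + z * se) covers the ATT with probability
    at least 1 - alpha. This ignores any information from the pre-test.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = max_bias + z * se
    return att_hat - half_width, att_hat + half_width


if __name__ == "__main__":
    lo, hi = worst_case_interval(att_hat=1.0, se=0.1, max_bias=0.2)
    print(f"worst-case interval: [{lo:.3f}, {hi:.3f}]")
```

Because the interval must hold for every violation up to the bound, it does not shrink when the observed pre-trend happens to be small. Conditioning on the pre-test result is what allows the sharper inference described above.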

🛠️ Implications for practice

Pre-tests are not obsolete; they can be statistically valid if you define a tolerance \(M\) a priori and use valid post-test inference.
The pre-registered tolerance \(M\) should be theoretically motivated, not chosen ex-post to maximize the pre-test pass rate.
Researchers can now use pre-tests as a defensible screen, paired with Rambachan-Roth or Kwon-Roth bounds.

🧭 Where this sits in the broader DiD literature

Direct response to Roth (2022, AER: Insights) "Pretest with Caution." Complementary to the Rambachan-Roth (2023, REStud) honest bounds and the Kwon-Roth (2024, AEA P&P) Bayesian approach. Part of the ongoing 2024–2026 debate about how to use pre-trend tests responsibly.

📥 Read the paper

arXiv 2510.26470
