Commit 1518bbc

igerber and claude committed
Add Phase 2 multi-horizon event study DID_l for ChaisemartinDHaultfoeuille
Implements ROADMAP items 2a-2h: multi-horizon DID_l via per-group DID_{g,l} building block (Eq 3 of dynamic paper), per-horizon analytical SE, dynamic placebos DID^{pl}_l, normalized DID^n_l, cost-benefit aggregate delta, sup-t simultaneous confidence bands, plot_event_study() integration, and R DIDmultiplegtDYN parity tests at multiple horizons.

L_max parameter controls multi-horizon mode; L_max=None preserves exact Phase 1 behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 5533a2b commit 1518bbc

16 files changed

Lines changed: 2342 additions & 98 deletions

CHANGELOG.md

Lines changed: 10 additions & 0 deletions
@@ -21,6 +21,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **`twowayfeweights()`** — standalone helper function for the TWFE decomposition diagnostic (Theorem 1 of de Chaisemartin & D'Haultfœuille 2020), available without instantiating the full estimator. Returns a `TWFEWeightsResult` with per-cell weights, fraction negative, `sigma_fe`, and `beta_fe`.
 - **`generate_reversible_did_data()`** — new generator in `diff_diff.prep` producing reversible-treatment panel data for testing and tutorials. Patterns: `single_switch` (default, A5-safe), `joiners_only`, `leavers_only`, `mixed_single_switch`, `random`, `cycles`, `marketing`. Returns columns `group`, `period`, `treatment`, `outcome`, `true_effect`, `d_lag`, `switcher_type`.
 - **REGISTRY.md `## ChaisemartinDHaultfoeuille` section** — single canonical source for dCDH methodology, equations, edge cases, and all documented deviations from the R `DIDmultiplegtDYN` reference implementation. Cites the AER 2020 paper and the dynamic companion paper (NBER WP 29873) by reference; primary papers are upstream sources, not in-repo files.
+- **Phase 2: Multi-horizon event study for `ChaisemartinDHaultfoeuille`** — adds `L_max` parameter to `fit()` for computing `DID_l` at horizons `l = 1, ..., L_max` using the per-group building block from Equation 3 of the dynamic companion paper. Ships:
+  - Per-horizon point estimates and cohort-recentered analytical SE
+  - Dynamic placebos `DID^{pl}_l` with dual eligibility condition (Web Appendix Section 1.1)
+  - Normalized estimator `DID^n_l = DID_l / delta^D_l` (Section 3.2)
+  - Cost-benefit aggregate `delta` (Section 3.3, Lemma 4) — becomes `overall_att` when `L_max > 1`
+  - Sup-t simultaneous confidence bands via multiplier bootstrap
+  - `plot_event_study()` integration with `<50%` switcher warning for far horizons
+  - `to_dataframe(level="event_study")` and `to_dataframe(level="normalized")` output
+  - Per-horizon bootstrap with bootstrap SE/CI/p-value propagation to event_study_effects
+  - `L_max=None` (default) preserves exact Phase 1 behavior

 ## [3.0.1] - 2026-04-07
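
Note on the sup-t bands bullet in this changelog entry: the idea behind a simultaneous band can be sketched in plain Python. This is a minimal illustration, assuming bootstrap replicates of the per-horizon estimates are already available. The names `sup_t_critical_value` and `sup_t_bands` are hypothetical and are not the package's API, and how the package draws its multiplier-bootstrap replicates is not shown in this diff.

```python
import math


def sup_t_critical_value(boot_draws, estimates, std_errors, alpha=0.05):
    """Empirical (1 - alpha) quantile of the max absolute studentized deviation.

    boot_draws: list of replicates, each a list with one value per horizon.
    estimates, std_errors: per-horizon point estimates and standard errors.
    """
    max_stats = []
    for draw in boot_draws:
        # Largest studentized deviation across horizons for this replicate
        max_stats.append(
            max(abs(b - est) / se
                for b, est, se in zip(draw, estimates, std_errors))
        )
    max_stats.sort()
    # Conservative empirical quantile: smallest value with at least
    # a (1 - alpha) share of replicates at or below it
    k = math.ceil((1.0 - alpha) * len(max_stats)) - 1
    return max_stats[max(k, 0)]


def sup_t_bands(estimates, std_errors, crit):
    """Band estimate +/- crit * se at every horizon, using one shared crit."""
    return [(est - crit * se, est + crit * se)
            for est, se in zip(estimates, std_errors)]
```

Because a single critical value is taken from the maximum over horizons, the resulting band is at least as wide as a pointwise interval built from the same replicates.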

README.md

Lines changed: 27 additions & 1 deletion
@@ -1213,6 +1213,32 @@ ChaisemartinDHaultfoeuille(
 | `n_groups_dropped_crossers`, `n_groups_dropped_singleton_baseline` | Filter counts (multi-switch groups dropped before estimation; singleton-baseline groups excluded from variance) |
 | `n_groups_dropped_never_switching` | Backwards-compatibility metadata. Never-switching groups participate in the variance via stable-control roles; this field is no longer a filter count. |

+**Multi-horizon event study** (Phase 2 - pass `L_max` to `fit()`):
+
+```python
+results = est.fit(data, outcome="outcome", group="group",
+                  time="period", treatment="treatment", L_max=5)
+
+# Per-horizon effects with analytical SE
+for horizon in sorted(results.event_study_effects):
+    e = results.event_study_effects[horizon]
+    print(f" l={horizon}: DID_l={e['effect']:.3f} (SE={e['se']:.3f})")
+
+# Cost-benefit delta (becomes overall_att when L_max > 1)
+print(f"Cost-benefit delta: {results.cost_benefit_delta['delta']:.3f}")
+
+# Normalized effects: DID^n_l = DID_l / l (for binary treatment)
+for horizon in sorted(results.normalized_effects):
+    print(f" DID^n_{horizon} = {results.normalized_effects[horizon]['effect']:.3f}")
+
+# Event study DataFrame (includes placebos as negative horizons)
+df = results.to_dataframe("event_study")
+
+# Plot (integrates with plot_event_study)
+from diff_diff import plot_event_study
+plot_event_study(results)
+```
+
 **Standalone TWFE decomposition diagnostic** (without fitting the full estimator):

 ```python
@@ -1232,7 +1258,7 @@ print(f"sigma_fe (sign-flipping threshold): {diagnostic.sigma_fe:.3f}")

 > **Note:** Phase 1 requires panels with a **balanced baseline** (every group observed at the first global period) and **no interior period gaps**. Late-entry groups (missing the baseline) raise `ValueError`; interior-gap groups are dropped with a warning; terminally-missing groups (early exit / right-censoring) are retained and contribute from their observed periods only. This is a documented deviation from R `DIDmultiplegtDYN`, which supports unbalanced panels — see [`docs/methodology/REGISTRY.md`](docs/methodology/REGISTRY.md) for the rationale, the defensive guards that make terminal missingness safe, and workarounds for unbalanced inputs.

-> **Note:** Survey design (`survey_design`), event-study aggregation (`aggregate`), covariate adjustment (`controls`), and HonestDiD integration (`honest_did`) are not yet supported. They raise `NotImplementedError` with phase pointers - see [`ROADMAP.md`](ROADMAP.md) for the full multi-phase rollout.
+> **Note:** Survey design (`survey_design`), covariate adjustment (`controls`), group-specific linear trends (`trends_linear`), and HonestDiD integration (`honest_did`) are not yet supported. They raise `NotImplementedError` with phase pointers - see [`ROADMAP.md`](ROADMAP.md) for the Phase 3 rollout.

 ### Triple Difference (DDD)
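
Note on the panel-shape requirement in the README hunk above: the three cases it describes (late entry, interior gap, terminal missingness) can be told apart with a small pre-check before calling `fit()`. The sketch below is illustrative only. The function `classify_groups` and its status labels are hypothetical, and it does not reproduce the package's own validation, which raises `ValueError` or warns as the note states.

```python
def classify_groups(observed, all_periods):
    """Classify each group's observation pattern.

    observed: dict mapping group -> set of periods in which it is observed.
    all_periods: sorted list of every period in the panel.
    Returns a dict mapping group -> "ok", "late_entry", or "interior_gap".
    """
    baseline = all_periods[0]
    position = {p: i for i, p in enumerate(all_periods)}
    status = {}
    for group, periods in observed.items():
        if baseline not in periods:
            # Missing the first global period
            status[group] = "late_entry"
            continue
        seen = sorted(position[p] for p in periods)
        # Observed periods must form an unbroken run starting at the baseline;
        # stopping early (terminal missingness) still counts as "ok"
        contiguous = seen == list(range(len(seen)))
        status[group] = "ok" if contiguous else "interior_gap"
    return status
```

A group that exits early is labelled `"ok"` here, which mirrors the note's statement that terminally-missing groups are retained.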

ROADMAP.md

Lines changed: 9 additions & 9 deletions
@@ -148,18 +148,18 @@ The dynamic companion paper subsumes the AER 2020 paper: `DID_1 = DID_M`. The si

 ### Phase 2: Dynamic event study (multiple horizons)

-*Goal: Add `aggregate="event_study"` mode to the same class. Loops the Phase 1 machinery over horizons `l = 1, ..., L`. No API breakage from Phase 1. No new tutorial - the comprehensive tutorial waits for Phase 3.*
+*Goal: Add multi-horizon event study to the same class via the `L_max` parameter. Loops the Phase 1 machinery over horizons `l = 1, ..., L`. No API breakage from Phase 1. No new tutorial - the comprehensive tutorial waits for Phase 3.*

 | Item | Priority | Status |
 |------|----------|--------|
-| **2a.** Multi-horizon `DID_l` via the cohort framework, with horizon parameter `L_max` | HIGH | Not started |
-| **2b.** Multi-horizon analytical SE (same plug-in formula looped over horizons) | HIGH | Not started |
-| **2c.** Dynamic placebos `DID^{pl}_l` for pre-trends testing (Web Appendix Section 1.1 of dynamic paper) | HIGH | Not started |
-| **2d.** Normalized estimator `DID^n_l` (Section 3.2 of dynamic paper) | MEDIUM | Not started |
-| **2e.** Cost-benefit aggregate `delta` (Section 3.3 of dynamic paper, Lemma 4) | MEDIUM | Not started |
-| **2f.** Simultaneous (sup-t) confidence bands for event study plots | MEDIUM | Not started |
-| **2g.** `plot_event_study()` integration; `< 50%`-of-switchers warning for far horizons | MEDIUM | Not started |
-| **2h.** Parity tests vs `did_multiplegt_dyn` for multi-horizon designs | HIGH | Not started |
+| **2a.** Multi-horizon `DID_l` via per-group `DID_{g,l}` building block, with `L_max` parameter | HIGH | Shipped |
+| **2b.** Multi-horizon analytical SE (cohort-recentered plug-in per horizon) | HIGH | Shipped |
+| **2c.** Dynamic placebos `DID^{pl}_l` for pre-trends testing (Web Appendix Section 1.1 of dynamic paper) | HIGH | Shipped |
+| **2d.** Normalized estimator `DID^n_l` (Section 3.2 of dynamic paper) | MEDIUM | Shipped |
+| **2e.** Cost-benefit aggregate `delta` (Section 3.3 of dynamic paper, Lemma 4) | MEDIUM | Shipped |
+| **2f.** Simultaneous (sup-t) confidence bands for event study plots | MEDIUM | Shipped |
+| **2g.** `plot_event_study()` integration; `< 50%`-of-switchers warning for far horizons | MEDIUM | Shipped |
+| **2h.** Parity tests vs `did_multiplegt_dyn` for multi-horizon designs | HIGH | In progress |

 ### Phase 3: Covariates, extensions, and tutorial
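
Note on item 2g: a far-horizon warning of this kind can be sketched as a check on how many switchers still contribute at each horizon. The diff does not show the rule the package applies, so the reference point (the switcher count at horizon 1) and the strict `<` comparison below are assumptions, and `thin_horizons` is a hypothetical name.

```python
import warnings


def thin_horizons(n_switchers_by_horizon, threshold=0.5):
    """Return horizons with fewer than `threshold` of the horizon-1 switchers.

    n_switchers_by_horizon: dict mapping horizon l (1, 2, ...) -> switcher count.
    Assumption: horizon 1 is the reference count.
    """
    base = n_switchers_by_horizon[1]
    thin = [
        horizon
        for horizon, n in sorted(n_switchers_by_horizon.items())
        if n < threshold * base
    ]
    if thin:
        warnings.warn(
            f"Horizons {thin} are estimated on fewer than "
            f"{threshold:.0%} of the horizon-1 switchers."
        )
    return thin
```

Returning the flagged horizons as well as warning lets a caller, for example a plotting routine, mark those horizons visually.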

benchmarks/R/generate_dcdh_dynr_test_values.R

Lines changed: 101 additions & 0 deletions
@@ -287,6 +287,107 @@ scenarios$hand_calculable_worked_example <- list(
   results = extract_dcdh_l1(res5)
 )

+# ---------------------------------------------------------------------------
+# Phase 2: Multi-horizon scenarios (effects > 1)
+# ---------------------------------------------------------------------------
+
+# Helper: extract multi-horizon results from did_multiplegt_dyn output
+extract_dcdh_multi <- function(res, n_effects, n_placebos = 0) {
+  effects <- res$results$Effects
+  if (is.null(effects)) {
+    stop("did_multiplegt_dyn returned no Effects; check the input data")
+  }
+
+  out <- list(effects = list(), placebos = list())
+
+  for (i in seq_len(min(n_effects, nrow(effects)))) {
+    out$effects[[as.character(i)]] <- list(
+      overall_att = as.numeric(effects[i, "Estimate"]),
+      overall_se = as.numeric(effects[i, "SE"]),
+      overall_ci_lo = as.numeric(effects[i, "LB CI"]),
+      overall_ci_hi = as.numeric(effects[i, "UB CI"]),
+      n_switchers = as.numeric(effects[i, "N"])
+    )
+  }
+
+  placebos <- res$results$Placebos
+  if (!is.null(placebos) && n_placebos > 0) {
+    for (i in seq_len(min(n_placebos, nrow(placebos)))) {
+      out$placebos[[as.character(i)]] <- list(
+        effect = as.numeric(placebos[i, "Estimate"]),
+        se = as.numeric(placebos[i, "SE"]),
+        ci_lo = as.numeric(placebos[i, "LB CI"]),
+        ci_hi = as.numeric(placebos[i, "UB CI"])
+      )
+    }
+  }
+
+  out
+}
+
+# Scenario 6: joiners_only multi-horizon (L_max=3, placebo=3)
+# Uses n_periods=8 to give enough room for 3 positive + 3 placebo horizons
+cat(" Scenario 6: joiners_only_multi_horizon\n")
+d6 <- gen_reversible(n_groups = N_GOLDEN, n_periods = 8,
+                     pattern = "joiners_only", seed = 106)
+res6 <- did_multiplegt_dyn(
+  df = d6, outcome = "outcome", group = "group", time = "period",
+  treatment = "treatment", effects = 3, placebo = 3, ci_level = 95
+)
+scenarios$joiners_only_multi_horizon <- list(
+  data = export_data(d6),
+  params = list(pattern = "joiners_only", n_groups = N_GOLDEN, n_periods = 8,
+                seed = 106, effects = 3, placebo = 3, ci_level = 95),
+  results = extract_dcdh_multi(res6, n_effects = 3, n_placebos = 3)
+)
+
+# Scenario 7: leavers_only multi-horizon (L_max=3, placebo=3)
+cat(" Scenario 7: leavers_only_multi_horizon\n")
+d7 <- gen_reversible(n_groups = N_GOLDEN, n_periods = 8,
+                     pattern = "leavers_only", seed = 107)
+res7 <- did_multiplegt_dyn(
+  df = d7, outcome = "outcome", group = "group", time = "period",
+  treatment = "treatment", effects = 3, placebo = 3, ci_level = 95
+)
+scenarios$leavers_only_multi_horizon <- list(
+  data = export_data(d7),
+  params = list(pattern = "leavers_only", n_groups = N_GOLDEN, n_periods = 8,
+                seed = 107, effects = 3, placebo = 3, ci_level = 95),
+  results = extract_dcdh_multi(res7, n_effects = 3, n_placebos = 3)
+)
+
+# Scenario 8: mixed_single_switch multi-horizon (L_max=5, placebo=4)
+# Uses n_periods=10 for far horizons
+cat(" Scenario 8: mixed_single_switch_multi_horizon\n")
+d8 <- gen_reversible(n_groups = N_GOLDEN, n_periods = 10,
+                     pattern = "mixed_single_switch", seed = 108)
+res8 <- did_multiplegt_dyn(
+  df = d8, outcome = "outcome", group = "group", time = "period",
+  treatment = "treatment", effects = 5, placebo = 4, ci_level = 95
+)
+scenarios$mixed_single_switch_multi_horizon <- list(
+  data = export_data(d8),
+  params = list(pattern = "mixed_single_switch", n_groups = N_GOLDEN, n_periods = 10,
+                seed = 108, effects = 5, placebo = 4, ci_level = 95),
+  results = extract_dcdh_multi(res8, n_effects = 5, n_placebos = 4)
+)
+
+# Scenario 9: joiners_only long panel multi-horizon (L_max=5, placebo=5)
+# Uses n_periods=12 and n_groups=80 for thorough coverage
+cat(" Scenario 9: joiners_only_long_multi_horizon\n")
+d9 <- gen_reversible(n_groups = N_GOLDEN, n_periods = 12,
+                     pattern = "joiners_only", seed = 109)
+res9 <- did_multiplegt_dyn(
+  df = d9, outcome = "outcome", group = "group", time = "period",
+  treatment = "treatment", effects = 5, placebo = 5, ci_level = 95
+)
+scenarios$joiners_only_long_multi_horizon <- list(
+  data = export_data(d9),
+  params = list(pattern = "joiners_only", n_groups = N_GOLDEN, n_periods = 12,
+                seed = 109, effects = 5, placebo = 5, ci_level = 95),
+  results = extract_dcdh_multi(res9, n_effects = 5, n_placebos = 5)
+)
+
 # ---------------------------------------------------------------------------
 # Write output
 # ---------------------------------------------------------------------------
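
Note on how these golden values might be consumed on the Python side: `extract_dcdh_multi` stores each horizon under a string key (`"1"`, `"2"`, ...) with fields `overall_att` and `overall_se`, while the README example reads `effect` and `se` from `event_study_effects`. A parity check could compare the two as sketched below. The function `compare_to_golden`, the tolerances, and the assumption that the R list is loaded into a plain dict are all hypothetical; the actual parity tests are not part of this hunk.

```python
import math


def compare_to_golden(python_effects, golden_effects, rel_tol=1e-4, abs_tol=1e-8):
    """List per-horizon mismatches between Python results and R golden values.

    python_effects: dict horizon (int) -> {"effect": float, "se": float}.
    golden_effects: dict horizon (str) -> {"overall_att": float, "overall_se": float}.
    Returns a list of (horizon, field, python_value, r_value) tuples.
    """
    mismatches = []
    for key, ref in golden_effects.items():
        horizon = int(key)  # R list names arrive as strings
        ours = python_effects.get(horizon)
        if ours is None:
            mismatches.append((horizon, "missing", None, None))
            continue
        for r_field, py_field in (("overall_att", "effect"), ("overall_se", "se")):
            if not math.isclose(ours[py_field], ref[r_field],
                                rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches.append((horizon, py_field, ours[py_field], ref[r_field]))
    return mismatches
```

An empty return value means every horizon present in the golden file matched within tolerance; a test would typically assert on that.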
