
Commit 4d71c59

igerber and claude committed
Fix Round 6: align IF/SE eligibility with DID path, document l=1 and delta contracts
- IF/SE/bootstrap paths now use the same combined eligibility mask as _compute_multi_horizon_dids (singleton-baseline + empty-control-pool exclusions), so point estimate and inference agree on terminal-missing panels
- Add REGISTRY Note documenting that event_study_effects[1] uses per-group DID_{g,1} (cohort-based controls) when L_max >= 2, which may differ from Phase 1 DID_M (period-based controls) on mixed-direction panels
- Add REGISTRY Note documenting that delta SE uses delta-method (normal-theory) even when bootstrap is enabled, as an intentional exception to the bootstrap-inference-surface contract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent a7f3299 commit 4d71c59

2 files changed

Lines changed: 16 additions & 6 deletions

diff_diff/chaisemartin_dhaultfoeuille.py

Lines changed: 13 additions & 5 deletions
@@ -1093,8 +1093,13 @@ def fit(
         unique_c[key] = len(unique_c)
     cid_l[g] = unique_c[key]

-U_l_elig = U_l[eligible_mask_var]
-cid_elig = cid_l[eligible_mask_var]
+# Combine singleton-baseline exclusion with the finalized
+# eligible_mask from _compute_multi_horizon_dids (which
+# excludes groups with empty control pools).
+did_eligible = multi_horizon_dids[l_h]["eligible_mask"]
+combined_mask = eligible_mask_var & did_eligible
+U_l_elig = U_l[combined_mask]
+cid_elig = cid_l[combined_mask]
 U_centered_l = _cohort_recenter(U_l_elig, cid_elig)
 N_l_h = multi_horizon_dids[l_h]["N_l"]
 se_l = _plugin_se(U_centered=U_centered_l, divisor=N_l_h)
@@ -1356,7 +1361,10 @@ def fit(
 if h_data is None or h_data["N_l"] == 0:
     continue
 U_l_full = multi_horizon_if[l_h]
-U_l_elig = U_l_full[eligible_mask_b]
+# Use same combined mask as analytical SE path
+did_eligible_b = h_data["eligible_mask"]
+combined_b = eligible_mask_b & did_eligible_b
+U_l_elig = U_l_full[combined_b]
 # Use the same cohort IDs as the analytical SE path
 cohort_keys_b = [
     (
@@ -1369,14 +1377,14 @@ def fit(
 unique_cb: Dict[Tuple[int, int, int], int] = {}
 cid_b = np.zeros(len(all_groups), dtype=int)
 for g in range(len(all_groups)):
-    if not eligible_mask_b[g]:
+    if not combined_b[g]:
         cid_b[g] = -1
         continue
     key = cohort_keys_b[g]
     if key not in unique_cb:
         unique_cb[key] = len(unique_cb)
     cid_b[g] = unique_cb[key]
-cid_elig = cid_b[eligible_mask_b]
+cid_elig = cid_b[combined_b]
 U_centered_h = _cohort_recenter(U_l_elig, cid_elig)
 mh_boot_inputs[l_h] = (
     U_centered_h,
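
The mask change in the hunks above can be illustrated with a small, self-contained sketch. It is not the library's code: `cohort_recenter` is a hypothetical stand-in that assumes `_cohort_recenter` subtracts each cohort's mean (the diff does not show its body), and the arrays are invented toy values. The point it shows is that influence-function values and cohort IDs must be subset with the same combined mask, so a group excluded by either rule drops out of both.

```python
import numpy as np


def cohort_recenter(u, cohort_ids):
    """Hypothetical stand-in: subtract each cohort's mean from its members."""
    u = np.asarray(u, dtype=float)
    cohort_ids = np.asarray(cohort_ids)
    centered = u.copy()
    for c in np.unique(cohort_ids):
        members = cohort_ids == c
        centered[members] -= u[members].mean()
    return centered


# Toy per-group influence-function values and cohort IDs (invented data).
U_l = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
cid_l = np.array([0, 0, 1, 1, 1])

# Group 4 fails the singleton-baseline rule; group 2 has an empty control pool.
eligible_mask_var = np.array([True, True, True, True, False])
did_eligible = np.array([True, True, False, True, True])

# A group must pass both rules to enter the variance computation.
combined_mask = eligible_mask_var & did_eligible

# Subset values and cohort IDs with the same mask so they stay aligned.
U_centered = cohort_recenter(U_l[combined_mask], cid_l[combined_mask])
n_used = int(combined_mask.sum())
```

With only `eligible_mask_var`, group 2 would still contribute to the recentered values even though the point estimate had dropped it, which is the mismatch the commit message describes.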

docs/methodology/REGISTRY.md

Lines changed: 3 additions & 1 deletion
@@ -533,13 +533,15 @@ Cost-benefit aggregate `delta = sum_l w_l * DID_l` (Lemma 4) where `w_l` are non

 Dynamic placebos `DID^{pl}_l` look backward from each group's reference period, with a dual eligibility condition: `F_g - 1 - l >= 1` AND `F_g - 1 + l <= T_g`.

+- **Note (Phase 2 `DID_1` vs Phase 1 `DID_M`):** When `L_max >= 2`, `event_study_effects[1]` uses the per-group `DID_{g,1}` building block (Equation 3 of the dynamic paper) with cohort-based controls, which may differ slightly from the Phase 1 `DID_M` value (Theorem 3 of AER 2020 with period-based stable-control sets). The Phase 1 `DID_M` value remains accessible via `fit(..., L_max=None).overall_att`. The difference arises because the per-group path conditions on baseline treatment `D_{g,1}` when selecting controls, while the per-period path does not. On pure-direction panels (all joiners or all leavers) the two agree; on mixed-direction panels they can differ by O(1%). This is the same period-vs-cohort control-set deviation documented in the Phase 1 Note above, extended to the `l=1` event-study entry.
+
 - **Note (Phase 2 equal-cell weighting, deviation from R `DIDmultiplegtDYN`):** The Phase 1 equal-cell weighting contract carries forward to all Phase 2 estimands (`DID_l`, `DID^{pl}_l`, `DID^n_l`, `delta`). Each `(g, t)` cell contributes equally regardless of within-cell observation count. On individual-level inputs with uneven cell sizes, this produces a different estimand than R `DIDmultiplegtDYN` which weights by cell size. The parity tests use one-observation-per-cell generators so parity holds. See the Phase 1 weighting Note above for the full rationale.

 - **Note (Phase 2 `<50%` switcher warning):** When fewer than 50% of the l=1 switchers contribute at a far horizon l, `fit()` emits a `UserWarning`. The paper recommends not reporting such horizons (Favara-Imbs application, footnote 14).

 - **Note (Phase 2 Assumption 7 and cost-benefit delta):** Assumption 7 (`D_{g,t} >= D_{g,1}`) is required for the single-sign cost-benefit interpretation. When leavers are present (binary: 1->0 groups violate Assumption 7), the estimator emits a `UserWarning` and provides `delta_joiners` / `delta_leavers` separately on `results.cost_benefit_delta`.

-- **Note (Phase 2 cost-benefit delta SE):** When `L_max >= 2`, `overall_att` holds the cost-benefit `delta`. Its SE is computed via the delta method from per-horizon SEs: `SE(delta) = sqrt(sum w_l^2 * SE(DID_l)^2)`, treating horizons as independent (conservative under Assumption 8). When bootstrap is enabled, per-horizon bootstrap SEs flow through the delta-method formula.
+- **Note (Phase 2 cost-benefit delta SE):** When `L_max >= 2`, `overall_att` holds the cost-benefit `delta`. Its SE is computed via the delta method from per-horizon SEs: `SE(delta) = sqrt(sum w_l^2 * SE(DID_l)^2)`, treating horizons as independent (conservative under Assumption 8). When bootstrap is enabled, per-horizon bootstrap SEs flow through the delta-method formula, so `overall_se` reflects bootstrap-derived per-horizon uncertainty but the delta aggregation itself uses normal-theory (not bootstrap percentile). This is an intentional exception to the general bootstrap-inference-surface contract: `overall_p_value` and `overall_conf_int` for `delta` use `safe_inference(delta, delta_se)`, not percentile bootstrap, because the delta is a derived aggregate rather than a directly bootstrapped estimand.

 - **Note (Phase 2 dynamic placebo SE):** Dynamic placebos `DID^{pl}_l` (negative horizons in `placebo_event_study`) ship as point estimates with `NaN` inference in Phase 2. The placebo influence-function derivation follows the same cohort-recentered structure as the positive horizons but requires a separate IF computation for the backward outcome differences, which is deferred. The placebo point estimates are meaningful for visual pre-trends inspection; formal placebo inference will be added in a follow-up. Bootstrap placebo inference plumbing exists in the mixin but is not wired.
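
The delta SE note in this hunk can be made concrete with a short sketch of the stated formula, `SE(delta) = sqrt(sum w_l^2 * SE(DID_l)^2)`, followed by normal-theory inference. This is an illustration under assumptions rather than the package's implementation: `delta_method_inference` is a hypothetical name, the body of `safe_inference` is not shown in the diff, and the weights, effects, and SEs are invented. The per-horizon SEs may be analytical or bootstrap-derived; the aggregation step is the same either way.

```python
import math
from statistics import NormalDist


def delta_method_inference(weights, effects, ses, alpha=0.05):
    """Hypothetical helper: aggregate per-horizon effects and SEs into delta."""
    # Point estimate: weighted sum of the per-horizon effects.
    delta = sum(w * d for w, d in zip(weights, effects))
    # Delta-method SE, treating the horizons as independent.
    delta_se = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, ses)))
    if not delta_se > 0:
        # No usable SE: return NaN inference rather than dividing by zero.
        nan = float("nan")
        return delta, delta_se, nan, (nan, nan)
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Two-sided normal-theory p-value and confidence interval.
    p_value = 2 * (1 - normal.cdf(abs(delta / delta_se)))
    conf_int = (delta - z_crit * delta_se, delta + z_crit * delta_se)
    return delta, delta_se, p_value, conf_int


# Invented example: two horizons with equal weights.
weights = [0.5, 0.5]
effects = [2.0, 4.0]
ses = [0.6, 0.8]
delta, delta_se, p_value, conf_int = delta_method_inference(weights, effects, ses)
```

Because the interval is built from `delta_se` and a normal quantile, it is symmetric around `delta`. A percentile bootstrap interval need not be symmetric, which is the difference the note flags as an intentional exception.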
