Skip to content

Commit 1bfec37

Browse files
igerberclaude
andcommitted
Address PR #350 CI review round 1: P0 constant-dose + P1 staggered-detection
**P0 (blocker):** `_aggregate_multi_period_first_differences` reuses `D_{g, F}` as the single regressor for every event-time horizon. Without validation, panels where a unit's dose varies across post-treatment periods silently misattribute later-horizon effects to the period-F dose. Fix: `_validate_had_panel_event_study` now rejects panels where any unit has time-varying dose across post-periods (within-unit spread beyond float tolerance), with a `ValueError` redirecting to ChaisemartinDHaultfoeuille for genuinely time-varying regimes. **P1:** Staggered-timing auto-filter previously only ran inside `if first_treat_col is not None`. Multi-cohort panels without cohort metadata slipped through, treating later-cohort units as zero-dose "controls" at the inferred F, violating Appendix B.2's last-cohort-only contract. Fix: When `first_treat_col is None`, the validator computes per-unit first-positive-dose period from the dose path. If multiple distinct cohorts are detected, it raises a `ValueError` directing users to pass `first_treat_col` (which activates the last-cohort auto-filter) or use ChaisemartinDHaultfoeuille for full staggered support. **P2 (docs):** Reconciled contradictory REGISTRY guidance between the legacy edge-case note (line ~2251) and the new Phase 2b last-cohort filter note. Both now describe the auto-filter + front-door rejection of un-annotated staggered panels consistently. **P2 (tests):** Added regression tests for both blockers: - `test_time_varying_post_F_dose_rejected` - `test_staggered_without_first_treat_col_rejected` Also added a **Note (Phase 2b constant-dose requirement)** block to REGISTRY documenting the new validator guard. TODO.md entry updated to reflect front-door rejection of time-varying doses (not silent reuse as before). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent bb7bbc6 commit 1bfec37

4 files changed

Lines changed: 139 additions & 3 deletions

File tree

TODO.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -98,7 +98,7 @@ Deferred items from PR reviews that were not addressed before merge.
9898
| `HeterogeneousAdoptionDiD` Phase 3: `qug_test()`, `stute_test()`, `yatchew_hr_test()` pre-test diagnostics (paper Section 3.3). Composite helper `did_had_pretest_workflow()`. Not part of Phase 2a scope. | `diff_diff/had.py`, new module | Phase 2a | Medium |
9999
| `HeterogeneousAdoptionDiD` Phase 4: Pierce-Schott (2016) replication harness; reproduce paper Figure 2 values and Table 1 coverage rates. | `benchmarks/`, `tests/` | Phase 2a | Low |
100100
| `HeterogeneousAdoptionDiD` Phase 5: `practitioner_next_steps()` integration, tutorial notebook, and `llms.txt` updates (preserving UTF-8 fingerprint). | `diff_diff/practitioner.py`, `tutorials/`, `diff_diff/guides/` | Phase 2a | Low |
101-
| `HeterogeneousAdoptionDiD` time-varying dose on event study: Phase 2b uses `D_{g, F}` (first-treatment-period dose) as the single regressor for ALL event-time horizons (paper convention assumes "once treated, stay treated with same dose"). Panels where `D_{g,t}` varies for `t >= F` get the period-F dose used throughout — correct under the constant-dose interpretation but lossy under time-varying regimes. Paper Section 2 scope. | `diff_diff/had.py::_aggregate_multi_period_first_differences` | Phase 2b | Low |
101+
| `HeterogeneousAdoptionDiD` time-varying dose on event study: Phase 2b REJECTS panels where `D_{g,t}` varies within a unit for `t >= F` (the aggregation uses `D_{g, F}` as the single regressor for all horizons, paper Appendix B.2 constant-dose convention). A follow-up PR could add a time-varying-dose estimator for these panels; current behavior is front-door rejection with a redirect to `ChaisemartinDHaultfoeuille`. | `diff_diff/had.py::_validate_had_panel_event_study` | Phase 2b | Low |
102102
| `HeterogeneousAdoptionDiD` repeated-cross-section support: paper Section 2 defines HAD on panel OR repeated cross-section, but Phase 2a is panel-only. RCS inputs (disjoint unit IDs between periods) are rejected by the balanced-panel validator with the generic "unit(s) do not appear in both periods" error. A follow-up PR will add an RCS identification path based on pre/post cell means (rather than unit-level first differences), with its own validator and a distinct `data_mode` / API surface. | `diff_diff/had.py::_validate_had_panel`, `diff_diff/had.py::_aggregate_first_difference` | Phase 2a | Medium |
103103

104104
#### Performance

diff_diff/had.py

Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1116,6 +1116,73 @@ def _validate_had_panel_event_study(
11161116
f"column."
11171117
)
11181118

1119+
# Staggered-without-``first_treat_col`` detection. When cohort metadata
1120+
# is not supplied, the dose-invariant period classification still
1121+
# declares t=F=min-post-period based on "any unit has nonzero dose".
1122+
# That silently accepts staggered panels where units have DIFFERENT
1123+
# first-positive-dose periods: the later-treated cohorts enter
1124+
# ``d_arr`` as zero-dose "controls" at the inferred F, violating
1125+
# paper Appendix B.2's last-cohort-only contract. Compute per-unit
1126+
# first-positive-dose period directly from the dose path and raise
1127+
# if multiple cohorts are present, directing users to pass
1128+
# ``first_treat_col`` (which activates the last-cohort auto-filter)
1129+
# or to use ChaisemartinDHaultfoeuille for full staggered support.
1130+
if first_treat_col is None:
1131+
df_sorted = data_filtered.sort_values([unit_col, time_col])
1132+
# For each unit, the first period at which dose > 0.
1133+
pos_mask_global = df_sorted[dose_col] > 0
1134+
first_pos_per_unit = df_sorted.loc[pos_mask_global].groupby(unit_col)[time_col].first()
1135+
cohort_labels = list(first_pos_per_unit.unique())
1136+
if len(cohort_labels) > 1:
1137+
try:
1138+
distinct_cohorts = sorted(cohort_labels, key=lambda x: (x is None, x))
1139+
except TypeError:
1140+
distinct_cohorts = list(cohort_labels)
1141+
raise ValueError(
1142+
f"Staggered-timing panel detected (first_treat_col is "
1143+
f"None): {len(distinct_cohorts)} distinct first-positive-"
1144+
f"dose periods {distinct_cohorts!r} across units. HAD's "
1145+
f"last-cohort auto-filter (paper Appendix B.2) only runs "
1146+
f"when first_treat_col is supplied so the estimator can "
1147+
f"identify cohorts. Pass first_treat_col=<column> to "
1148+
f"enable the auto-filter to the last cohort, or use "
1149+
f"ChaisemartinDHaultfoeuille (did_multiplegt_dyn) for "
1150+
f"full staggered support."
1151+
)
1152+
1153+
# Constant post-period dose check. Paper Appendix B.2 assumes
1154+
# "once treated, stay treated with the same dose"; the event-study
1155+
# aggregation uses ``D_{g, F}`` as the single regressor for every
1156+
# event-time horizon. Panels where a unit's dose varies across
1157+
# post-periods (e.g., phased adoption, dose changes after F) would
1158+
# silently misattribute later-horizon effects to the period-F dose.
1159+
# Reject front-door with a redirect to ChaisemartinDHaultfoeuille
1160+
# for genuinely time-varying post-treatment doses.
1161+
if len(t_post_list) > 1:
1162+
post_data = data_filtered.loc[post_mask]
1163+
dose_spread_per_unit = post_data.groupby(unit_col)[dose_col].agg(
1164+
lambda x: float(x.max() - x.min())
1165+
)
1166+
abs_max_dose = float(np.max(np.abs(post_doses))) if post_doses.size else 0.0
1167+
tol = 1e-12 * max(1.0, abs_max_dose)
1168+
bad_mask = dose_spread_per_unit > tol
1169+
if bool(bad_mask.any()):
1170+
n_bad = int(bad_mask.sum())
1171+
max_spread = float(dose_spread_per_unit.max())
1172+
raise ValueError(
1173+
f"HAD event-study requires constant dose within unit for "
1174+
f"all post-treatment periods t >= F={F!r}. {n_bad} unit(s) "
1175+
f"have time-varying doses across post-periods "
1176+
f"{t_post_list!r} (max within-unit spread={max_spread!r}, "
1177+
f"tolerance={tol!r}). The aggregation uses D_{{g, F}} as "
1178+
f"the single regressor for every event-time horizon "
1179+
f"(paper Appendix B.2 constant-dose convention), so "
1180+
f"silently accepting time-varying post-treatment doses "
1181+
f"would misattribute later-horizon effects. For genuinely "
1182+
f"time-varying post-treatment doses use "
1183+
f"ChaisemartinDHaultfoeuille (did_multiplegt_dyn)."
1184+
)
1185+
11191186
return F, t_pre_list, t_post_list, data_filtered, filter_info
11201187

11211188

docs/methodology/REGISTRY.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2248,7 +2248,7 @@ so `δ_0` is recovered by OLS of `ΔY` on `X` and `D_2 * X`; Average Slope is `(
22482248
- **Extensive-margin effects**: ruled out by Assumption 3. If a jump `Y_2(0) ≠ Y_2(0+)` is suspected, the target parameter and estimator are not appropriate.
22492249
- **Partial identification of WAS_{d̲}**: only identified up to a positive constant offset `≤ ε` by the bound in Equation 22 (Jensen inequality argument in Appendix C.3).
22502250
- **Density at boundary**: Assumption 4 requires `f_{D_2}(0) > 0`. This is a non-trivial assumption since 0 is on the boundary of `Supp(D_2)`.
2251-
- **Variation in treatment timing**: Appendix B.2 - "in designs with variation in treatment timing, there must be an untreated group, at least till the period where the last cohort gets treated." The implementation errors (hard fail, not warning) on this configuration and redirects users to `ChaisemartinDHaultfoeuille`.
2251+
- **Variation in treatment timing**: Appendix B.2 - "in designs with variation in treatment timing, there must be an untreated group, at least till the period where the last cohort gets treated." In Phase 2b (`aggregate="event_study"`) the implementation auto-filters to the last-treatment cohort plus never-treated units with a `UserWarning` when `first_treat_col` is supplied (see Phase 2b last-cohort filter note below); when `first_treat_col` is omitted the estimator detects multiple first-positive-dose cohorts from the dose path and raises a front-door `ValueError` directing users to pass `first_treat_col` or use `ChaisemartinDHaultfoeuille`.
22522252
- **Mechanical zero at reference period under linear trends (Footnote 13, main text p. 31)**: with industry/unit-specific linear trends, the pre-trends estimator is mechanically zero in the second-to-last pre-period (the slope anchor year). Practical consequence: that year is not an informative placebo check.
22532253

22542254
*Algorithm (Design 1' nonparametric - summarized from Section 3.1.3-3.1.4 and Equations 7-8):*
@@ -2330,7 +2330,8 @@ Shipped as `did_had_pretest_workflow()` and surfaced via `practitioner_next_step
23302330
- **Note (Phase 2a/2b scope):** Phase 2a ships the single-period `aggregate="overall"` path; Phase 2b lifts `aggregate="event_study"` (Appendix B.2 multi-period extension) which returns a `HeterogeneousAdoptionDiDEventStudyResults` with per-event-time WAS estimates and pointwise CIs. `survey=` and `weights=` kwargs raise `NotImplementedError` pointing to the follow-up survey-integration PR.
23312331
- **Note (panel-only):** The paper (Section 2) defines HAD on *panel or repeated cross-section* data, but both the overall and event-study paths ship a panel-only implementation: `HeterogeneousAdoptionDiD.fit()` requires a balanced panel with a unit identifier so that unit-level first differences `ΔY_{g,t} = Y_{g,t} - Y_{g,t_anchor}` can be formed. Repeated-cross-section inputs (disjoint unit IDs between periods) are rejected by the balanced-panel validator. RCS support is queued for a follow-up PR (tracked in `TODO.md`); it will need a separate identification path based on pre/post cell means rather than unit-level differences.
23322332
- [x] Phase 2b: Multi-period event-study extension (Appendix B.2). `aggregate="event_study"` produces per-event-time WAS estimates using a uniform `F-1` baseline (`ΔY_{g,t} = Y_{g,t} - Y_{g,F-1}` for every horizon), reusing the three Phase 2a design paths on per-horizon first differences. Pre-period placebos included for `e <= -2` (the anchor `e = -1` is skipped since `ΔY = 0` trivially). Post-period estimates for `e >= 0`. The joint Stute test (Equation 18) across pre-periods is a SEPARATE diagnostic deferred to Phase 3 (pre-test diagnostics).
2333-
- **Note (Phase 2b last-cohort filter):** When `first_treat_col` indicates more than one nonzero cohort, the panel is auto-filtered to the last-treatment cohort (`F_last = max(cohorts)`) **plus never-treated units** (`first_treat = 0`), with a `UserWarning` naming kept/dropped unit counts and dropped cohort labels. Paper Appendix B.2 is explicit that HAD "may be used only for the LAST treatment cohort in a staggered design"; the auto-filter implements this prescription, retaining never-treated units per the paper's "there must be an untreated group, at least till the period where the last cohort gets treated" requirement. Only earlier-cohort units (with `first_treat > 0` and `< F_last`) are dropped — never-treated units satisfy the dose invariant at every period (`D = 0` throughout) and preserve Design 1' identifiability (boundary at `0`) when last-cohort doses are uniformly positive. Panels without `first_treat_col` with >2 periods infer `F` from the dose invariant (all-zero-dose periods → pre; any-nonzero period → post) and require dose contiguity (pre-periods < post-periods in natural ordering). Non-contiguous dose sequences (e.g., reverse treatment) raise with a pointer to `ChaisemartinDHaultfoeuille`.
2333+
- **Note (Phase 2b last-cohort filter):** When `first_treat_col` indicates more than one nonzero cohort, the panel is auto-filtered to the last-treatment cohort (`F_last = max(cohorts)`) **plus never-treated units** (`first_treat = 0`), with a `UserWarning` naming kept/dropped unit counts and dropped cohort labels. Paper Appendix B.2 is explicit that HAD "may be used only for the LAST treatment cohort in a staggered design"; the auto-filter implements this prescription, retaining never-treated units per the paper's "there must be an untreated group, at least till the period where the last cohort gets treated" requirement. Only earlier-cohort units (with `first_treat > 0` and `< F_last`) are dropped — never-treated units satisfy the dose invariant at every period (`D = 0` throughout) and preserve Design 1' identifiability (boundary at `0`) when last-cohort doses are uniformly positive. When `first_treat_col` is omitted on a >2-period panel, the validator infers each unit's first-positive-dose period from the dose path; if multiple distinct first-positive-dose cohorts are detected, the estimator raises a front-door `ValueError` directing users to pass `first_treat_col` (which activates the auto-filter) or use `ChaisemartinDHaultfoeuille` for full staggered support — there is no silent acceptance of staggered panels without cohort metadata. Common-adoption panels (single first-positive-dose cohort, or only never-treated + one cohort) pass through unchanged with `F` inferred from the dose invariant, and require dose contiguity (pre-periods < post-periods in natural ordering). Non-contiguous dose sequences (e.g., reverse treatment) raise with a pointer to `ChaisemartinDHaultfoeuille`.
2334+
- **Note (Phase 2b constant-dose requirement):** The event-study aggregation uses `D_{g, F}` (first-treatment-period dose) as the single regressor for every event-time horizon, per paper Appendix B.2's "once treated, stay treated with the same dose" convention. The validator REJECTS panels where a unit has time-varying dose across post-treatment periods (`D_{g, t} != D_{g, F}` for any `t >= F` within-unit, beyond float tolerance) with a front-door `ValueError`, directing users with genuinely time-varying post-treatment doses to `ChaisemartinDHaultfoeuille` (`did_multiplegt_dyn`). Silent acceptance would misattribute later-horizon treatment-effect heterogeneity to the period-F dose. A follow-up PR could implement a time-varying-dose estimator; tracked in `TODO.md`.
23342335
- **Note (Phase 2b per-horizon SE):** Each event-time horizon uses an INDEPENDENT sandwich computed on that horizon's first differences: continuous paths use the CCT-2014 robust SE from Phase 1c divided by `|den|`; mass-point path uses the structural-residual 2SLS sandwich from Phase 2a. This produces pointwise CIs per horizon, matching the paper's Pierce-Schott application (Section 5.2, Figure 2: "nonparametric pointwise CIs"). Joint cross-horizon covariance (IF-based stacking or block bootstrap) is NOT computed — the paper does not derive it and all reported CIs are pointwise. Follow-up PRs may add joint covariance for cross-horizon hypothesis tests; current tracking in `TODO.md`.
23352336
- **Note (Phase 2b baseline convention):** All event-time horizons use a uniform `F-1` anchor: `ΔY_{g,t} = Y_{g,t} - Y_{g,F-1}` for every `t`. This is consistent with the paper's Garrett-et-al. application (Section 5.1: "outcome `Y_{g,t} - Y_{g,2001}`" where `F = 2002`), simplifies event-time indexing (`e = t - F` so `e = -1` is the anchor, skipped), and keeps the implementation symmetric for pre- and post-period horizons. The paper review text's asymmetric "`Y_{g,t} - Y_{g,1}` for pre" / "`Y_{g,t} - Y_{g,F-1}` for post" phrasing is covered by the uniform convention since both give the same placebo interpretation under parallel trends (the paper's own applications use the uniform anchor).
23362337
- **Note (Phase 2b result class):** `aggregate="event_study"` returns a new `HeterogeneousAdoptionDiDEventStudyResults` dataclass (distinct from the single-period `HeterogeneousAdoptionDiDResults`) with per-horizon arrays (`event_times`, `att`, `se`, `t_stat`, `p_value`, `conf_int_low`, `conf_int_high`, `n_obs_per_horizon`) and shared metadata. `to_dataframe()` returns a tidy per-horizon DataFrame; `to_dict()` returns a dict with list-of-per-horizon fields. The static return-type annotation on `fit()` is `HeterogeneousAdoptionDiDResults` (the common case); callers passing `aggregate="event_study"` should annotate their variable as `HeterogeneousAdoptionDiDEventStudyResults` for type checkers.

tests/test_had.py

Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2690,6 +2690,74 @@ def test_no_pre_period_rejected(self):
26902690
panel, "outcome", "dose", "period", "unit", aggregate="event_study"
26912691
)
26922692

2693+
def test_time_varying_post_F_dose_rejected(self):
2694+
"""Within-unit dose variation across post-periods raises.
2695+
2696+
Paper Appendix B.2 assumes "once treated, stay treated with the
2697+
same dose"; the aggregation uses ``D_{g, F}`` as the single
2698+
regressor for every horizon. Silent acceptance of time-varying
2699+
post-treatment doses would misattribute later-horizon effects.
2700+
Covers CI reviewer round 1 P0: `_aggregate_multi_period_first_differences`
2701+
would otherwise use period-F dose for all horizons.
2702+
"""
2703+
rng = np.random.default_rng(0)
2704+
G = 50
2705+
rows = []
2706+
for g in range(G):
2707+
d_F = float(rng.uniform(0.1, 0.5))
2708+
d_F_plus_1 = d_F + 0.3 # time-varying: dose changes after F
2709+
for t in range(1, 6):
2710+
if t < 3:
2711+
dose = 0.0
2712+
elif t == 3:
2713+
dose = d_F
2714+
else:
2715+
dose = d_F_plus_1 # different from d_F
2716+
rows.append(
2717+
{
2718+
"unit": g,
2719+
"period": t,
2720+
"dose": dose,
2721+
"outcome": rng.standard_normal(),
2722+
}
2723+
)
2724+
panel = pd.DataFrame(rows)
2725+
with pytest.raises(ValueError, match="constant dose|time-varying"):
2726+
HeterogeneousAdoptionDiD(design="auto").fit(
2727+
panel, "outcome", "dose", "period", "unit", aggregate="event_study"
2728+
)
2729+
2730+
def test_staggered_without_first_treat_col_rejected(self):
2731+
"""Multi-cohort panel without first_treat_col raises (not silent).
2732+
2733+
Without cohort metadata, the dose-invariant period classification
2734+
would silently treat later-cohort units as zero-dose "controls"
2735+
at the inferred F, violating Appendix B.2's last-cohort-only
2736+
contract. Covers CI reviewer round 1 P1.
2737+
"""
2738+
rng = np.random.default_rng(0)
2739+
G = 100
2740+
rows = []
2741+
for g in range(G):
2742+
# Assign cohort: half treat at t=3, half at t=5.
2743+
F_g = 3 if g < G // 2 else 5
2744+
d_g = float(rng.uniform(0.1, 1.0))
2745+
for t in range(1, 7):
2746+
dose = d_g if t >= F_g else 0.0
2747+
rows.append(
2748+
{
2749+
"unit": g,
2750+
"period": t,
2751+
"dose": dose,
2752+
"outcome": rng.standard_normal(),
2753+
}
2754+
)
2755+
panel = pd.DataFrame(rows)
2756+
with pytest.raises(ValueError, match="Staggered-timing|first_treat_col"):
2757+
HeterogeneousAdoptionDiD(design="auto").fit(
2758+
panel, "outcome", "dose", "period", "unit", aggregate="event_study"
2759+
)
2760+
26932761

26942762
class TestEventStudyGuardsPreserved:
26952763
"""Phase 2a policy guards fire on the event-study path too."""

0 commit comments

Comments
 (0)