Commit 16ad99c

igerber and claude committed
Address codex R6 P3s on HAD: trends_lin already shipped + scope wording
- P3 (Methodology): the promoted HAD materials described the Eq. 17/18 `trends_lin=True` linear-trend-detrended variant as "deferred per Phase 4". This conflated two different things:
  (a) the FEATURE, which is shipped via the `trends_lin: bool = False` keyword-only kwarg on HAD.fit(), joint_pretrends_test, and joint_homogeneity_test (PR igerber#389; R-parity locked against DIDHAD::did_had(trends_lin=TRUE) v2.0.0 in test_did_had_parity.py); and
  (b) the PIERCE-SCHOTT NUMERICAL REPLICATION against the published p=0.51 anchor on the LBD-restricted panel, which IS waived per REGISTRY Deviations Note #3.
  Updated 3 surfaces (paper-review L194, METHODOLOGY_REVIEW Eq. 18 Verified-Components row, test_methodology_had.py module docstring + TestHADJointStute class docstring) to distinguish "feature shipped + R-parity locked elsewhere" from "Pierce-Schott numerical replication waived".

- P3 (Documentation/Tests): the TestHADJointStute promotion narrative overstated H1 coverage as "H0 fail-to-reject and H1 reject on linear vs nonlinear DGPs" for both joint_pretrends_test and joint_homogeneity_test. In reality, H1 rejection is tested only on joint_homogeneity_test via a quadratic post-period DGP; joint_pretrends_test gets H0-only coverage in this file (H1 would require a violating-pretrends fixture that re-verifies bootstrap calibration covered by test_had_pretests.py). Narrowed wording in METHODOLOGY_REVIEW Verified-Components row + TestHADJointStute class docstring; CHANGELOG entry unchanged (the H1 reject claim in CHANGELOG explicitly cites the homogeneity side via "H1 reject under nonlinear DGP", which is accurate).

All 35 methodology tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 0fd4564 commit 16ad99c

3 files changed

Lines changed: 26 additions & 7 deletions

METHODOLOGY_REVIEW.md

Lines changed: 1 addition & 1 deletion
@@ -697,7 +697,7 @@ and covariate-adjusted specifications.)
 - [x] Eq. 11 / Theorem 3 (`WAS_{d_lower}` under Assumption 6, mass-point path) — `tests/test_methodology_had.py::TestHADTheorem3MassPoint` (5 tests including Wald-IV closed-form equivalence at `atol=1e-9`)
 - [x] Theorem 4 (QUG null test, limit law `T_λ = (λ + E_1) / E_2` under Exp(1)/Exp(1)) — `tests/test_methodology_had.py::TestHADTheorem4QUG` (6 tests; MC distributional match against closed-form `F(t) = t/(1+t)` at KS-stat ≤ 0.05, n_draws=5000)
 - [x] Eq. 29 / Theorem 7 (Yatchew-HR linearity test, paper-literal `σ²_diff = 1/(2G)` normalization) — `tests/test_methodology_had.py::TestHADTheorem7YatchewHR` (6 tests; standard-normal limit, normalization lock, both `null="linearity"` and `null="mean_independence"` modes)
-- [x] Eq. 18 mean-independence variant (joint Stute pre-trends + homogeneity, sum-of-CvMs + shared-η Mammen wild bootstrap) — `tests/test_methodology_had.py::TestHADJointStute` (5 tests; H0 fail-to-reject and H1 reject on linear vs. nonlinear DGPs). Eq. 18 linear-trend-detrended variant deferred per REGISTRY checklist (Phase 4 follow-up, `trends_lin=True`).
+- [x] Eq. 18 joint Stute pre-trends + homogeneity (sum-of-CvMs + shared-η Mammen wild bootstrap; both mean-independence and linearity nulls) — `tests/test_methodology_had.py::TestHADJointStute` (5 tests). Coverage scope: H0 fail-to-reject on `joint_pretrends_test` (mean-independence) and `joint_homogeneity_test` (linearity); H1 rejection demonstrated on `joint_homogeneity_test` via a nonlinear DGP. **Out of scope for the new methodology file:** the `trends_lin=True` linear-trend-detrended variant is SHIPPED in the library (R-parity locked against `DIDHAD::did_had(..., trends_lin=TRUE)` v2.0.0; see REGISTRY § "Note (Phase 4 — Eq 17 / Eq 18 linear-trend detrending shipped)" and `tests/test_did_had_parity.py`) but its methodology-walk-through tests are NOT duplicated in `test_methodology_had.py`. Pierce-Schott NUMERICAL replication against the published p=0.51 anchor on the LBD-restricted panel is the waived item (REGISTRY Deviations Note #3).
 - [x] R parity (`chaisemartin::did_had`) at `atol=1e-8` on 3 DGPs × 5 method combos (bit-exact, `rtol=0`) — `tests/test_did_had_parity.py::TestPointSEParity` + `TestYatchewParity` (5 direct parity tests; YatchewTest closed-form parity at `atol=1e-10`)
 - [x] `nprobust` (Calonico-Cattaneo-Farrell) port at machine precision (`atol=1e-14`) — `tests/test_nprobust_port.py` (7 classes spanning kernel constants, QR-based `(X'X)^{-1}`, three-stage MSE-DPI bandwidth, clustered variance, weighted local-linear, single-eval-point parity)
 - [x] Bandwidth selector (CCF MSE-DPI) at 1% tolerance — `tests/test_bandwidth_selector.py` (8 classes covering public-API wrapper, stage diagnostics)
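
The Theorem 4 row in this hunk describes a Monte Carlo check of the QUG limit law `T_λ = (λ + E_1) / E_2` against the closed form `F(t) = t/(1+t)` at `λ = 0`. A minimal NumPy sketch of that style of check is given here for orientation. It is not the code in `tests/test_methodology_had.py`; the helper names `qug_limit_cdf` and `ks_distance` are hypothetical, and only the draw count (5000) and the KS threshold (0.05) are taken from the row itself.

```python
import numpy as np


def qug_limit_cdf(t):
    """Closed-form CDF of E_1 / E_2 for independent Exp(1) draws: F(t) = t / (1 + t)."""
    t = np.asarray(t, dtype=float)
    return t / (1.0 + t)


def ks_distance(samples, cdf):
    """Two-sided Kolmogorov-Smirnov distance between a sample and a reference CDF."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    f = cdf(x)
    # Largest gap above and below the reference CDF at the jump points.
    d_plus = np.max(np.arange(1, n + 1) / n - f)
    d_minus = np.max(f - np.arange(0, n) / n)
    return float(max(d_plus, d_minus))


rng = np.random.default_rng(0)
n_draws = 5000                       # draw count quoted in the checklist row
lam = 0.0                            # lambda = 0 under H_0: d_lower = 0
e1 = rng.exponential(1.0, n_draws)   # E_1 ~ Exp(1)
e2 = rng.exponential(1.0, n_draws)   # E_2 ~ Exp(1), independent of E_1
t_null = (lam + e1) / e2

ks = ks_distance(t_null, qug_limit_cdf)
print(f"KS distance to t/(1+t): {ks:.4f}")
```

With 5000 draws the KS distance is typically on the order of 0.01, so a 0.05 threshold leaves a wide margin against Monte Carlo noise.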

docs/methodology/papers/dechaisemartin-2026-review.md

Lines changed: 1 addition & 1 deletion
@@ -191,7 +191,7 @@ Alternative to Stute when `G` is large or heteroskedasticity is suspected.
 - [ ] Warnings for extensive-margin effects / positive mass of untreated (not fatal; suggests running existing DiD). **Status 2026-05-20 (partial):** `qug_test()` filters zero-dose observations upfront with a `UserWarning` naming the exclusion count — surfaces the *presence* of extensive-margin / positive-mass-of-untreated units to users running pre-tests. The paper-language "suggests running existing DiD" recommendation is NOT a separate fit-time warning on the main `HeterogeneousAdoptionDiD.fit()` path; this item remains open as a Low-priority follow-up tracked in `TODO.md`.
 - [x] Documentation of non-testability of Assumptions 5 and 6. **Phase 4 closure (2026-05-20):** `HeterogeneousAdoptionDiD.fit()` emits a `UserWarning` at fit time when `resolved_design ∈ {continuous_near_d_lower, mass_point}` (Design 1 family) explicitly flagging that point identification of `WAS_{d_lower}` requires Assumption 6, sign identification requires Assumption 5, and NEITHER is testable via pre-trends (`diff_diff/had.py`, search for "---- Assumption 5/6 warning on Design 1 paths ----"). The `HeterogeneousAdoptionDiD` class docstring + `qug_test` / `stute_test` / `yatchew_hr_test` / `did_had_pretest_workflow` Notes sections cross-reference this and explicitly state that the available pre-tests verify ADJACENT identifying conditions: QUG tests the Theorem 4 / Design 1' support-infimum null `d_lower = 0` — adjacent evidence on the `d_lower = 0` clause of Assumption 4 only, NOT a test of full Assumption 4's boundary-density / conditional-mean smoothness / variance regularity statement; the raw `stute_test` / `yatchew_hr_test` helpers test Assumption 8 linearity (residuals from `dy ~ 1 + d`); `joint_pretrends_test` tests Assumption 7 mean-independence (intercept-only residuals via `null_form="mean_independence"`). None of these test Assumptions 5 or 6 directly. The composite workflow verdict string does NOT mention Assumptions 5 or 6 — it only flags the Assumption 7 step-2 gap on the two-period `aggregate="overall"` path. The Assumption 5/6 caveat is surfaced separately by the Design 1 fit-time `UserWarning` and by T21 tutorial prose.
 - [x] Multi-period event-study extension (Appendix B.2). **Phase 2b implementation (2026-04):** `aggregate="event_study"` returns per-event-time WAS estimates using uniform `F-1` anchor. Staggered-timing contract (see L190 closure for full statement): when `first_treat_col` is supplied, the panel auto-filters to last-cohort + never-treated units with a `UserWarning` per Appendix B.2 prescription; when omitted on a multi-cohort panel, the estimator raises `ValueError` (fail-closed, see REGISTRY § "Library extension: Staggered-timing fail-closed"). Pointwise CIs per horizon (no joint cross-horizon covariance; matches paper's Pierce-Schott Figure 2). Pre-period placebos at `e <= -2`; the anchor `e = -1` is skipped since `ΔY = 0` there by construction.
-- [x] Joint Stute tests (paper Section 4.2 step 2 + Section 4.3 joint extension, pages 23-25 + 32). **Phase 3 follow-up (2026-04):** `stute_joint_pretest()` (residuals-in core) + `joint_pretrends_test()` (mean-independence null) + `joint_homogeneity_test()` (linearity null) in `diff_diff/had_pretests.py`. Sum-of-CvMs aggregation, shared-η Mammen wild bootstrap across horizons (Delgado-Manteiga 2001), per-horizon exact-linear short-circuit. Paper Eq (18) linear-trend detrending variant (Section 5.2 Pierce-Schott p=0.51) deferred to Phase 4 replication harness where the published value serves as parity anchor.
+- [x] Joint Stute tests (paper Section 4.2 step 2 + Section 4.3 joint extension, pages 23-25 + 32). **Phase 3 follow-up (2026-04):** `stute_joint_pretest()` (residuals-in core) + `joint_pretrends_test()` (mean-independence null) + `joint_homogeneity_test()` (linearity null) in `diff_diff/had_pretests.py`. Sum-of-CvMs aggregation, shared-η Mammen wild bootstrap across horizons (Delgado-Manteiga 2001), per-horizon exact-linear short-circuit. **Eq (18) linear-trend detrending variant SHIPPED (PR #389):** the `trends_lin: bool = False` keyword-only kwarg on `HeterogeneousAdoptionDiD.fit(aggregate="event_study")`, `joint_pretrends_test`, and `joint_homogeneity_test` applies the per-group linear-trend slope `Y[g, F-1] - Y[g, F-2]` adjustment. R parity validated against `DIDHAD::did_had(..., trends_lin=TRUE)` v2.0.0 (`Credible-Answers/did_had`) — see REGISTRY § "Note (Phase 4 — Eq 17 / Eq 18 linear-trend detrending shipped)". The Pierce-Schott (2016) NUMERICAL REPLICATION against the published p=0.51 anchor on the LBD-restricted panel is waived per REGISTRY Deviations Note #3.

 **Eq (18) transcription (paper page 31):** The Pierce-Schott linear-trend-detrended joint Stute test of pre-trends reads
 ```

tests/test_methodology_had.py

Lines changed: 24 additions & 5 deletions
@@ -20,8 +20,13 @@
 - Theorem 4 (QUG): T_lambda = (lambda + E_1) / E_2 limit law, lambda=0
 under H_0: d_lower = 0
 - Eq. 18 / (Algorithm): joint Stute pre-trends + homogeneity
-(mean-independence variant; Eq. 18 detrending
-deferred per REGISTRY checklist)
+(mean-independence and linearity nulls).
+The trends_lin=True linear-trend-detrended
+variant is shipped in the library (R-parity
+locked against DIDHAD::did_had(trends_lin=TRUE)
+in tests/test_did_had_parity.py) but is
+OUT OF SCOPE for this methodology file (no
+coverage duplication).
 - Eq. 29 / Theorem 7: T_hr = sqrt(G) (sigma2_lin - sigma2_diff) / sigma2_W

 See:
@@ -701,9 +706,23 @@ class TestHADJointStute:
 The library ships the mean-independence variant in
 ``joint_pretrends_test`` (residuals from OLS Y_t - Y_base ~ 1) and
 the linearity (homogeneity) variant in ``joint_homogeneity_test``
-(residuals from OLS Y_t - Y_base ~ 1 + D). The Eq. 18
-linear-trend-detrended variant is deferred per REGISTRY (Phase 4
-follow-up); this class targets the shipped mean-independence variant.
+(residuals from OLS Y_t - Y_base ~ 1 + D).
+
+**Coverage scope of this class:** H0 fail-to-reject is exercised
+for both ``joint_pretrends_test`` (mean-independence null) and
+``joint_homogeneity_test`` (linearity null) on a linear-DGP panel
+where D is independent of pre-Y; H1 rejection is demonstrated on
+``joint_homogeneity_test`` only, via a nonlinear (D + D^2) post-
+period DGP. An H1 violating-pretrends test for
+``joint_pretrends_test`` is not added here (a synthetic
+correlated-D-vs-pre-Y DGP would re-verify the bootstrap
+calibration covered by ``test_had_pretests.py``).
+
+The ``trends_lin=True`` Eq. 17 / Eq. 18 linear-trend-detrended
+variant is SHIPPED in the library and R-parity-locked against
+``DIDHAD::did_had(..., trends_lin=TRUE)`` in
+``tests/test_did_had_parity.py`` (3 DGPs x 5 method combos at
+``atol=1e-8``). It is OUT OF SCOPE for this methodology file.
 """

 def _build_multi_period_panel(
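
The docstring in this hunk describes two nulls tested through residuals (`Y_t - Y_base ~ 1` for mean-independence, `Y_t - Y_base ~ 1 + D` for linearity), aggregated as a sum of per-horizon CvM statistics with a Mammen wild bootstrap whose weights are shared across horizons. The sketch below illustrates that general recipe under stated assumptions. It is not the implementation in `diff_diff/had_pretests.py`: the function names, the `(1 + count) / (B + 1)` p-value convention, and the simulated panel are all hypothetical, and the dose is assumed continuous (no ties).

```python
import numpy as np


def cvm_statistic(resid, dose):
    """Stute-type CvM statistic (1/G^2) * sum_i (sum_{j: D_j <= D_i} e_j)^2.

    Uses a cumulative sum over dose-sorted residuals, which matches the
    indicator form when the dose has no ties.
    """
    order = np.argsort(dose, kind="stable")
    cusum = np.cumsum(resid[order])
    return float(np.sum(cusum ** 2) / resid.size ** 2)


def mammen_weights(rng, size):
    """Mammen two-point weights (mean 0, variance 1)."""
    s5 = np.sqrt(5.0)
    p_low = (s5 + 1.0) / (2.0 * s5)
    return np.where(rng.random(size) < p_low, -(s5 - 1.0) / 2.0, (s5 + 1.0) / 2.0)


def joint_stute_pvalue(dy, dose, design, n_boot=499, seed=0):
    """Sum-of-CvMs statistic and wild-bootstrap p-value.

    dy:     (G, H) outcome changes, one column per horizon
    design: (G, k) regressors under the null (ones only, or ones and dose)
    """
    rng = np.random.default_rng(seed)
    n_units, n_horizons = dy.shape
    coef, *_ = np.linalg.lstsq(design, dy, rcond=None)
    fitted = design @ coef
    resid = dy - fitted
    observed = sum(cvm_statistic(resid[:, h], dose) for h in range(n_horizons))

    exceed = 0
    for _ in range(n_boot):
        eta = mammen_weights(rng, n_units)        # one weight per unit ...
        dy_star = fitted + resid * eta[:, None]   # ... shared across all horizons
        coef_s, *_ = np.linalg.lstsq(design, dy_star, rcond=None)
        resid_s = dy_star - design @ coef_s
        stat = sum(cvm_statistic(resid_s[:, h], dose) for h in range(n_horizons))
        exceed += int(stat >= observed)
    return observed, (1 + exceed) / (n_boot + 1)


rng = np.random.default_rng(1)
G = 200
dose = rng.uniform(0.0, 1.0, G)

# Pre-period changes unrelated to dose: the mean-independence null holds.
dy_pre = rng.normal(0.0, 1.0, (G, 2))
stat_pre, p_pre = joint_stute_pvalue(dy_pre, dose, np.ones((G, 1)))

# Post-period changes quadratic in dose: the linearity null is violated.
dy_post = (dose + 10.0 * dose ** 2)[:, None] + 0.5 * rng.normal(0.0, 1.0, (G, 2))
linear_design = np.column_stack([np.ones(G), dose])
stat_post, p_post = joint_stute_pvalue(dy_post, dose, linear_design)

print(f"pre-trends (mean-independence): stat={stat_pre:.3f}, p={p_pre:.3f}")
print(f"homogeneity (linearity):        stat={stat_post:.3f}, p={p_post:.3f}")
```

Drawing one weight per unit and reusing it in every horizon keeps the cross-horizon dependence of a unit's residuals intact in each bootstrap replicate, which is the reason a joint statistic is bootstrapped jointly and not horizon by horizon.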
