Skip to content

Commit 7c9f7f1

Browse files
authored
Merge pull request #346 from igerber/did-no-untreated
Phase 2a: HeterogeneousAdoptionDiD class (single-period, 3 design paths)
2 parents c76eea3 + 22d7f65 commit 7c9f7f1

6 files changed

Lines changed: 3686 additions & 6 deletions

File tree

TODO.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -89,6 +89,16 @@ Deferred items from PR reviews that were not addressed before merge.
8989
| `bias_corrected_local_linear`: support `weights=` once survey-design adaptation lands. nprobust's `lprobust` has no weight argument so there is no parity anchor; derivation needed. | `diff_diff/local_linear.py`, `diff_diff/_nprobust_port.py::lprobust` | Phase 1c | Medium |
9090
| `bias_corrected_local_linear`: support multi-eval grid (`neval > 1`) with cross-covariance (`covgrid=TRUE` branch of `lprobust.R:253-378`). Not needed for HAD but useful for multi-dose diagnostics. | `diff_diff/_nprobust_port.py::lprobust` | Phase 1c | Low |
9191
| Clustered-DGP parity: Phase 1c's DGP 4 uses manual `h=b=0.3` to sidestep an nprobust-internal singleton-cluster bug in `lpbwselect.mse.dpi`'s pilot fits. Once nprobust ships a fix (or we derive one independently), add a clustered-auto-bandwidth parity test. | `benchmarks/R/generate_nprobust_lprobust_golden.R` | Phase 1c | Low |
92+
| `HeterogeneousAdoptionDiD` Phase 2b: multi-period event-study extension (Appendix B.2). `aggregate="event_study"` raises `NotImplementedError` in Phase 2a; Phase 2b will aggregate per-cohort first-differences into an event-study result with `.event_study_effects` / `.event_study_se` result fields. | `diff_diff/had.py`, `tests/test_had.py` | Phase 2a | Medium |
93+
| `HeterogeneousAdoptionDiD`: survey-design integration (`survey=SurveyDesign(...)`). Currently raises `NotImplementedError`. Requires Taylor-linearization of the β-scale rescaling and replicate-weight-compatible 2SLS variance on the mass-point path. | `diff_diff/had.py` | Phase 2a | Medium |
94+
| `HeterogeneousAdoptionDiD`: `weights=` support. Deferred jointly with survey integration. nprobust's `lprobust` has no weight argument so the nonparametric continuous path needs a derivation; the 2SLS mass-point path needs weighted-sandwich parity. | `diff_diff/had.py` | Phase 2a | Medium |
95+
| `HeterogeneousAdoptionDiD` mass-point: `vcov_type in {"hc2", "hc2_bm"}` raises `NotImplementedError` pending a 2SLS-specific leverage derivation. The OLS leverage `x_i' (X'X)^{-1} x_i` is wrong for 2SLS; the correct finite-sample correction uses `x_i' (Z'X)^{-1} (...) (X'Z)^{-1} x_i`. Needs derivation plus an R / Stata (`ivreg2 small robust`) parity anchor. | `diff_diff/had.py::_fit_mass_point_2sls` | Phase 2a | Medium |
96+
| `HeterogeneousAdoptionDiD` continuous paths: thread `cluster=` through `bias_corrected_local_linear` (Phase 1c's wrapper already supports cluster; Phase 2a ignores it with a `UserWarning` on the continuous path to keep scope tight). | `diff_diff/had.py`, `diff_diff/local_linear.py` | Phase 2a | Low |
97+
| `HeterogeneousAdoptionDiD` Phase 3: `qug_test()`, `stute_test()`, `yatchew_hr_test()` pre-test diagnostics (paper Section 3.3). Composite helper `did_had_pretest_workflow()`. Not part of Phase 2a scope. | `diff_diff/had.py`, new module | Phase 2a | Medium |
98+
| `HeterogeneousAdoptionDiD` Phase 4: Pierce-Schott (2016) replication harness; reproduce paper Figure 2 values and Table 1 coverage rates. | `benchmarks/`, `tests/` | Phase 2a | Low |
99+
| `HeterogeneousAdoptionDiD` Phase 5: `practitioner_next_steps()` integration, tutorial notebook, and `llms.txt` updates (preserving UTF-8 fingerprint). | `diff_diff/practitioner.py`, `tutorials/`, `diff_diff/guides/` | Phase 2a | Low |
100+
| `HeterogeneousAdoptionDiD` staggered-timing reduction: Phase 2a requires exactly 2 time periods and raises on `>2` periods with or without `first_treat_col`. A "last-cohort subgroup" reduction scheme (slice to max-cohort's 2-period window) could lift this in a targeted follow-up PR before full Phase 2b multi-period aggregation. | `diff_diff/had.py::_validate_had_panel` | Phase 2a | Low |
101+
| `HeterogeneousAdoptionDiD` repeated-cross-section support: paper Section 2 defines HAD on panel OR repeated cross-section, but Phase 2a is panel-only. RCS inputs (disjoint unit IDs between periods) are rejected by the balanced-panel validator with the generic "unit(s) do not appear in both periods" error. A follow-up PR will add an RCS identification path based on pre/post cell means (rather than unit-level first differences), with its own validator and a distinct `data_mode` / API surface. | `diff_diff/had.py::_validate_had_panel`, `diff_diff/had.py::_aggregate_first_difference` | Phase 2a | Medium |
92102

93103
#### Performance
94104

diff_diff/__init__.py

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,10 @@
5555
triangular_kernel,
5656
uniform_kernel,
5757
)
58+
from diff_diff.had import (
59+
HeterogeneousAdoptionDiD,
60+
HeterogeneousAdoptionDiDResults,
61+
)
5862
from diff_diff.estimators import (
5963
DifferenceInDifferences,
6064
MultiPeriodDiD,
@@ -252,6 +256,7 @@
252256
EDiD = EfficientDiD
253257
ETWFE = WooldridgeDiD
254258
DCDH = ChaisemartinDHaultfoeuille
259+
HAD = HeterogeneousAdoptionDiD
255260

256261
__version__ = "3.2.0"
257262
__all__ = [
@@ -431,6 +436,10 @@
431436
# Bias-corrected local-linear (Phase 1c for HeterogeneousAdoptionDiD)
432437
"BiasCorrectedFit",
433438
"bias_corrected_local_linear",
439+
# HeterogeneousAdoptionDiD (Phase 2a)
440+
"HeterogeneousAdoptionDiD",
441+
"HeterogeneousAdoptionDiDResults",
442+
"HAD",
434443
# Datasets
435444
"load_card_krueger",
436445
"load_castle_doctrine",

0 commit comments

Comments
 (0)