Commit 6d6e950

igerber and claude committed
Address PR #402 R1 review (1 P1, 4 P2)
P1 methodology fix: Step-4 routing in _handle_had + _handle_had_event_study no longer says "switch away from HAD if untreated units exist" - that contradicts REGISTRY § HeterogeneousAdoptionDiD edge cases (line 2403: "Authors do NOT require untreated units to be dropped"; line 2408 + had.py:1325: never-treated units RETAINED on staggered event-study). Reframed as the actual estimand differentiator: HAD targets WAS at the dose support boundary; ContinuousDiD targets per-dose ATT(d) / ACRT(d) and requires never-treated controls. Routing fires only when the user wants the ATT(d) estimand AND has never-treated controls, not on "untreated units exist". Tightens the corresponding Choosing-an-Estimator table row to surface WAS vs ATT(d) as the differentiator.

P2 (a) signatures: llms-full.txt HAD constructor + fit() blocks now match the actual HeterogeneousAdoptionDiD.__init__ / .fit signatures exactly. Drops invented kwargs (h, b, rcond) and adds the real ones (d_lower, kernel, vcov_type, robust, cluster). aggregate default corrected from None to "overall". fit() now lists survey, weights, cband (positional-or-keyword) and survey_design + trends_lin (keyword-only).

P2 (b) snippet bugs: result.bandwidth_diagnostics -> results.bandwidth_diagnostics (matching the plural convention of other handlers); sup-t snippet now imports SurveyDesign and constructs sd before passing survey_design=sd (was survey_design=design with no design defined).

P2 (c) tests: New TestLLMsFullHADCoverage tests use inspect.signature(HeterogeneousAdoptionDiD.__init__) and .fit() to regress the documented signatures against the real API. New test_llms_full_had_section_methodology_compatible_with_untreated locks the negative assertion that the section does NOT carry framing contradicting the registry. Practitioner tests gain test_had_step_4_does_not_misframe_untreated_unit_routing + test_had_handler_snippets_are_valid_python_syntax (catches snippet syntax errors via ast.parse) + test_handle_continuous_step_4_snippet_is_valid_python.

83 tests pass (47 in test_practitioner including 5 new + 36 in test_guides including 9 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
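The P2 (c) test mechanics (signature regression via inspect.signature, snippet syntax gating via ast.parse) can be sketched against a stand-in. The `fit` stub below is a hypothetical mirror of the documented shape, not the real `HeterogeneousAdoptionDiD.fit`:

```python
import ast
import inspect

# Hypothetical stand-in mirroring the documented fit() shape:
# survey / weights / cband positional-or-keyword, survey_design /
# trends_lin keyword-only behind the bare *.
def fit(data, outcome_col, dose_col, time_col, unit_col,
        first_treat_col=None, aggregate="overall",
        survey=None, weights=None, cband=True,
        *, survey_design=None, trends_lin=False):
    pass

sig = inspect.signature(fit)
# Regress documented defaults and parameter kinds against the signature:
assert sig.parameters["aggregate"].default == "overall"
assert sig.parameters["cband"].kind is inspect.Parameter.POSITIONAL_OR_KEYWORD
assert sig.parameters["survey_design"].kind is inspect.Parameter.KEYWORD_ONLY
assert sig.parameters["trends_lin"].kind is inspect.Parameter.KEYWORD_ONLY

# Snippet syntax gate in the spirit of the ast.parse tests: every
# documented snippet must at least parse as valid Python.
snippet = "results.bandwidth_diagnostics  # None on mass_point"
ast.parse(snippet)  # raises SyntaxError on a malformed snippet
```

The same pattern scales to the real API: swap the stub for `HeterogeneousAdoptionDiD.fit` and assert against the strings extracted from `llms-full.txt`.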
1 parent d152b50 commit 6d6e950

5 files changed

Lines changed: 237 additions & 72 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
-- **HAD `practitioner_next_steps()` handler + `llms-full.txt` reference section** (Phase 5). Adds `_handle_had` and `_handle_had_event_study` to `diff_diff/practitioner.py::_HANDLERS`, routing both `HeterogeneousAdoptionDiDResults` (single-period) and `HeterogeneousAdoptionDiDEventStudyResults` (event-study) through HAD-specific Baker et al. (2025) step guidance: `did_had_pretest_workflow` (step 3 — paper Section 4.2 step-2 closure on the event-study path), `ContinuousDiD` / `CallawaySantAnna` routing nudge (step 4 — fires on the wrong-estimator-for-this-data path), `bandwidth_diagnostics` inspection on continuous designs and simultaneous (sup-t) `cband_*` reading on weighted event-study fits (step 6), per-horizon WAS event-study disaggregation (step 7), and the explicit design-auto-detection / last-cohort-only-WAS framing (step 8). Symmetric pair: `_handle_continuous` gains a Step-4 nudge to `HeterogeneousAdoptionDiD` for ContinuousDiD users on no-untreated panels — the routing loop is now bidirectional. Extends `_check_nan_att` with an ndarray branch via lazy `numpy` import for HAD's per-horizon `att` array; uses `np.all(np.isnan(arr))` semantics so partial-NaN arrays (legitimate event-study output under degenerate horizon-specific designs) do not over-fire the warning. Scalar path is bit-exact preserved across all 12 untouched handlers. Adds full HAD section + `HeterogeneousAdoptionDiDResults` / `HeterogeneousAdoptionDiDEventStudyResults` blocks + `## HAD Pretests` index covering all 7 pretest entry points + Choosing-an-Estimator row to `diff_diff/guides/llms-full.txt` (the bundled-in-wheel agent reference). Tightens the existing `Continuous treatment intensity` Choosing row to `(some units untreated)` so the contrast with the new HAD row is explicit. Framing convention follows the "no untreated unit" / dose variation language; locked by negative-assertion tests on both the handler text and the `llms-full.txt` HAD section. `docs/doc-deps.yaml` updated to remove the `llms-full.txt` deferral note on `had.py` and add `llms-full.txt` entries to `had.py`, `had_pretests.py`, and `practitioner.py` blocks. Patch-level (additive on stable surfaces). 21 new tests (14 in `tests/test_practitioner.py::TestHADDispatch` + 6 in `tests/test_guides.py::TestLLMsFullHADCoverage` + 1 fixture-minimality regression locking the "handlers are STRING-ONLY at runtime" stability invariant). Closes the Phase 5 "agent surfaces" gap; T21 pretest tutorial and T22 weighted/survey tutorial remain queued as separate notebook PRs.
+- **HAD `practitioner_next_steps()` handler + `llms-full.txt` reference section** (Phase 5). Adds `_handle_had` and `_handle_had_event_study` to `diff_diff/practitioner.py::_HANDLERS`, routing both `HeterogeneousAdoptionDiDResults` (single-period) and `HeterogeneousAdoptionDiDEventStudyResults` (event-study) through HAD-specific Baker et al. (2025) step guidance: `did_had_pretest_workflow` (step 3 — paper Section 4.2 step-2 closure on the event-study path), an estimand-difference routing nudge to `ContinuousDiD` (step 4 — fires when the user wants per-dose ATT(d) / ACRT(d) curves rather than HAD's WAS estimand and has never-treated controls; framed around estimand difference, NOT around the existence of untreated units, since HAD remains valid with a small never-treated share per REGISTRY § HeterogeneousAdoptionDiD edge cases and explicitly retains never-treated units on the staggered event-study path per paper Appendix B.2 / `had.py:1325`), `results.bandwidth_diagnostics` inspection on continuous designs and simultaneous (sup-t) `cband_*` reading on weighted event-study fits (step 6), per-horizon WAS event-study disaggregation (step 7), and the explicit design-auto-detection / last-cohort-only-WAS framing (step 8). Symmetric pair: `_handle_continuous` gains a Step-4 nudge to `HeterogeneousAdoptionDiD` for ContinuousDiD users on no-untreated panels (this direction is correct because ContinuousDiD's identification requires never-treated controls). Extends `_check_nan_att` with an ndarray branch via lazy `numpy` import for HAD's per-horizon `att` array; uses `np.all(np.isnan(arr))` semantics so partial-NaN arrays (legitimate event-study output under degenerate horizon-specific designs) do not over-fire the warning. Scalar path is bit-exact preserved across all 12 untouched handlers. Adds full HAD section + `HeterogeneousAdoptionDiDResults` / `HeterogeneousAdoptionDiDEventStudyResults` blocks + `## HAD Pretests` index covering all 7 pretest entry points + Choosing-an-Estimator row to `diff_diff/guides/llms-full.txt` (the bundled-in-wheel agent reference); the documented constructor + `fit()` signatures match the real `HeterogeneousAdoptionDiD.__init__` / `.fit` API exactly (verified by `inspect.signature`-based regression tests). Tightens the existing `Continuous treatment intensity` Choosing row to surface ATT(d) vs WAS as the estimand differentiator. `docs/doc-deps.yaml` updated to remove the `llms-full.txt` deferral note on `had.py` and add `llms-full.txt` entries to `had.py`, `had_pretests.py`, and `practitioner.py` blocks. Patch-level (additive on stable surfaces). 26 new tests (16 in `tests/test_practitioner.py::TestHADDispatch` + 9 in `tests/test_guides.py::TestLLMsFullHADCoverage` + 1 fixture-minimality regression locking the "handlers are STRING-ONLY at runtime" stability invariant). Closes the Phase 5 "agent surfaces" gap; T21 pretest tutorial and T22 weighted/survey tutorial remain queued as separate notebook PRs.
 
 ## [3.3.2] - 2026-04-26
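The `_check_nan_att` ndarray semantics the entry describes (fire only when ALL per-horizon ATTs are NaN, so partial-NaN event-study output passes) can be sketched without numpy. `check_nan_att` below is a hypothetical pure-Python stand-in for the `np.all(np.isnan(arr))` branch, not the library function:

```python
import math

def all_nan(values):
    """Pure-Python stand-in for np.all(np.isnan(arr))."""
    return len(values) > 0 and all(math.isnan(v) for v in values)

def check_nan_att(att):
    # Scalar path (the 12 untouched handlers) vs array path (HAD's
    # per-horizon att): warn only when EVERY entry is NaN.
    if isinstance(att, float):
        return math.isnan(att)
    return all_nan(list(att))

print(check_nan_att([float("nan"), 0.8, 1.2]))  # False: partial NaN does not over-fire
print(check_nan_att([float("nan")] * 3))        # True: all horizons NaN
```

Using `all(...)` rather than `any(...)` is the whole point: a single degenerate horizon should not trigger the all-NaN warning.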

diff_diff/guides/llms-full.txt

Lines changed: 19 additions & 14 deletions
@@ -592,17 +592,19 @@ results.print_summary()
 
 ### HeterogeneousAdoptionDiD
 
-Heterogeneous Adoption DiD estimator (de Chaisemartin, Ciccia, D'Haultfœuille & Knau 2026). Targets a Weighted Average Slope (WAS) on **Heterogeneous Adoption Designs where no unit remains untreated** — every unit receives the treatment at some positive dose level, so the comparison structure comes from dose variation across units rather than from an untreated holdout. Treatment varies in intensity, not in status. Uses a bias-corrected local-linear estimator at the dose support boundary on continuous-dose designs (Design 1' / Design 1) and a 2SLS Wald-IV estimator on the mass-point design.
+Heterogeneous Adoption DiD estimator (de Chaisemartin, Ciccia, D'Haultfœuille & Knau 2026). Targets a Weighted Average Slope (WAS) at the dose support boundary on **Heterogeneous Adoption Designs** — designs where treatment varies in dose intensity across units. Comparison comes from dose variation across units. The estimator does NOT require dropping never-treated units: a small share of never-treated units is fully compatible (paper edge case — Garrett et al. 2020 retained 12 untreated counties out of 2,954), and on staggered event-study panels never-treated units are explicitly retained as the untreated-group comparison (paper Appendix B.2). Uses a bias-corrected local-linear estimator at the dose support boundary on continuous-dose designs (Design 1' / Design 1) and a 2SLS Wald-IV estimator on the mass-point design.
 
 ```python
 HeterogeneousAdoptionDiD(
-    design: str = "auto",            # "auto" / "continuous_at_zero" / "continuous_near_d_lower" / "mass_point"
+    design: str = "auto",            # "auto" / "continuous_at_zero" / "continuous_near_d_lower" / "mass_point"
+    d_lower: float | None = None,    # Support infimum; auto-detected when None
+    kernel: str = "epanechnikov",    # Local-linear kernel
     alpha: float = 0.05,
-    n_bootstrap: int = 999,          # Multiplier-bootstrap iterations for sup-t bands
+    vcov_type: str | None = None,    # Mass-point only: "classical" (default) or "hc1"
+    robust: bool = False,            # Mass-point only: HC1 robust SE shorthand
+    cluster: str | None = None,      # Mass-point only: cluster column for CR1 cluster-robust SE
+    n_bootstrap: int = 999,          # Multiplier-bootstrap iterations for sup-t bands (event-study + weighted)
     seed: int | None = None,
-    h: float | None = None,          # Bias-corrected local-linear bandwidth (auto-selected if None)
-    b: float | None = None,          # Pilot bandwidth (auto-selected if None)
-    rcond: float | None = None,
 )
 ```
 
@@ -614,14 +616,17 @@ HeterogeneousAdoptionDiD(
 had.fit(
     data: pd.DataFrame,
     outcome_col: str,
-    unit_col: str,
-    time_col: str,
     dose_col: str,
+    time_col: str,
+    unit_col: str,
     first_treat_col: str | None = None,  # Required on staggered panels (last-cohort auto-filter trigger)
-    aggregate: str | None = None,        # None (single scalar WAS) or "event_study" (per-horizon WAS)
+    aggregate: str = "overall",          # "overall" (single scalar WAS) or "event_study" (per-horizon WAS)
+    survey: SurveyDesign | None = None,  # DEPRECATED alias of survey_design=
+    weights: np.ndarray | None = None,   # DEPRECATED pweight shortcut alias
     cband: bool = True,                  # Simultaneous (sup-t) confidence bands on weighted event-study fits
-    survey_design: SurveyDesign | None = None,  # Survey weights, strata, PSU, FPC
-    weights: np.ndarray | None = None,   # pweight shortcut (mutually exclusive with survey_design)
+    *,
+    survey_design: SurveyDesign | None = None,  # Canonical survey-design kwarg (weights, strata, PSU, FPC)
+    trends_lin: bool = False,            # Eq 17 linear-trend detrending (event-study; mutually exclusive with survey_design)
 ) -> HeterogeneousAdoptionDiDResults | HeterogeneousAdoptionDiDEventStudyResults
 ```
 
@@ -636,7 +641,7 @@ report = did_had_pretest_workflow(
     dose_col='d', first_treat_col='first_treat')
 print(report.summary())
 
-# Single-period scalar WAS:
+# Single-period scalar WAS (aggregate="overall" default):
 est = HeterogeneousAdoptionDiD()
 results = est.fit(data, outcome_col='y', unit_col='unit',
                   time_col='t', dose_col='d',
@@ -1887,8 +1892,8 @@ DIFF_DIFF_BACKEND=rust pytest  # Force Rust (fail if unavailable)
 | Staggered treatment timing | `CallawaySantAnna`, `ImputationDiD`, or `SunAbraham` |
 | Few treated units / synthetic control | `SyntheticDiD` |
 | Interactive fixed effects / factor confounding | `TROP` |
-| Continuous treatment intensity (some units untreated) | `ContinuousDiD` |
-| No untreated unit / universal rollout (every unit treated at different doses) | `HeterogeneousAdoptionDiD` |
+| Continuous treatment intensity, per-dose ATT(d) / ACRT(d) (requires never-treated controls) | `ContinuousDiD` |
+| Continuous treatment intensity, WAS at dose support boundary (compatible with universal rollout or small never-treated share) | `HeterogeneousAdoptionDiD` |
 | Two-criterion treatment, simultaneous (2x2x2 DDD) | `TripleDifference` |
 | Two-criterion treatment, staggered timing + eligibility | `StaggeredTripleDifference` |
 | Nonlinear outcome (binary/count) with staggered timing | `WooldridgeDiD` |
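The two rewritten Choosing rows reduce to a small decision rule on (desired estimand, available controls). `route_continuous_dose` below is a hypothetical illustration of the WAS-vs-ATT(d) differentiator, not library code:

```python
def route_continuous_dose(wants_per_dose_att: bool, has_never_treated: bool) -> str:
    """Hypothetical routing helper mirroring the two Choosing rows.

    ContinuousDiD: per-dose ATT(d)/ACRT(d), identification requires
    never-treated controls. HeterogeneousAdoptionDiD: WAS at the dose
    support boundary, valid under universal rollout or a small
    never-treated share.
    """
    if wants_per_dose_att and has_never_treated:
        return "ContinuousDiD"
    return "HeterogeneousAdoptionDiD"

# Wanting ATT(d) is not enough: without never-treated controls,
# ContinuousDiD's identification fails and HAD remains the estimator.
print(route_continuous_dose(True, False))   # HeterogeneousAdoptionDiD
print(route_continuous_dose(True, True))    # ContinuousDiD
print(route_continuous_dose(False, True))   # HeterogeneousAdoptionDiD
```

This is exactly the conjunction the P1 fix encodes: the Step-4 nudge fires only on `wants_per_dose_att AND has_never_treated`, never on the mere existence of untreated units.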

diff_diff/practitioner.py

Lines changed: 51 additions & 40 deletions
@@ -876,28 +876,31 @@ def _handle_had(results: Any):
         ),
         _step(
             baker_step=4,
-            label="Switch to ContinuousDiD or CallawaySantAnna if untreated units exist",
+            label="Confirm WAS is the target estimand (vs ATT(d) for ContinuousDiD)",
             why=(
-                "HAD targets the no-untreated-unit case where every unit "
-                "is treated at some positive dose. If your panel actually "
-                "contains units with D = 0 (genuinely untreated), HAD's "
-                "WAS divisor under-weights the never-treated subset and a "
-                "different estimator is correct: ContinuousDiD for "
-                "dose-response on data with untreated controls, or "
-                "CallawaySantAnna for binary-staggered timing."
+                "HAD targets WAS (Weighted Average Slope) at the dose "
+                "support boundary. If you specifically want per-dose "
+                "ATT(d) / ACRT(d) dose-response curves AND your panel "
+                "has never-treated controls (units with first_treat == 0), "
+                "ContinuousDiD is the alternative — different estimand, "
+                "and ContinuousDiD's identification requires never-treated "
+                "controls. HAD itself remains valid even with a small "
+                "share of never-treated units (paper compatibility; see "
+                "REGISTRY § HeterogeneousAdoptionDiD edge cases — "
+                "Garrett et al. 2020 retained 12 untreated counties out "
+                "of 2,954). The choice is about estimand, not about "
+                "whether untreated units exist."
             ),
             code=(
-                "# Check for untreated units:\n"
-                "if (data['first_treat'] == 0).any():\n"
-                "    # Untreated units exist - switch to ContinuousDiD:\n"
-                "    from diff_diff import ContinuousDiD\n"
-                "    cdid = ContinuousDiD()\n"
-                "    cdid_results = cdid.fit(\n"
-                "        data, outcome='y', unit='unit', time='t',\n"
-                "        first_treat='first_treat', dose='d')\n"
-                "    # Or CallawaySantAnna for binary-staggered timing:\n"
-                "    # from diff_diff import CallawaySantAnna\n"
-                "    # cs = CallawaySantAnna(control_group='never_treated')"
+                "# HAD reports WAS at the dose support boundary.\n"
+                "# If you instead want per-dose ATT(d)/ACRT(d) dose-response\n"
+                "# curves AND the panel has never-treated controls:\n"
+                "from diff_diff import ContinuousDiD\n"
+                "cdid = ContinuousDiD()\n"
+                "cdid_results = cdid.fit(\n"
+                "    data, outcome='y', unit='unit', time='t',\n"
+                "    first_treat='first_treat', dose='d',\n"
+                "    aggregate='dose')"
             ),
             step_name="estimator_selection",
         ),
@@ -910,13 +913,12 @@ def _handle_had(results: Any):
                 "for the bias-corrected local-linear estimator. Bandwidth "
                 "choice affects WAS - verify the selector landed on a "
                 "viable bandwidth (not boundary-clipped or near-degenerate). "
-                "result.bandwidth_diagnostics is None on the mass_point "
+                "results.bandwidth_diagnostics is None on the mass_point "
                 "design (parametric, no bandwidth)."
             ),
             code=(
                 "# Inspect the auto-selected bandwidths:\n"
-                "result.bandwidth_diagnostics  # None on mass_point\n"
-                "# Re-fit with explicit h= / b= to test sensitivity"
+                "results.bandwidth_diagnostics  # None on mass_point"
             ),
             priority="medium",
             step_name="sensitivity",
@@ -1005,23 +1007,29 @@ def _handle_had_event_study(results: Any):
         ),
         _step(
             baker_step=4,
-            label="Switch to ContinuousDiD or CallawaySantAnna if untreated units exist",
+            label="Confirm WAS is the target estimand (vs ATT(d) for ContinuousDiD)",
             why=(
-                "HAD targets the no-untreated-unit case. If your panel "
-                "contains units with D = 0, switch to "
-                "ContinuousDiD(aggregate='eventstudy') for dose-response "
-                "event study with untreated controls, or CallawaySantAnna "
-                "with aggregate='event_study' for binary-staggered timing."
+                "HAD targets per-event-time WAS at the dose support "
+                "boundary. If you instead want per-dose ATT(d) / ACRT(d) "
+                "dose-response curves AND your panel has never-treated "
+                "controls, ContinuousDiD(aggregate='eventstudy') is the "
+                "alternative — different estimand, requires never-treated. "
+                "HAD itself remains valid even with a small share of "
+                "never-treated units (paper compatibility); on staggered "
+                "panels HAD's last-cohort filter explicitly RETAINS "
+                "never-treated units as the untreated-group comparison "
+                "(paper Appendix B.2). The choice is about estimand."
             ),
             code=(
-                "# Check for untreated units:\n"
-                "if (data['first_treat'] == 0).any():\n"
-                "    from diff_diff import ContinuousDiD\n"
-                "    cdid = ContinuousDiD()\n"
-                "    es = cdid.fit(\n"
-                "        data, outcome='y', unit='unit', time='t',\n"
-                "        first_treat='first_treat', dose='d',\n"
-                "        aggregate='eventstudy')"
+                "# HAD reports per-event-time WAS at the dose boundary.\n"
+                "# If you instead want per-dose ATT(d)/ACRT(d) event-study\n"
+                "# curves AND the panel has never-treated controls:\n"
+                "from diff_diff import ContinuousDiD\n"
+                "cdid = ContinuousDiD()\n"
+                "cdid_es = cdid.fit(\n"
+                "    data, outcome='y', unit='unit', time='t',\n"
+                "    first_treat='first_treat', dose='d',\n"
+                "    aggregate='eventstudy')"
             ),
             step_name="estimator_selection",
         ),
@@ -1033,18 +1041,21 @@ def _handle_had_event_study(results: Any):
                 "as a joint pattern. On weighted fits (survey_design= or "
                 "weights=), fit(cband=True) constructs simultaneous (sup-t) "
                 "bands across horizons via multiplier bootstrap. "
-                "result.cband_low / cband_high give the band endpoints; "
-                "cband_crit_value reports the sup-t critical value used."
+                "results.cband_low / results.cband_high give the band "
+                "endpoints; results.cband_crit_value reports the sup-t "
+                "critical value used."
             ),
             code=(
-                "from diff_diff import HeterogeneousAdoptionDiD\n"
+                "from diff_diff import HeterogeneousAdoptionDiD, SurveyDesign\n"
+                "# Construct your survey design (adapt to your data):\n"
+                "sd = SurveyDesign(weights='weight_col')\n"
                 "est = HeterogeneousAdoptionDiD(n_bootstrap=999, seed=42)\n"
                 "es = est.fit(\n"
                 "    data, outcome_col='y', unit_col='unit',\n"
                 "    time_col='t', dose_col='d',\n"
                 "    first_treat_col='first_treat',\n"
                 "    aggregate='event_study',\n"
-                "    survey_design=design, cband=True)\n"
+                "    survey_design=sd, cband=True)\n"
                 "es.cband_low, es.cband_high  # simultaneous band endpoints"
             ),
             priority="medium",
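The sup-t reading in the step-6 guidance rests on a standard construction: the critical value is the (1 - alpha) quantile of the bootstrap distribution of max over horizons of |t_h|, so the bands cover all horizons jointly. A minimal pure-Python sketch of that mechanism (assumed mechanics for illustration, not `had.py`'s multiplier-bootstrap implementation):

```python
import random

def supt_critical_value(t_draws, alpha=0.05):
    """(1 - alpha) quantile of max_h |t_h| across bootstrap draws.
    Bands att_h +/- c * se_h with this c cover every horizon at once."""
    sups = sorted(max(abs(t) for t in draw) for draw in t_draws)
    idx = min(len(sups) - 1, int((1 - alpha) * len(sups)))
    return sups[idx]

rng = random.Random(42)
# 999 bootstrap draws of t-statistics over 5 event-study horizons:
draws = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(999)]
crit = supt_critical_value(draws)
# The sup-t critical value exceeds the pointwise 1.96 normal quantile,
# which is what makes the bands simultaneous rather than pointwise.
assert crit > 1.96
```

This is why `cband_low` / `cband_high` are wider than the per-horizon confidence intervals: the same point estimates are scaled by the joint critical value instead of the pointwise one.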
