Commit 6d6e950

igerber and claude committed
Address PR #402 R1 review (1 P1, 4 P2)
P1 methodology fix: Step-4 routing in _handle_had + _handle_had_event_study no longer says "switch away from HAD if untreated units exist" - that contradicts REGISTRY § HeterogeneousAdoptionDiD edge cases (line 2403: "Authors do NOT require untreated units to be dropped"; line 2408 + had.py:1325: never-treated units RETAINED on staggered event-study). Reframed as the actual estimand differentiator: HAD targets WAS at the dose support boundary; ContinuousDiD targets per-dose ATT(d) / ACRT(d) and requires never-treated controls. Routing fires only when the user wants the ATT(d) estimand AND has never-treated controls, not on "untreated units exist". Tightens the corresponding Choosing-an-Estimator table row to surface WAS vs ATT(d) as the differentiator.

P2 (a) signatures: llms-full.txt HAD constructor + fit() blocks now match the actual HeterogeneousAdoptionDiD.__init__ / .fit signatures exactly. Drops invented kwargs (h, b, rcond) and adds the real ones (d_lower, kernel, vcov_type, robust, cluster). aggregate default corrected from None to "overall". fit() now lists survey, weights, cband (positional-or-keyword) and survey_design + trends_lin (keyword-only).

P2 (b) snippet bugs: result.bandwidth_diagnostics -> results.bandwidth_diagnostics (matching the plural convention of other handlers); sup-t snippet now imports SurveyDesign and constructs sd before passing survey_design=sd (was survey_design=design with no design defined).

P2 (c) tests: New TestLLMsFullHADCoverage tests use inspect.signature(HeterogeneousAdoptionDiD.__init__) and .fit() to regress the documented signatures against the real API. New test_llms_full_had_section_methodology_compatible_with_untreated locks the negative assertion that the section does NOT carry framing contradicting the registry. Practitioner tests gain test_had_step_4_does_not_misframe_untreated_unit_routing + test_had_handler_snippets_are_valid_python_syntax (catches snippet syntax errors via ast.parse) + test_handle_continuous_step_4_snippet_is_valid_python.

83 tests pass (47 in test_practitioner including 5 new + 36 in test_guides including 9 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
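The P2 (c) test mechanics (signature regression via inspect.signature, snippet syntax gating via ast.parse) can be sketched against a stand-in. The `fit` stub below is a hypothetical mirror of the documented shape, not the real `HeterogeneousAdoptionDiD.fit`:

```python
import ast
import inspect

# Hypothetical stand-in mirroring the documented fit() shape:
# survey / weights / cband positional-or-keyword, survey_design /
# trends_lin keyword-only behind the bare *.
def fit(data, outcome_col, dose_col, time_col, unit_col,
        first_treat_col=None, aggregate="overall",
        survey=None, weights=None, cband=True,
        *, survey_design=None, trends_lin=False):
    pass

sig = inspect.signature(fit)
# Regress documented defaults and parameter kinds against the signature:
assert sig.parameters["aggregate"].default == "overall"
assert sig.parameters["cband"].kind is inspect.Parameter.POSITIONAL_OR_KEYWORD
assert sig.parameters["survey_design"].kind is inspect.Parameter.KEYWORD_ONLY
assert sig.parameters["trends_lin"].kind is inspect.Parameter.KEYWORD_ONLY

# Snippet syntax gate in the spirit of the ast.parse tests: every
# documented snippet must at least parse as valid Python.
snippet = "results.bandwidth_diagnostics  # None on mass_point"
ast.parse(snippet)  # raises SyntaxError on a malformed snippet
```

The same pattern scales to the real API: swap the stub for `HeterogeneousAdoptionDiD.fit` and assert against the strings extracted from `llms-full.txt`.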
1 parent d152b50 commit 6d6e950

5 files changed

Lines changed: 237 additions & 72 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
-- **HAD `practitioner_next_steps()` handler + `llms-full.txt` reference section** (Phase 5). Adds `_handle_had` and `_handle_had_event_study` to `diff_diff/practitioner.py::_HANDLERS`, routing both `HeterogeneousAdoptionDiDResults` (single-period) and `HeterogeneousAdoptionDiDEventStudyResults` (event-study) through HAD-specific Baker et al. (2025) step guidance: `did_had_pretest_workflow` (step 3 — paper Section 4.2 step-2 closure on the event-study path), `ContinuousDiD` / `CallawaySantAnna` routing nudge (step 4 — fires on the wrong-estimator-for-this-data path), `bandwidth_diagnostics` inspection on continuous designs and simultaneous (sup-t) `cband_*` reading on weighted event-study fits (step 6), per-horizon WAS event-study disaggregation (step 7), and the explicit design-auto-detection / last-cohort-only-WAS framing (step 8). Symmetric pair: `_handle_continuous` gains a Step-4 nudge to `HeterogeneousAdoptionDiD` for ContinuousDiD users on no-untreated panels — the routing loop is now bidirectional. Extends `_check_nan_att` with an ndarray branch via lazy `numpy` import for HAD's per-horizon `att` array; uses `np.all(np.isnan(arr))` semantics so partial-NaN arrays (legitimate event-study output under degenerate horizon-specific designs) do not over-fire the warning. Scalar path is bit-exact preserved across all 12 untouched handlers. Adds full HAD section + `HeterogeneousAdoptionDiDResults` / `HeterogeneousAdoptionDiDEventStudyResults` blocks + `## HAD Pretests` index covering all 7 pretest entry points + Choosing-an-Estimator row to `diff_diff/guides/llms-full.txt` (the bundled-in-wheel agent reference). Tightens the existing `Continuous treatment intensity` Choosing row to `(some units untreated)` so the contrast with the new HAD row is explicit. Framing convention follows the "no untreated unit" / dose variation language; locked by negative-assertion tests on both the handler text and the `llms-full.txt` HAD section. `docs/doc-deps.yaml` updated to remove the `llms-full.txt` deferral note on `had.py` and add `llms-full.txt` entries to `had.py`, `had_pretests.py`, and `practitioner.py` blocks. Patch-level (additive on stable surfaces). 21 new tests (14 in `tests/test_practitioner.py::TestHADDispatch` + 6 in `tests/test_guides.py::TestLLMsFullHADCoverage` + 1 fixture-minimality regression locking the "handlers are STRING-ONLY at runtime" stability invariant). Closes the Phase 5 "agent surfaces" gap; T21 pretest tutorial and T22 weighted/survey tutorial remain queued as separate notebook PRs.
+- **HAD `practitioner_next_steps()` handler + `llms-full.txt` reference section** (Phase 5). Adds `_handle_had` and `_handle_had_event_study` to `diff_diff/practitioner.py::_HANDLERS`, routing both `HeterogeneousAdoptionDiDResults` (single-period) and `HeterogeneousAdoptionDiDEventStudyResults` (event-study) through HAD-specific Baker et al. (2025) step guidance: `did_had_pretest_workflow` (step 3 — paper Section 4.2 step-2 closure on the event-study path), an estimand-difference routing nudge to `ContinuousDiD` (step 4 — fires when the user wants per-dose ATT(d) / ACRT(d) curves rather than HAD's WAS estimand and has never-treated controls; framed around estimand difference, NOT around the existence of untreated units, since HAD remains valid with a small never-treated share per REGISTRY § HeterogeneousAdoptionDiD edge cases and explicitly retains never-treated units on the staggered event-study path per paper Appendix B.2 / `had.py:1325`), `results.bandwidth_diagnostics` inspection on continuous designs and simultaneous (sup-t) `cband_*` reading on weighted event-study fits (step 6), per-horizon WAS event-study disaggregation (step 7), and the explicit design-auto-detection / last-cohort-only-WAS framing (step 8). Symmetric pair: `_handle_continuous` gains a Step-4 nudge to `HeterogeneousAdoptionDiD` for ContinuousDiD users on no-untreated panels (this direction is correct because ContinuousDiD's identification requires never-treated controls). Extends `_check_nan_att` with an ndarray branch via lazy `numpy` import for HAD's per-horizon `att` array; uses `np.all(np.isnan(arr))` semantics so partial-NaN arrays (legitimate event-study output under degenerate horizon-specific designs) do not over-fire the warning. Scalar path is bit-exact preserved across all 12 untouched handlers. Adds full HAD section + `HeterogeneousAdoptionDiDResults` / `HeterogeneousAdoptionDiDEventStudyResults` blocks + `## HAD Pretests` index covering all 7 pretest entry points + Choosing-an-Estimator row to `diff_diff/guides/llms-full.txt` (the bundled-in-wheel agent reference); the documented constructor + `fit()` signatures match the real `HeterogeneousAdoptionDiD.__init__` / `.fit` API exactly (verified by `inspect.signature`-based regression tests). Tightens the existing `Continuous treatment intensity` Choosing row to surface ATT(d) vs WAS as the estimand differentiator. `docs/doc-deps.yaml` updated to remove the `llms-full.txt` deferral note on `had.py` and add `llms-full.txt` entries to `had.py`, `had_pretests.py`, and `practitioner.py` blocks. Patch-level (additive on stable surfaces). 26 new tests (16 in `tests/test_practitioner.py::TestHADDispatch` + 9 in `tests/test_guides.py::TestLLMsFullHADCoverage` + 1 fixture-minimality regression locking the "handlers are STRING-ONLY at runtime" stability invariant). Closes the Phase 5 "agent surfaces" gap; T21 pretest tutorial and T22 weighted/survey tutorial remain queued as separate notebook PRs.
 
 ## [3.3.2] - 2026-04-26
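The `_check_nan_att` ndarray semantics the entry describes (fire only when ALL per-horizon ATTs are NaN, so partial-NaN event-study output passes) can be sketched without numpy. `check_nan_att` below is a hypothetical pure-Python stand-in for the `np.all(np.isnan(arr))` branch, not the library function:

```python
import math

def all_nan(values):
    """Pure-Python stand-in for np.all(np.isnan(arr))."""
    return len(values) > 0 and all(math.isnan(v) for v in values)

def check_nan_att(att):
    # Scalar path (the 12 untouched handlers) vs array path (HAD's
    # per-horizon att): warn only when EVERY entry is NaN.
    if isinstance(att, float):
        return math.isnan(att)
    return all_nan(list(att))

print(check_nan_att([float("nan"), 0.8, 1.2]))  # False: partial NaN does not over-fire
print(check_nan_att([float("nan")] * 3))        # True: all horizons NaN
```

Using `all(...)` rather than `any(...)` is the whole point: a single degenerate horizon should not trigger the all-NaN warning.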

diff_diff/guides/llms-full.txt

Lines changed: 19 additions & 14 deletions
@@ -592,17 +592,19 @@ results.print_summary()
 
 ### HeterogeneousAdoptionDiD
 
-Heterogeneous Adoption DiD estimator (de Chaisemartin, Ciccia, D'Haultfœuille & Knau 2026). Targets a Weighted Average Slope (WAS) on **Heterogeneous Adoption Designs where no unit remains untreated** — every unit receives the treatment at some positive dose level, so the comparison structure comes from dose variation across units rather than from an untreated holdout. Treatment varies in intensity, not in status. Uses a bias-corrected local-linear estimator at the dose support boundary on continuous-dose designs (Design 1' / Design 1) and a 2SLS Wald-IV estimator on the mass-point design.
+Heterogeneous Adoption DiD estimator (de Chaisemartin, Ciccia, D'Haultfœuille & Knau 2026). Targets a Weighted Average Slope (WAS) at the dose support boundary on **Heterogeneous Adoption Designs** — designs where treatment varies in dose intensity across units. Comparison comes from dose variation across units. The estimator does NOT require dropping never-treated units: a small share of never-treated units is fully compatible (paper edge case — Garrett et al. 2020 retained 12 untreated counties out of 2,954), and on staggered event-study panels never-treated units are explicitly retained as the untreated-group comparison (paper Appendix B.2). Uses a bias-corrected local-linear estimator at the dose support boundary on continuous-dose designs (Design 1' / Design 1) and a 2SLS Wald-IV estimator on the mass-point design.
 
 ```python
 HeterogeneousAdoptionDiD(
-    design: str = "auto",            # "auto" / "continuous_at_zero" / "continuous_near_d_lower" / "mass_point"
+    design: str = "auto",            # "auto" / "continuous_at_zero" / "continuous_near_d_lower" / "mass_point"
+    d_lower: float | None = None,    # Support infimum; auto-detected when None
+    kernel: str = "epanechnikov",    # Local-linear kernel
     alpha: float = 0.05,
-    n_bootstrap: int = 999,          # Multiplier-bootstrap iterations for sup-t bands
+    vcov_type: str | None = None,    # Mass-point only: "classical" (default) or "hc1"
+    robust: bool = False,            # Mass-point only: HC1 robust SE shorthand
+    cluster: str | None = None,      # Mass-point only: cluster column for CR1 cluster-robust SE
+    n_bootstrap: int = 999,          # Multiplier-bootstrap iterations for sup-t bands (event-study + weighted)
     seed: int | None = None,
-    h: float | None = None,          # Bias-corrected local-linear bandwidth (auto-selected if None)
-    b: float | None = None,          # Pilot bandwidth (auto-selected if None)
-    rcond: float | None = None,
 )
 ```
 
@@ -614,14 +616,17 @@ HeterogeneousAdoptionDiD(
 had.fit(
     data: pd.DataFrame,
     outcome_col: str,
-    unit_col: str,
-    time_col: str,
     dose_col: str,
+    time_col: str,
+    unit_col: str,
     first_treat_col: str | None = None,  # Required on staggered panels (last-cohort auto-filter trigger)
-    aggregate: str | None = None,        # None (single scalar WAS) or "event_study" (per-horizon WAS)
+    aggregate: str = "overall",          # "overall" (single scalar WAS) or "event_study" (per-horizon WAS)
+    survey: SurveyDesign | None = None,  # DEPRECATED alias of survey_design=
+    weights: np.ndarray | None = None,   # DEPRECATED pweight shortcut alias
     cband: bool = True,                  # Simultaneous (sup-t) confidence bands on weighted event-study fits
-    survey_design: SurveyDesign | None = None,  # Survey weights, strata, PSU, FPC
-    weights: np.ndarray | None = None,   # pweight shortcut (mutually exclusive with survey_design)
+    *,
+    survey_design: SurveyDesign | None = None,  # Canonical survey-design kwarg (weights, strata, PSU, FPC)
+    trends_lin: bool = False,            # Eq 17 linear-trend detrending (event-study; mutually exclusive with survey_design)
 ) -> HeterogeneousAdoptionDiDResults | HeterogeneousAdoptionDiDEventStudyResults
 ```
 
@@ -636,7 +641,7 @@ report = did_had_pretest_workflow(
     dose_col='d', first_treat_col='first_treat')
 print(report.summary())
 
-# Single-period scalar WAS:
+# Single-period scalar WAS (aggregate="overall" default):
 est = HeterogeneousAdoptionDiD()
 results = est.fit(data, outcome_col='y', unit_col='unit',
                   time_col='t', dose_col='d',
@@ -1887,8 +1892,8 @@ DIFF_DIFF_BACKEND=rust pytest  # Force Rust (fail if unavailable)
 | Staggered treatment timing | `CallawaySantAnna`, `ImputationDiD`, or `SunAbraham` |
 | Few treated units / synthetic control | `SyntheticDiD` |
 | Interactive fixed effects / factor confounding | `TROP` |
-| Continuous treatment intensity (some units untreated) | `ContinuousDiD` |
-| No untreated unit / universal rollout (every unit treated at different doses) | `HeterogeneousAdoptionDiD` |
+| Continuous treatment intensity, per-dose ATT(d) / ACRT(d) (requires never-treated controls) | `ContinuousDiD` |
+| Continuous treatment intensity, WAS at dose support boundary (compatible with universal rollout or small never-treated share) | `HeterogeneousAdoptionDiD` |
 | Two-criterion treatment, simultaneous (2x2x2 DDD) | `TripleDifference` |
 | Two-criterion treatment, staggered timing + eligibility | `StaggeredTripleDifference` |
 | Nonlinear outcome (binary/count) with staggered timing | `WooldridgeDiD` |
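The two rewritten Choosing rows reduce to a small decision rule on (desired estimand, available controls). `route_continuous_dose` below is a hypothetical illustration of the WAS-vs-ATT(d) differentiator, not library code:

```python
def route_continuous_dose(wants_per_dose_att: bool, has_never_treated: bool) -> str:
    """Hypothetical routing helper mirroring the two Choosing rows.

    ContinuousDiD: per-dose ATT(d)/ACRT(d), identification requires
    never-treated controls. HeterogeneousAdoptionDiD: WAS at the dose
    support boundary, valid under universal rollout or a small
    never-treated share.
    """
    if wants_per_dose_att and has_never_treated:
        return "ContinuousDiD"
    return "HeterogeneousAdoptionDiD"

# Wanting ATT(d) is not enough: without never-treated controls,
# ContinuousDiD's identification fails and HAD remains the estimator.
print(route_continuous_dose(True, False))   # HeterogeneousAdoptionDiD
print(route_continuous_dose(True, True))    # ContinuousDiD
print(route_continuous_dose(False, True))   # HeterogeneousAdoptionDiD
```

This is exactly the conjunction the P1 fix encodes: the Step-4 nudge fires only on `wants_per_dose_att AND has_never_treated`, never on the mere existence of untreated units.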

diff_diff/practitioner.py

Lines changed: 51 additions & 40 deletions
@@ -876,28 +876,31 @@ def _handle_had(results: Any):
         ),
         _step(
             baker_step=4,
-            label="Switch to ContinuousDiD or CallawaySantAnna if untreated units exist",
+            label="Confirm WAS is the target estimand (vs ATT(d) for ContinuousDiD)",
             why=(
-                "HAD targets the no-untreated-unit case where every unit "
-                "is treated at some positive dose. If your panel actually "
-                "contains units with D = 0 (genuinely untreated), HAD's "
-                "WAS divisor under-weights the never-treated subset and a "
-                "different estimator is correct: ContinuousDiD for "
-                "dose-response on data with untreated controls, or "
-                "CallawaySantAnna for binary-staggered timing."
+                "HAD targets WAS (Weighted Average Slope) at the dose "
+                "support boundary. If you specifically want per-dose "
+                "ATT(d) / ACRT(d) dose-response curves AND your panel "
+                "has never-treated controls (units with first_treat == 0), "
+                "ContinuousDiD is the alternative — different estimand, "
+                "and ContinuousDiD's identification requires never-treated "
+                "controls. HAD itself remains valid even with a small "
+                "share of never-treated units (paper compatibility; see "
+                "REGISTRY § HeterogeneousAdoptionDiD edge cases — "
+                "Garrett et al. 2020 retained 12 untreated counties out "
+                "of 2,954). The choice is about estimand, not about "
+                "whether untreated units exist."
             ),
             code=(
-                "# Check for untreated units:\n"
-                "if (data['first_treat'] == 0).any():\n"
-                "    # Untreated units exist - switch to ContinuousDiD:\n"
-                "    from diff_diff import ContinuousDiD\n"
-                "    cdid = ContinuousDiD()\n"
-                "    cdid_results = cdid.fit(\n"
-                "        data, outcome='y', unit='unit', time='t',\n"
-                "        first_treat='first_treat', dose='d')\n"
-                "    # Or CallawaySantAnna for binary-staggered timing:\n"
-                "    # from diff_diff import CallawaySantAnna\n"
-                "    # cs = CallawaySantAnna(control_group='never_treated')"
+                "# HAD reports WAS at the dose support boundary.\n"
+                "# If you instead want per-dose ATT(d)/ACRT(d) dose-response\n"
+                "# curves AND the panel has never-treated controls:\n"
+                "from diff_diff import ContinuousDiD\n"
+                "cdid = ContinuousDiD()\n"
+                "cdid_results = cdid.fit(\n"
+                "    data, outcome='y', unit='unit', time='t',\n"
+                "    first_treat='first_treat', dose='d',\n"
+                "    aggregate='dose')"
             ),
             step_name="estimator_selection",
         ),
@@ -910,13 +913,12 @@ def _handle_had(results: Any):
                 "for the bias-corrected local-linear estimator. Bandwidth "
                 "choice affects WAS - verify the selector landed on a "
                 "viable bandwidth (not boundary-clipped or near-degenerate). "
-                "result.bandwidth_diagnostics is None on the mass_point "
+                "results.bandwidth_diagnostics is None on the mass_point "
                 "design (parametric, no bandwidth)."
             ),
             code=(
                 "# Inspect the auto-selected bandwidths:\n"
-                "result.bandwidth_diagnostics  # None on mass_point\n"
-                "# Re-fit with explicit h= / b= to test sensitivity"
+                "results.bandwidth_diagnostics  # None on mass_point"
             ),
             priority="medium",
             step_name="sensitivity",
@@ -1005,23 +1007,29 @@ def _handle_had_event_study(results: Any):
         ),
         _step(
             baker_step=4,
-            label="Switch to ContinuousDiD or CallawaySantAnna if untreated units exist",
+            label="Confirm WAS is the target estimand (vs ATT(d) for ContinuousDiD)",
             why=(
-                "HAD targets the no-untreated-unit case. If your panel "
-                "contains units with D = 0, switch to "
-                "ContinuousDiD(aggregate='eventstudy') for dose-response "
-                "event study with untreated controls, or CallawaySantAnna "
-                "with aggregate='event_study' for binary-staggered timing."
+                "HAD targets per-event-time WAS at the dose support "
+                "boundary. If you instead want per-dose ATT(d) / ACRT(d) "
+                "dose-response curves AND your panel has never-treated "
+                "controls, ContinuousDiD(aggregate='eventstudy') is the "
+                "alternative — different estimand, requires never-treated. "
+                "HAD itself remains valid even with a small share of "
+                "never-treated units (paper compatibility); on staggered "
+                "panels HAD's last-cohort filter explicitly RETAINS "
+                "never-treated units as the untreated-group comparison "
+                "(paper Appendix B.2). The choice is about estimand."
             ),
             code=(
-                "# Check for untreated units:\n"
-                "if (data['first_treat'] == 0).any():\n"
-                "    from diff_diff import ContinuousDiD\n"
-                "    cdid = ContinuousDiD()\n"
-                "    es = cdid.fit(\n"
-                "        data, outcome='y', unit='unit', time='t',\n"
-                "        first_treat='first_treat', dose='d',\n"
-                "        aggregate='eventstudy')"
+                "# HAD reports per-event-time WAS at the dose boundary.\n"
+                "# If you instead want per-dose ATT(d)/ACRT(d) event-study\n"
+                "# curves AND the panel has never-treated controls:\n"
+                "from diff_diff import ContinuousDiD\n"
+                "cdid = ContinuousDiD()\n"
+                "cdid_es = cdid.fit(\n"
+                "    data, outcome='y', unit='unit', time='t',\n"
+                "    first_treat='first_treat', dose='d',\n"
+                "    aggregate='eventstudy')"
             ),
             step_name="estimator_selection",
         ),
@@ -1033,18 +1041,21 @@ def _handle_had_event_study(results: Any):
                 "as a joint pattern. On weighted fits (survey_design= or "
                 "weights=), fit(cband=True) constructs simultaneous (sup-t) "
                 "bands across horizons via multiplier bootstrap. "
-                "result.cband_low / cband_high give the band endpoints; "
-                "cband_crit_value reports the sup-t critical value used."
+                "results.cband_low / results.cband_high give the band "
+                "endpoints; results.cband_crit_value reports the sup-t "
+                "critical value used."
             ),
             code=(
-                "from diff_diff import HeterogeneousAdoptionDiD\n"
+                "from diff_diff import HeterogeneousAdoptionDiD, SurveyDesign\n"
+                "# Construct your survey design (adapt to your data):\n"
+                "sd = SurveyDesign(weights='weight_col')\n"
                 "est = HeterogeneousAdoptionDiD(n_bootstrap=999, seed=42)\n"
                 "es = est.fit(\n"
                 "    data, outcome_col='y', unit_col='unit',\n"
                 "    time_col='t', dose_col='d',\n"
                 "    first_treat_col='first_treat',\n"
                 "    aggregate='event_study',\n"
-                "    survey_design=design, cband=True)\n"
+                "    survey_design=sd, cband=True)\n"
                 "es.cband_low, es.cband_high  # simultaneous band endpoints"
             ),
             priority="medium",
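The sup-t reading in the step-6 guidance rests on a standard construction: the critical value is the (1 - alpha) quantile of the bootstrap distribution of max over horizons of |t_h|, so the bands cover all horizons jointly. A minimal pure-Python sketch of that mechanism (assumed mechanics for illustration, not `had.py`'s multiplier-bootstrap implementation):

```python
import random

def supt_critical_value(t_draws, alpha=0.05):
    """(1 - alpha) quantile of max_h |t_h| across bootstrap draws.
    Bands att_h +/- c * se_h with this c cover every horizon at once."""
    sups = sorted(max(abs(t) for t in draw) for draw in t_draws)
    idx = min(len(sups) - 1, int((1 - alpha) * len(sups)))
    return sups[idx]

rng = random.Random(42)
# 999 bootstrap draws of t-statistics over 5 event-study horizons:
draws = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(999)]
crit = supt_critical_value(draws)
# The sup-t critical value exceeds the pointwise 1.96 normal quantile,
# which is what makes the bands simultaneous rather than pointwise.
assert crit > 1.96
```

This is why `cband_low` / `cband_high` are wider than the per-horizon confidence intervals: the same point estimates are scaled by the joint critical value instead of the pointwise one.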
