igerber
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 6 additions & 0 deletions
diff --git a/TODO.md b/TODO.md
Lines changed: 1 addition & 0 deletions
diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py
Lines changed: 2 additions & 0 deletions
diff --git a/diff_diff/had.py b/diff_diff/had.py
Lines changed: 56 additions & 29 deletions
@@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Changed
+- **HAD survey-design API consolidated to single `survey_design=` kwarg** across all 8 HAD surfaces: `HeterogeneousAdoptionDiD.fit`, `did_had_pretest_workflow`, `qug_test`, `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`. Matches the rest of the library (`ContinuousDiD`, `EfficientDiD`, `ChaisemartinDHaultfoeuille` already used `survey_design=`). On data-in surfaces (HAD.fit, workflow, joint data-in wrappers) `survey_design=` accepts a `SurveyDesign` instance (column references resolved against `data` at fit time, same convention as the rest of the library). On array-in surfaces (`stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `qug_test`) `survey_design=` accepts a pre-resolved `ResolvedSurveyDesign`; passing a `SurveyDesign` raises `TypeError` with migration guidance (no `data` to resolve column names against). New public helper `make_pweight_design(weights: np.ndarray) -> ResolvedSurveyDesign` exported from the `diff_diff` top level for the pweight-only convenience on array-in helpers (formerly the private `survey._make_trivial_resolved`, kept as a permanent private alias). Three-way mutex (`survey_design + survey + weights`) extends the prior 2-way (`survey + weights`) — at most one may be non-None per call; two distinct error messages per surface group point users to the right migration target. Patch-level addition (additive new kwarg + permanent alias for the helper; no breaking changes this release).
+
+### Deprecated
+- **`HeterogeneousAdoptionDiD.fit(survey=, weights=)`, `did_had_pretest_workflow(survey=, weights=)`, and the 6 HAD pretest helpers' `survey=` / `weights=` kwargs are deprecated** in favor of the canonical `survey_design=`. Emits `DeprecationWarning` with migration guidance; the deprecated kwargs continue to route through the unchanged legacy back-end paths so numerical results are identical to pre-PR (bit-exact regression locked by parity tests in `tests/test_had_dual_knob_deprecation.py`). Both `survey=` and `weights=` will be removed in the next minor release.
+
 ### Added
 - **HAD linearity-family pretests under survey (Phase 4.5 C).** `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`, and `did_had_pretest_workflow` now accept `weights=` / `survey=` keyword-only kwargs. Stute family uses **PSU-level Mammen multiplier bootstrap** via `bootstrap_utils.generate_survey_multiplier_weights_batch` (the same kernel as PR #363's HAD event-study sup-t bootstrap): each replicate draws an `(n_bootstrap, n_psu)` Mammen multiplier matrix, broadcast to per-obs perturbation `eta_obs[g] = eta_psu[psu(g)]`, weighted OLS refit, weighted CvM via new `_cvm_statistic_weighted` helper. Joint Stute SHARES the multiplier matrix across horizons within each replicate, preserving both the vector-valued empirical-process unit-level dependence AND PSU clustering. Yatchew uses **closed-form weighted OLS + pweight-sandwich variance components** (no bootstrap): `sigma2_lin = sum(w·eps²)/sum(w)`, `sigma2_diff = sum(w_avg·diff²)/(2·sum(w))` with arithmetic-mean pair weights `w_avg_g = (w_g+w_{g-1})/2`, `sigma4_W = sum(w_avg·prod)/sum(w_avg)`, `T_hr = sqrt(sum(w))·(sigma2_lin-sigma2_diff)/sigma2_W`. All three Yatchew components reduce bit-exactly to the unweighted formulas at `w=ones(G)` (locked at `atol=1e-14` by direct helper test). The pweight `weights=` shortcut routes through a synthetic trivial `ResolvedSurveyDesign` (new `survey._make_trivial_resolved` helper) so the same kernel handles both entry paths. `did_had_pretest_workflow(..., survey=, weights=)` removes the Phase 4.5 C0 `NotImplementedError`, dispatches to the survey-aware sub-tests, **skips the QUG step with `UserWarning`** (per C0 deferral), sets `qug=None` on the report, and appends a `"linearity-conditional verdict; QUG-under-survey deferred per Phase 4.5 C0"` suffix to the verdict. `HADPretestReport.qug` retyped from `QUGTestResults` to `Optional[QUGTestResults]`; `summary()` / `to_dict()` / `to_dataframe()` updated to None-tolerant rendering. 
Replicate-weight survey designs (BRR/Fay/JK1/JKn/SDR) raise `NotImplementedError` at every entry point (defense in depth, reciprocal-guard discipline) — parallel follow-up after this PR. **Stratified designs (`SurveyDesign(strata=...)`) also raise `NotImplementedError` on the Stute family** — the within-stratum demean + `sqrt(n_h/(n_h-1))` correction that the HAD sup-t bootstrap applies to match the Binder-TSL stratified target has not been derived for the Stute CvM functional, so applying raw multipliers from `generate_survey_multiplier_weights_batch` directly to residual perturbations would leave the bootstrap p-value silently miscalibrated. Phase 4.5 C narrows survey support to **pweight-only**, **PSU-only** (`SurveyDesign(weights=, psu=)`), and **FPC-only** (`SurveyDesign(weights=, fpc=)`) designs; stratified is a follow-up after the matching Stute-CvM stratified-correction derivation lands. Strictly positive weights required on Yatchew (the adjacent-difference variance is undefined under contiguous-zero blocks). Per-row `weights=` / `survey=col` aggregated to per-unit via existing HAD helpers `_aggregate_unit_weights` / `_aggregate_unit_resolved_survey` (constant-within-unit invariant enforced). Unweighted code paths preserved bit-exactly. Patch-level addition (additive on stable surfaces). See `docs/methodology/REGISTRY.md` § "QUG Null Test" — Note (Phase 4.5 C) for the full methodology.
 - **`ChaisemartinDHaultfoeuille.by_path` + `placebo=True`** — per-path backward-horizon placebos `DID^{pl}_{path, l}` for `l = 1..L_max`. The same per-path SE convention used for the event-study (joiners/leavers IF precedent: switcher-side contributions zeroed for non-path groups; cohort structure and control pool unchanged; plug-in SE with path-specific divisor `N^{pl}_{l, path}`) is applied to backward horizons via the new `switcher_subset_mask` parameter on `_compute_per_group_if_placebo_horizon`. Surfaced on `results.path_placebo_event_study[path][-l]` (negative-int inner keys mirroring `placebo_event_study`); `summary()` renders the rows alongside per-path event-study horizons; `to_dataframe(level="by_path")` emits negative-horizon rows alongside the existing positive-horizon rows. **Bootstrap** (when `n_bootstrap > 0`) propagates per-`(path, lag)` percentile CI / p-value through the same `_bootstrap_one_target` dispatch as the per-path event-study, with the canonical NaN-on-invalid contract enforced on the new surface (PR #364 library-wide invariant). **SE inherits the cross-path cohort-sharing deviation from R** documented for `path_effects` (full-panel cohort-centered plug-in vs R's per-path re-run): tracks R within tolerance on single-path-cohort panels, diverges materially on cohort-mixed panels — the bootstrap SE is a Monte Carlo analog of the analytical SE and inherits the same deviation. R-parity confirmed at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathPlacebo` on the new `multi_path_reversible_by_path_placebo` scenario (point estimates exact match; SE within Phase-2 envelope rtol ≤ 5%); positive analytical + bootstrap invariants at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPlacebo` (and the gated `::TestBootstrap` subclass). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path ...)` → "Per-path placebos" for the full contract.
 
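The three-way mutex and soft-deprecation routing described in the "Changed" and "Deprecated" entries can be sketched in isolation. This is a minimal illustration only: `resolve_survey_kwargs` and the message strings are hypothetical stand-ins, not the library's helper or its actual `HAD_*` message constants.

```python
import warnings
from typing import Any, Optional, Tuple

# Hypothetical message text; the real strings live in the library's constants.
MUTEX_MSG = "Pass at most one of survey_design=, survey=, weights=."


def resolve_survey_kwargs(
    *,
    survey_design: Any = None,
    survey: Any = None,
    weights: Optional[Any] = None,
) -> Tuple[Any, Optional[Any]]:
    """Return (design, weights) after the mutex check and deprecation warnings."""
    # At most one of the three knobs may be non-None per call.
    n_set = sum(x is not None for x in (survey_design, survey, weights))
    if n_set > 1:
        raise ValueError(MUTEX_MSG)

    if survey is not None:
        # Deprecated alias: warn, then treat the value as the design.
        warnings.warn(
            "survey= is deprecated; use survey_design=.",
            DeprecationWarning,
            stacklevel=2,
        )
        return survey, None
    if weights is not None:
        # Deprecated shortcut: warn, but pass the raw array through unchanged.
        warnings.warn(
            "weights= is deprecated; use survey_design=.",
            DeprecationWarning,
            stacklevel=2,
        )
        return None, weights
    # Canonical path: survey_design may be None or a design object.
    return survey_design, None
```

Because the deprecated values are passed through untouched, a caller that still uses `survey=` or `weights=` should see the same downstream inputs as before, plus one `DeprecationWarning`.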
@@ -99,6 +99,7 @@ Deferred items from PR reviews that were not addressed before merge.
 | `HeterogeneousAdoptionDiD` Phase 4.5: weight-aware auto-bandwidth MSE-DPI selector. Phase 4.5 A ships weighted `lprobust` with an unweighted DPI selector; users who want a weight-aware bandwidth must pass `h`/`b` explicitly. Extending `lpbwselect_mse_dpi` to propagate weights through density, second-derivative, and variance stages is ~300 LoC of methodology and was out of scope. | `diff_diff/_nprobust_port.py::lpbwselect_mse_dpi` | Phase 4.5 | Low |
 | `HeterogeneousAdoptionDiD` Phase 4.5 C: replicate-weight SurveyDesigns (BRR / Fay / JK1 / JKn / SDR) on the continuous-dose paths. Phase 4.5 A raises `NotImplementedError` on replicate designs in `_aggregate_unit_resolved_survey`. Rao-Wu-style replicate bootstrap for HAD paths requires deriving the per-replicate weight-ratio rescaling for the local-linear intercept IF. | `diff_diff/had.py::_aggregate_unit_resolved_survey` | Phase 4.5 C | Low |
 | `HeterogeneousAdoptionDiD` mass-point: `vcov_type in {"hc2", "hc2_bm"}` raises `NotImplementedError` pending a 2SLS-specific leverage derivation. The OLS leverage `x_i' (X'X)^{-1} x_i` is wrong for 2SLS; the correct finite-sample correction uses `x_i' (Z'X)^{-1} (...) (X'Z)^{-1} x_i`. Needs derivation plus an R / Stata (`ivreg2 small robust`) parity anchor. | `diff_diff/had.py::_fit_mass_point_2sls` | Phase 2a | Medium |
+| `HeterogeneousAdoptionDiD` survey-design API consolidation, **next minor bump**: drop the deprecated `survey=` and `weights=` kwargs on all 8 HAD surfaces (`HeterogeneousAdoptionDiD.fit`, `did_had_pretest_workflow`, `qug_test`, `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`); only `survey_design=` remains. Also fold the legacy back-end `weights=` paths (e.g. `_aggregate_unit_weights` ad-hoc routing) into the unified `_resolve_survey_for_fit`-driven path. The `_make_trivial_resolved` underscore alias on `survey.py` stays (one-line, harmless). DeprecationWarning ships in this PR; the removal PR is ~50 LoC of cleanup. | `diff_diff/had.py`, `diff_diff/had_pretests.py` | next minor bump | Medium |
 | `HeterogeneousAdoptionDiD` continuous paths: thread `cluster=` through `bias_corrected_local_linear` (Phase 1c's wrapper already supports cluster; Phase 2a ignores it with a `UserWarning` on the continuous path to keep scope tight). | `diff_diff/had.py`, `diff_diff/local_linear.py` | Phase 2a | Low |
 | `HeterogeneousAdoptionDiD` Eq 18 linear-trend detrending (Pierce-Schott style): the joint-Stute infrastructure shipped in the Phase 3 follow-up supports pre-trends (mean-indep) and post-homogeneity (linearity) nulls. The Pierce-Schott application (paper Section 5.2) uses a LINEAR-TREND detrending of pre-period outcomes before the joint CvM — `Y_{g,t} - Y_{g,t_anchor} - (t - t_anchor)*(Y_{g,t_anchor} - Y_{g,t_anchor-1})` — reaching p=0.51 on US-China tariff data. Extends `joint_pretrends_test` with a detrending mode or a separate Eq 18-specific helper. Deferred to Phase 4 replication harness (where the published p=0.51 serves as the parity anchor). | `diff_diff/had_pretests.py::joint_pretrends_test` | Phase 4 | Medium |
 | `HeterogeneousAdoptionDiD` Phase 3 Stute performance: Appendix D vectorized matrix form replaces the per-iteration OLS refit with a single precomputed `M = I - X(X'X)^{-1}X'` applied to `eps * eta`. Functionally identical, ~2x faster. Shipped literal-refit form in Phase 3 to match paper text and keep reviewer surface small. | `diff_diff/had_pretests.py::stute_test` | Phase 3 | Low |
 
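The Stute performance item above claims that the per-iteration OLS refit and the precomputed `M = I - X(X'X)^{-1}X'` form are functionally identical. A small NumPy check of that identity follows, as a sketch under stated assumptions: the bootstrap sample is taken to be `fitted + eps * eta`, Rademacher multipliers stand in for the Mammen draws, and both function names are hypothetical rather than taken from `had_pretests.py`.

```python
import numpy as np


def refit_residuals(X, fitted, eps, eta):
    """Literal form: rebuild the bootstrap outcome and refit OLS."""
    y_star = fitted + eps * eta  # assumed wild-bootstrap construction
    beta_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    return y_star - X @ beta_star


def projected_residuals(X, eps, eta):
    """Matrix form: apply the residual-maker M = I - X (X'X)^{-1} X' once."""
    n = X.shape[0]
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    # M annihilates the fitted values, so only eps * eta survives.
    return M @ (eps * eta)


rng = np.random.default_rng(0)
G = 50
d = rng.uniform(0.0, 1.0, size=G)
X = np.column_stack([np.ones(G), d])
y = 1.0 + 2.0 * d + rng.normal(size=G)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
eps = y - fitted
eta = rng.choice([-1.0, 1.0], size=G)  # Rademacher stand-in, not Mammen

same = np.allclose(refit_residuals(X, fitted, eps, eta), projected_residuals(X, eps, eta))
```

The two forms agree because the fitted values lie in the column space of `X`, so `M @ fitted` is zero. In a real bootstrap loop `M` would be built once outside the loop, which is where any speedup would come from.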
@@ -151,6 +151,7 @@
     SurveyDesign,
     SurveyMetadata,
     compute_deff_diagnostics,
+    make_pweight_design,
 )
 from diff_diff.staggered import (
     CallawaySantAnna,
@@ -445,6 +446,7 @@
     "SurveyMetadata",
     "DEFFDiagnostics",
     "compute_deff_diagnostics",
+    "make_pweight_design",
     # Rust backend
     "HAS_RUST_BACKEND",
     # Linear algebra helpers
 
@@ -76,7 +76,13 @@
     BiasCorrectedFit,
     bias_corrected_local_linear,
 )
-from diff_diff.survey import SurveyMetadata, compute_survey_metadata
+from diff_diff.survey import (
+    HAD_DEPRECATION_MSG_SURVEY_KWARG,
+    HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN,
+    HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN,
+    SurveyMetadata,
+    compute_survey_metadata,
+)
 from diff_diff.utils import safe_inference
 
 __all__ = [
@@ -2783,6 +2789,8 @@ def fit(
         unit_col: str,
         first_treat_col: Optional[str] = None,
         aggregate: str = "overall",
+        *,
+        survey_design: Any = None,
         survey: Any = None,
         weights: Optional[np.ndarray] = None,
         cband: bool = True,
@@ -2835,7 +2843,7 @@ def fit(
             CIs per horizon; joint cross-horizon covariance is deferred
             to a follow-up PR. Staggered-timing panels are auto-filtered
             to the last-treatment cohort with a ``UserWarning``.
-        survey : SurveyDesign or None
+        survey_design : SurveyDesign or None, keyword-only
             Survey design (sampling weights + optional strata / PSU / FPC)
             for design-based inference on the two continuous-dose paths
             (``continuous_at_zero``, ``continuous_near_d_lower``). Passes
@@ -2847,25 +2855,20 @@ def fit(
             FPC) must be constant within unit (sampling-unit-level
             assignment); within-unit variance raises ``ValueError``.
             Replicate-weight designs raise ``NotImplementedError``
-            (Phase 4.5 C). Phase 4.5 B support matrix: survey / weights
-            are now accepted on ALL design × aggregate combinations
-            (continuous × {overall, event-study}, mass-point × {overall,
-            event-study}); HAD pretests (``qug_test``, ``stute_test``,
-            ``yatchew_hr_test``, joint variants,
-            ``did_had_pretest_workflow``) still don't accept
-            survey/weights — deferred to Phase 4.5 C / C0.
-        weights : np.ndarray or None
-            Per-row sampling weights as a lightweight shortcut equivalent
-            to ``survey=SurveyDesign(weights=<col>)``. Produces the same
-            ATT; the SE uses the analytical weighted HC1 sandwich
-            (continuous: CCT-2014 weighted-robust; mass-point: pweight
-            2SLS sandwich) rather than Binder-TSL. Must be constant
-            within each unit; row-order aligned with ``data`` (index
-            labels are resolved to positional offsets via
-            ``data.index.get_indexer``, so custom non-RangeIndex inputs
-            work as long as ``data.index`` is unique). Mutually
-            exclusive with ``survey=`` — passing both raises
-            ``ValueError``.
+            (Phase 4.5 C). Mutually exclusive with the deprecated
+            ``survey=`` and ``weights=`` aliases.
+        survey : SurveyDesign or None, keyword-only
+            DEPRECATED alias of ``survey_design=``. Will be removed in
+            the next minor release; prefer ``survey_design=``.
+        weights : np.ndarray or None, keyword-only
+            DEPRECATED alias for the per-row pweight shortcut. Prefer
+            adding the weights as a column on ``data`` and passing
+            ``survey_design=SurveyDesign(weights='col_name')`` instead.
+            Will be removed in the next minor release. Currently
+            preserved as the analytical-HC1-sandwich shortcut (continuous:
+            CCT-2014 weighted-robust; mass-point: pweight 2SLS sandwich)
+            with the per-row → per-unit aggregation invariant intact.
+            Mutually exclusive with ``survey_design=`` and ``survey=``.
         cband : bool, default True
             Phase 4.5 B: controls the multiplier-bootstrap simultaneous
             confidence band on the weighted event-study path. When
@@ -2882,19 +2885,43 @@ def fit(
         -------
         HeterogeneousAdoptionDiDResults
         """
-        # ---- aggregate / survey / weights validation ----
+        # ---- aggregate / survey_design / survey / weights validation ----
         if aggregate not in _VALID_AGGREGATES:
             raise ValueError(
                 f"Invalid aggregate={aggregate!r}. Must be one of " f"{_VALID_AGGREGATES}."
             )
-        if survey is not None and weights is not None:
-            raise ValueError(
-                "Pass survey=<SurveyDesign> OR weights=<array>, not both. "
-                "For SurveyDesign-composed inference (PSU, strata, FPC, "
-                "replicate weights), use survey=. For a simple pweight-only "
-                "shortcut, use weights=; it is internally equivalent to "
-                "survey=SurveyDesign(weights=w)."
+        # Three-way mutex on survey_design / survey / weights (data-in pattern).
+        n_set = sum(x is not None for x in (survey_design, survey, weights))
+        if n_set > 1:
+            raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN)
+
+        # Soft deprecation: the legacy survey= alias is rebound to
+        # survey_design=, while weights= keeps its own legacy branch (see
+        # below). The internal back-end paths are unchanged; only the
+        # entry signature wraps them. Bit-exact back-compat holds because
+        # names are rebound without touching values, and on the canonical
+        # path the internal `survey` variable is re-derived from
+        # `survey_design` for downstream consumption.
+        if survey is not None:
+            warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+            survey_design = survey
+        elif weights is not None:
+            warnings.warn(
+                HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN,
+                DeprecationWarning,
+                stacklevel=2,
             )
+            # weights= shortcut preserved as-is on the back end (the
+            # downstream `if weights is not None:` branch consumes the
+            # raw array directly via _aggregate_unit_weights). Don't
+            # rebind survey_design here — the array is not a
+            # SurveyDesign and survey_design= cannot accept arrays.
+        else:
+            # Canonical path: survey_design= may be None or a SurveyDesign
+            # instance. Map back to the internal `survey` variable name
+            # so downstream code (legacy `if survey is not None:` branch)
+            # consumes the input transparently.
+            survey = survey_design
         # Dispatch the event-study path to a dedicated method so the
         # single-period path stays unchanged (Phase 2a contract preserved).
         # Note: event_study returns HeterogeneousAdoptionDiDEventStudyResults