Commit 2df79a0

igerber and claude committed
Self-audit: extend to_dataframe(level=by_path) with cband_lower/upper
Cross-surface gap caught in self-audit: OVERALL `to_dataframe(level="event_study")` includes `cband_lower` / `cband_upper` columns (`chaisemartin_dhaultfoeuille_results.py:1495-1496,1531-1532`) but the per-path table at `level="by_path"` does not, even though per-path now produces `cband_conf_int` writes via the new sup-t propagation block. Cross-surface twin asymmetry the CI reviewer didn't flag; caught by my own grep audit on `cband_conf_int` consumers.

Fix: extend `to_dataframe(level="by_path")` to emit the same two columns. Populated for positive-horizon rows of paths with a finite sup-t crit (read from `path_effects[path]["horizons"][l]["cband_conf_int"]`); NaN for placebo rows (no joint band per the positive-only sup-t spec), unbanded paths, and the requested-but-empty fallback DataFrame (which now includes the columns in its canonical schema).

Tests added:
- `test_path_sup_t_to_dataframe_emits_cband_columns`: column presence + per-row alignment with the dict surface
- `test_path_sup_t_to_dataframe_empty_path_fallback_has_cband_columns`: empty-path fallback DataFrame schema parity

Docs updated:
- REGISTRY.md: `to_dataframe(level="by_path")` integration note added to the new sup-t Note; canonical column list in the existing `Note (Phase 3 by_path ...)` block extended with `cband_lower / cband_upper`
- CHANGELOG entry: surface listing now mentions to_dataframe columns
- `by_path` parameter docstring: rendering surface listing extended
- `path_sup_t_bands` Attributes docstring: rendering surface listing extended

Suite: 263 tests pass (was 261, +2 new tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
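The schema-parity point in the message (the requested-but-empty fallback carries the same columns as a populated table) can be illustrated with a minimal stdlib sketch. This is not the library's code: `by_path_table` is a hypothetical helper, and the `horizon`, `effect`, and `se` column names are assumptions, since the diff only shows `path`, `conf_int_lower`, `conf_int_upper`, `n_obs`, `cband_lower`, and `cband_upper`.

```python
# Hypothetical canonical column order. Only "path", "conf_int_*", "n_obs",
# and "cband_*" are visible in the diff; the other names are assumed.
BY_PATH_COLUMNS = (
    "path", "horizon", "effect", "se",
    "conf_int_lower", "conf_int_upper", "n_obs",
    "cband_lower", "cband_upper",
)


def by_path_table(rows):
    """Return (columns, records); an empty input still carries every column."""
    nan = float("nan")
    # Missing keys (for example, no band on this row) are filled with NaN.
    records = [tuple(row.get(col, nan) for col in BY_PATH_COLUMNS) for row in rows]
    return BY_PATH_COLUMNS, records


# Empty fallback and populated table expose the same schema.
empty_cols, empty_records = by_path_table([])
full_cols, full_records = by_path_table([{"path": (0, 1), "horizon": 1, "effect": 0.4}])
```

Keeping the column list in one constant is one way to avoid the kind of drift between the empty and populated branches that this commit fixes.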
1 parent c0c0d4e commit 2df79a0

5 files changed

Lines changed: 102 additions & 6 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- **HAD linearity-family pretests under survey (Phase 4.5 C).** `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`, and `did_had_pretest_workflow` now accept `weights=` / `survey=` keyword-only kwargs. Stute family uses **PSU-level Mammen multiplier bootstrap** via `bootstrap_utils.generate_survey_multiplier_weights_batch` (the same kernel as PR #363's HAD event-study sup-t bootstrap): each replicate draws an `(n_bootstrap, n_psu)` Mammen multiplier matrix, broadcast to per-obs perturbation `eta_obs[g] = eta_psu[psu(g)]`, weighted OLS refit, weighted CvM via new `_cvm_statistic_weighted` helper. Joint Stute SHARES the multiplier matrix across horizons within each replicate, preserving both the vector-valued empirical-process unit-level dependence AND PSU clustering. Yatchew uses **closed-form weighted OLS + pweight-sandwich variance components** (no bootstrap): `sigma2_lin = sum(w·eps²)/sum(w)`, `sigma2_diff = sum(w_avg·diff²)/(2·sum(w))` with arithmetic-mean pair weights `w_avg_g = (w_g+w_{g-1})/2`, `sigma4_W = sum(w_avg·prod)/sum(w_avg)`, `T_hr = sqrt(sum(w))·(sigma2_lin-sigma2_diff)/sigma2_W`. All three Yatchew components reduce bit-exactly to the unweighted formulas at `w=ones(G)` (locked at `atol=1e-14` by direct helper test). The pweight `weights=` shortcut routes through a synthetic trivial `ResolvedSurveyDesign` (new `survey._make_trivial_resolved` helper) so the same kernel handles both entry paths. `did_had_pretest_workflow(..., survey=, weights=)` removes the Phase 4.5 C0 `NotImplementedError`, dispatches to the survey-aware sub-tests, **skips the QUG step with `UserWarning`** (per C0 deferral), sets `qug=None` on the report, and appends a `"linearity-conditional verdict; QUG-under-survey deferred per Phase 4.5 C0"` suffix to the verdict. `HADPretestReport.qug` retyped from `QUGTestResults` to `Optional[QUGTestResults]`; `summary()` / `to_dict()` / `to_dataframe()` updated to None-tolerant rendering. 
Replicate-weight survey designs (BRR/Fay/JK1/JKn/SDR) raise `NotImplementedError` at every entry point (defense in depth, reciprocal-guard discipline) — parallel follow-up after this PR. **Stratified designs (`SurveyDesign(strata=...)`) also raise `NotImplementedError` on the Stute family** — the within-stratum demean + `sqrt(n_h/(n_h-1))` correction that the HAD sup-t bootstrap applies to match the Binder-TSL stratified target has not been derived for the Stute CvM functional, so applying raw multipliers from `generate_survey_multiplier_weights_batch` directly to residual perturbations would leave the bootstrap p-value silently miscalibrated. Phase 4.5 C narrows survey support to **pweight-only**, **PSU-only** (`SurveyDesign(weights=, psu=)`), and **FPC-only** (`SurveyDesign(weights=, fpc=)`) designs; stratified is a follow-up after the matching Stute-CvM stratified-correction derivation lands. Strictly positive weights required on Yatchew (the adjacent-difference variance is undefined under contiguous-zero blocks). Per-row `weights=` / `survey=col` aggregated to per-unit via existing HAD helpers `_aggregate_unit_weights` / `_aggregate_unit_resolved_survey` (constant-within-unit invariant enforced). Unweighted code paths preserved bit-exactly. Patch-level addition (additive on stable surfaces). See `docs/methodology/REGISTRY.md` § "QUG Null Test" — Note (Phase 4.5 C) for the full methodology.
-
- **`ChaisemartinDHaultfoeuille.by_path` + `n_bootstrap > 0` joint sup-t bands** — per-path joint sup-t simultaneous confidence intervals across horizons `1..L_max` within each path. A single shared `(n_bootstrap, n_eligible)` multiplier weight matrix (using the estimator's configured `bootstrap_weights` — Rademacher / Mammen / Webb) is drawn per path and broadcast across all horizons of that path, producing correlated bootstrap distributions across horizons. The path-specific critical value `c_p = quantile(max_l |t_l|, 1 - α)` is used to construct symmetric joint bands `effect_l ± c_p · se_l` per horizon. Surfaced on `results.path_sup_t_bands` (dict keyed by path tuple, each entry with `crit_value / alpha / n_bootstrap / method / n_valid_horizons`) and as `cband_conf_int` per horizon entry on `path_effects[path]["horizons"][l]`. Gates: a path needs `>= 2` valid horizons (finite bootstrap SE > 0) AND a strict majority (more than 50%) of finite sup-t draws to receive a band. Empty-state contract: `path_sup_t_bands is None` when not requested; `{}` when requested but no path passes both gates. **Methodology asymmetry vs OVERALL `event_study_sup_t_bands`:** the per-path sup-t draws a fresh shared weight matrix per path AFTER the per-path SE bootstrap block has already populated `results.path_ses` via independent per-(path, horizon) draws — asymptotically equivalent to OVERALL's self-consistent reuse but NOT bit-identical. Documented intentional choice to preserve RNG-state isolation for existing per-path SE seed-reproducibility tests. Inherits the cross-path cohort-sharing SE deviation from R documented for `path_effects`. **Deviation from R:** `did_multiplegt_dyn` does not provide joint / sup-t bands at any surface — this is a Python-only methodology extension consistent with the existing OVERALL sup-t bands (also Python-only). Bands cover joint inference WITHIN a single path across horizons; they do NOT provide simultaneous coverage across paths. 
Pre-audit fix bundled: stale "Phase 2 placeholder" docstring on the existing `sup_t_bands` field updated to the actual contract description. Tests at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathSupTBands` (`@pytest.mark.slow`). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path per-path joint sup-t bands)` for the full contract.
+
- **`ChaisemartinDHaultfoeuille.by_path` + `n_bootstrap > 0` joint sup-t bands** — per-path joint sup-t simultaneous confidence intervals across horizons `1..L_max` within each path. A single shared `(n_bootstrap, n_eligible)` multiplier weight matrix (using the estimator's configured `bootstrap_weights` — Rademacher / Mammen / Webb) is drawn per path and broadcast across all horizons of that path, producing correlated bootstrap distributions across horizons. The path-specific critical value `c_p = quantile(max_l |t_l|, 1 - α)` is used to construct symmetric joint bands `effect_l ± c_p · se_l` per horizon. Surfaced on `results.path_sup_t_bands` (dict keyed by path tuple, each entry with `crit_value / alpha / n_bootstrap / method / n_valid_horizons`); as `cband_conf_int` per horizon entry on `path_effects[path]["horizons"][l]`; and as `cband_lower` / `cband_upper` columns on `results.to_dataframe(level="by_path")` (mirrors the OVERALL `level="event_study"` schema; positive-horizon rows of banded paths get populated values, placebo / unbanded / empty-window rows get NaN). Gates: a path needs `>= 2` valid horizons (finite bootstrap SE > 0) AND a strict majority (more than 50%) of finite sup-t draws to receive a band. Empty-state contract: `path_sup_t_bands is None` when not requested; `{}` when requested but no path passes both gates. **Methodology asymmetry vs OVERALL `event_study_sup_t_bands`:** the per-path sup-t draws a fresh shared weight matrix per path AFTER the per-path SE bootstrap block has already populated `results.path_ses` via independent per-(path, horizon) draws — asymptotically equivalent to OVERALL's self-consistent reuse but NOT bit-identical. Documented intentional choice to preserve RNG-state isolation for existing per-path SE seed-reproducibility tests. Inherits the cross-path cohort-sharing SE deviation from R documented for `path_effects`. 
**Deviation from R:** `did_multiplegt_dyn` does not provide joint / sup-t bands at any surface — this is a Python-only methodology extension consistent with the existing OVERALL sup-t bands (also Python-only). Bands cover joint inference WITHIN a single path across horizons; they do NOT provide simultaneous coverage across paths. Pre-audit fix bundled: stale "Phase 2 placeholder" docstring on the existing `sup_t_bands` field updated to the actual contract description. Tests at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathSupTBands` (`@pytest.mark.slow`). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path per-path joint sup-t bands)` for the full contract.
- **`ChaisemartinDHaultfoeuille.by_path` + `placebo=True`** — per-path backward-horizon placebos `DID^{pl}_{path, l}` for `l = 1..L_max`. The same per-path SE convention used for the event-study (joiners/leavers IF precedent: switcher-side contributions zeroed for non-path groups; cohort structure and control pool unchanged; plug-in SE with path-specific divisor `N^{pl}_{l, path}`) is applied to backward horizons via the new `switcher_subset_mask` parameter on `_compute_per_group_if_placebo_horizon`. Surfaced on `results.path_placebo_event_study[path][-l]` (negative-int inner keys mirroring `placebo_event_study`); `summary()` renders the rows alongside per-path event-study horizons; `to_dataframe(level="by_path")` emits negative-horizon rows alongside the existing positive-horizon rows. **Bootstrap** (when `n_bootstrap > 0`) propagates per-`(path, lag)` percentile CI / p-value through the same `_bootstrap_one_target` dispatch as the per-path event-study, with the canonical NaN-on-invalid contract enforced on the new surface (PR #364 library-wide invariant). **SE inherits the cross-path cohort-sharing deviation from R** documented for `path_effects` (full-panel cohort-centered plug-in vs R's per-path re-run): tracks R within tolerance on single-path-cohort panels, diverges materially on cohort-mixed panels — the bootstrap SE is a Monte Carlo analog of the analytical SE and inherits the same deviation. R-parity confirmed at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathPlacebo` on the new `multi_path_reversible_by_path_placebo` scenario (point estimates exact match; SE within Phase-2 envelope rtol ≤ 5%); positive analytical + bootstrap invariants at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPlacebo` (and the gated `::TestBootstrap` subclass). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path ...)` → "Per-path placebos" for the full contract.
- **Tutorial 19: dCDH for Marketing Pulse Campaigns** (`docs/tutorials/19_dcdh_marketing_pulse.ipynb`) — end-to-end practitioner walkthrough on a 60-market reversible-treatment panel covering the TWFE decomposition diagnostic (`twowayfeweights`), `DCDH` Phase 1 (DID_M, joiners-vs-leavers, single-lag placebo), the `L_max` multi-horizon event study with multiplier bootstrap, a stakeholder communication template, and drift guards. README listing for Tutorial 17 (Brand Awareness Survey) backfilled in the same edit. Cross-link from `docs/practitioner_decision_tree.rst` § "Reversible Treatment" added.
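The per-path sup-t construction described in the `by_path` + `n_bootstrap > 0` entry above (shared weights across horizons, `c_p = quantile(max_l |t_l|, 1 - α)`, bands `effect_l ± c_p · se_l`, and the two gates) can be sketched roughly as follows. This is a toy under stated assumptions, not the library's implementation: `path_sup_t_band` and its `influence` argument are hypothetical, only Rademacher weights are shown, and the `/ n_groups` scaling is illustrative. The SEs are taken as inputs, matching the entry's note that `path_ses` is populated before the sup-t draw.

```python
import math
import random


def path_sup_t_band(effects, ses, influence, n_bootstrap=999, alpha=0.05, seed=0):
    """Toy joint sup-t band for one path.

    effects[i], ses[i]: point estimate and SE for the i-th horizon.
    influence[i][g]: hypothetical per-group influence value (an assumption;
    the library's actual influence functions are not shown in this commit).
    Returns (crit_value, {horizon_index: (lower, upper)}) or None if gated out.
    """
    rng = random.Random(seed)
    n_groups = len(influence[0])

    # Gate 1: at least two horizons with a finite, strictly positive SE.
    valid = [i for i, se in enumerate(ses) if math.isfinite(se) and se > 0]
    if len(valid) < 2:
        return None

    sup_t_draws = []
    for _ in range(n_bootstrap):
        # One Rademacher weight per group, shared by every horizon of the path,
        # so the bootstrap t-statistics are correlated across horizons.
        weights = [rng.choice((-1.0, 1.0)) for _ in range(n_groups)]
        t_abs = []
        for i in valid:
            shift = sum(w * psi for w, psi in zip(weights, influence[i])) / n_groups
            t_abs.append(abs(shift / ses[i]))
        draw = max(t_abs)
        if math.isfinite(draw):
            sup_t_draws.append(draw)

    # Gate 2: a strict majority of the sup-t draws must be finite.
    if 2 * len(sup_t_draws) <= n_bootstrap:
        return None

    # c_p: empirical (1 - alpha) quantile of max_l |t_l|.
    sup_t_draws.sort()
    idx = min(len(sup_t_draws) - 1, math.ceil((1 - alpha) * len(sup_t_draws)) - 1)
    crit = sup_t_draws[idx]

    # Symmetric joint band: effect_l +/- c_p * se_l.
    bands = {i: (effects[i] - crit * ses[i], effects[i] + crit * ses[i]) for i in valid}
    return crit, bands
```

Because the band is built from a single critical value per path, it covers the horizons of that path jointly and says nothing about coverage across paths, which is the limitation the entry calls out.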

diff_diff/chaisemartin_dhaultfoeuille.py

Lines changed: 5 additions & 2 deletions
@@ -437,9 +437,12 @@ class ChaisemartinDHaultfoeuille(ChaisemartinDHaultfoeuilleBootstrapMixin):
 ``c_p`` (constructed from a fresh shared-weights multiplier-
 bootstrap draw per path) is surfaced at top level as
 ``results.path_sup_t_bands[path] = {"crit_value", "alpha",
-"n_bootstrap", "method", "n_valid_horizons"}`` and applied
+"n_bootstrap", "method", "n_valid_horizons"}``, applied
 per-horizon as ``cband_conf_int`` on
-``path_effects[path]["horizons"][l]``. Bands cover joint
+``path_effects[path]["horizons"][l]``, and rendered as
+``cband_lower`` / ``cband_upper`` columns on
+``results.to_dataframe(level="by_path")`` (mirroring the
+OVERALL ``level="event_study"`` schema). Bands cover joint
 inference WITHIN a single path across horizons; they do NOT
 provide simultaneous coverage across paths. Python-only
 library extension; R ``did_multiplegt_dyn`` provides no joint

diff_diff/chaisemartin_dhaultfoeuille_results.py

Lines changed: 20 additions & 1 deletion
@@ -425,7 +425,9 @@ class ChaisemartinDHaultfoeuilleResults:
 "method": str, "n_valid_horizons": int}``. Populated when
 ``by_path`` is a positive int AND ``n_bootstrap > 0``. The
 band itself is applied per-horizon as ``cband_conf_int`` on
-``path_effects[path]["horizons"][l]``. Empty-state contract:
+``path_effects[path]["horizons"][l]`` and rendered as
+``cband_lower`` / ``cband_upper`` columns on
+``to_dataframe(level="by_path")``. Empty-state contract:
 ``None`` when not requested (no bootstrap or ``by_path is None``);
 ``{}`` when requested but no path passed both gates (``>=2``
 valid horizons with finite bootstrap SE ``> 0`` AND a strict
@@ -1632,6 +1634,8 @@ def to_dataframe(self, level: str = "overall") -> pd.DataFrame:
             "conf_int_lower",
             "conf_int_upper",
             "n_obs",
+            "cband_lower",
+            "cband_upper",
         ]
     )
 rows = []
@@ -1655,6 +1659,12 @@ def to_dataframe(self, level: str = "overall") -> pd.DataFrame:
     )
     for lag_key in sorted(placebo_horizons.keys()):
         ph_entry = placebo_horizons[lag_key]
+        # Placebos do not get joint sup-t bands in this
+        # release (only positive event-study horizons do —
+        # mirrors OVERALL placebo / event-study sup-t
+        # convention). Emit NaN cband columns for schema
+        # parity with the OVERALL level="event_study" table.
+        ph_cband = ph_entry.get("cband_conf_int", (np.nan, np.nan))
         rows.append(
             {
                 "path": path,
@@ -1668,10 +1678,17 @@ def to_dataframe(self, level: str = "overall") -> pd.DataFrame:
                 "conf_int_lower": ph_entry["conf_int"][0],
                 "conf_int_upper": ph_entry["conf_int"][1],
                 "n_obs": ph_entry["n_obs"],
+                "cband_lower": ph_cband[0] if ph_cband else np.nan,
+                "cband_upper": ph_cband[1] if ph_cband else np.nan,
             }
         )
     for l_h in sorted(horizons.keys()):
         h_entry = horizons[l_h]
+        # Per-path joint sup-t band (when populated) mirrors
+        # OVERALL `level="event_study"` cband emission. Absent
+        # key / missing path entry -> NaN columns. Pinned at
+        # `TestByPathSupTBands::test_path_sup_t_to_dataframe_emits_cband_columns`.
+        h_cband = h_entry.get("cband_conf_int", (np.nan, np.nan))
         rows.append(
             {
                 "path": path,
@@ -1685,6 +1702,8 @@ def to_dataframe(self, level: str = "overall") -> pd.DataFrame:
                 "conf_int_lower": h_entry["conf_int"][0],
                 "conf_int_upper": h_entry["conf_int"][1],
                 "n_obs": h_entry["n_obs"],
+                "cband_lower": h_cband[0] if h_cband else np.nan,
+                "cband_upper": h_cband[1] if h_cband else np.nan,
             }
         )
 return pd.DataFrame(rows)
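The row-building pattern in the hunks above (read `cband_conf_int` if present, otherwise emit NaN for both columns) can be sketched standalone as follows. This is a simplified stdlib illustration, not the method itself: `cband_pair`, `by_path_rows`, the `path_placebos` argument, and the `horizon` / `effect` keys are hypothetical names, and plain dict rows stand in for the pandas DataFrame.

```python
NAN = float("nan")


def cband_pair(entry):
    """Return (lower, upper) from an entry, or NaNs when no band is stored.

    Covers both an absent key and an explicit None value, like the
    `.get(...)` plus truthiness check in the diff.
    """
    band = entry.get("cband_conf_int")
    return tuple(band) if band else (NAN, NAN)


def by_path_rows(path_effects, path_placebos):
    """Flatten nested per-path dicts into row dicts with cband columns."""
    rows = []
    for path, entry in path_effects.items():
        # Placebo (negative-horizon) rows: no joint band is written, so NaN.
        for lag, ph in sorted(path_placebos.get(path, {}).items()):
            lower, upper = cband_pair(ph)
            rows.append({"path": path, "horizon": lag, "effect": ph["effect"],
                         "cband_lower": lower, "cband_upper": upper})
        # Positive-horizon rows: populated only when the path received a band.
        for horizon, h in sorted(entry["horizons"].items()):
            lower, upper = cband_pair(h)
            rows.append({"path": path, "horizon": horizon, "effect": h["effect"],
                         "cband_lower": lower, "cband_upper": upper})
    return rows
```

Routing both loops through one helper keeps the placebo and event-study rows on the same schema, so a consumer can filter banded rows with a single NaN check on `cband_lower`.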
