Skip to content

Commit 94083be

Browse files
authored
Merge pull request #374 from igerber/dcdh-by-path-sup-t
Add per-path joint sup-t bands to ChaisemartinDHaultfoeuille.by_path
2 parents 903d3ab + 1febbb1 commit 94083be

7 files changed

Lines changed: 869 additions & 5 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
99

1010
### Added
1111
- **HAD linearity-family pretests under survey (Phase 4.5 C).** `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`, and `did_had_pretest_workflow` now accept `weights=` / `survey=` keyword-only kwargs. Stute family uses **PSU-level Mammen multiplier bootstrap** via `bootstrap_utils.generate_survey_multiplier_weights_batch` (the same kernel as PR #363's HAD event-study sup-t bootstrap): each replicate draws an `(n_bootstrap, n_psu)` Mammen multiplier matrix, broadcast to per-obs perturbation `eta_obs[g] = eta_psu[psu(g)]`, weighted OLS refit, weighted CvM via new `_cvm_statistic_weighted` helper. Joint Stute SHARES the multiplier matrix across horizons within each replicate, preserving both the vector-valued empirical-process unit-level dependence AND PSU clustering. Yatchew uses **closed-form weighted OLS + pweight-sandwich variance components** (no bootstrap): `sigma2_lin = sum(w·eps²)/sum(w)`, `sigma2_diff = sum(w_avg·diff²)/(2·sum(w))` with arithmetic-mean pair weights `w_avg_g = (w_g+w_{g-1})/2`, `sigma4_W = sum(w_avg·prod)/sum(w_avg)`, `T_hr = sqrt(sum(w))·(sigma2_lin-sigma2_diff)/sigma2_W`. All three Yatchew components reduce bit-exactly to the unweighted formulas at `w=ones(G)` (locked at `atol=1e-14` by direct helper test). The pweight `weights=` shortcut routes through a synthetic trivial `ResolvedSurveyDesign` (new `survey._make_trivial_resolved` helper) so the same kernel handles both entry paths. `did_had_pretest_workflow(..., survey=, weights=)` removes the Phase 4.5 C0 `NotImplementedError`, dispatches to the survey-aware sub-tests, **skips the QUG step with `UserWarning`** (per C0 deferral), sets `qug=None` on the report, and appends a `"linearity-conditional verdict; QUG-under-survey deferred per Phase 4.5 C0"` suffix to the verdict. `HADPretestReport.qug` retyped from `QUGTestResults` to `Optional[QUGTestResults]`; `summary()` / `to_dict()` / `to_dataframe()` updated to None-tolerant rendering. Replicate-weight survey designs (BRR/Fay/JK1/JKn/SDR) raise `NotImplementedError` at every entry point (defense in depth, reciprocal-guard discipline) — parallel follow-up after this PR. **Stratified designs (`SurveyDesign(strata=...)`) also raise `NotImplementedError` on the Stute family** — the within-stratum demean + `sqrt(n_h/(n_h-1))` correction that the HAD sup-t bootstrap applies to match the Binder-TSL stratified target has not been derived for the Stute CvM functional, so applying raw multipliers from `generate_survey_multiplier_weights_batch` directly to residual perturbations would leave the bootstrap p-value silently miscalibrated. Phase 4.5 C narrows survey support to **pweight-only**, **PSU-only** (`SurveyDesign(weights=, psu=)`), and **FPC-only** (`SurveyDesign(weights=, fpc=)`) designs; stratified is a follow-up after the matching Stute-CvM stratified-correction derivation lands. Strictly positive weights required on Yatchew (the adjacent-difference variance is undefined under contiguous-zero blocks). Per-row `weights=` / `survey=col` aggregated to per-unit via existing HAD helpers `_aggregate_unit_weights` / `_aggregate_unit_resolved_survey` (constant-within-unit invariant enforced). Unweighted code paths preserved bit-exactly. Patch-level addition (additive on stable surfaces). See `docs/methodology/REGISTRY.md` § "QUG Null Test" — Note (Phase 4.5 C) for the full methodology.
12+
- **`ChaisemartinDHaultfoeuille.by_path` + `n_bootstrap > 0` joint sup-t bands** — per-path joint sup-t simultaneous confidence intervals across horizons `1..L_max` within each path. A single shared `(n_bootstrap, n_eligible)` multiplier weight matrix (using the estimator's configured `bootstrap_weights` — Rademacher / Mammen / Webb) is drawn per path and broadcast across all horizons of that path, producing correlated bootstrap distributions across horizons. The path-specific critical value `c_p = quantile(max_l |t_l|, 1 - α)` is used to construct symmetric joint bands `effect_l ± c_p · se_l` per horizon. Surfaced on `results.path_sup_t_bands` (dict keyed by path tuple, each entry with `crit_value / alpha / n_bootstrap / method / n_valid_horizons`); as `cband_conf_int` per horizon entry on `path_effects[path]["horizons"][l]`; and as `cband_lower` / `cband_upper` columns on `results.to_dataframe(level="by_path")` (mirrors the OVERALL `level="event_study"` schema; positive-horizon rows of banded paths get populated values, placebo / unbanded / empty-window rows get NaN). Gates: a path needs `>= 2` valid horizons (finite bootstrap SE > 0) AND a strict majority (more than 50%) of finite sup-t draws to receive a band. Empty-state contract: `path_sup_t_bands is None` when not requested; `{}` when requested but no path passes both gates. **Methodology asymmetry vs OVERALL `event_study_sup_t_bands`:** the per-path sup-t draws a fresh shared weight matrix per path AFTER the per-path SE bootstrap block has already populated `results.path_ses` via independent per-(path, horizon) draws — asymptotically equivalent to OVERALL's self-consistent reuse but NOT bit-identical. Documented intentional choice to preserve RNG-state isolation for existing per-path SE seed-reproducibility tests. Inherits the cross-path cohort-sharing SE deviation from R documented for `path_effects`. **Deviation from R:** `did_multiplegt_dyn` does not provide joint / sup-t bands at any surface — this is a Python-only methodology extension consistent with the existing OVERALL sup-t bands (also Python-only). Bands cover joint inference WITHIN a single path across horizons; they do NOT provide simultaneous coverage across paths. Pre-audit fix bundled: stale "Phase 2 placeholder" docstring on the existing `sup_t_bands` field updated to the actual contract description. Tests at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathSupTBands` (`@pytest.mark.slow`). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path per-path joint sup-t bands)` for the full contract.
1213
- **`ChaisemartinDHaultfoeuille.by_path` + `placebo=True`** — per-path backward-horizon placebos `DID^{pl}_{path, l}` for `l = 1..L_max`. The same per-path SE convention used for the event-study (joiners/leavers IF precedent: switcher-side contributions zeroed for non-path groups; cohort structure and control pool unchanged; plug-in SE with path-specific divisor `N^{pl}_{l, path}`) is applied to backward horizons via the new `switcher_subset_mask` parameter on `_compute_per_group_if_placebo_horizon`. Surfaced on `results.path_placebo_event_study[path][-l]` (negative-int inner keys mirroring `placebo_event_study`); `summary()` renders the rows alongside per-path event-study horizons; `to_dataframe(level="by_path")` emits negative-horizon rows alongside the existing positive-horizon rows. **Bootstrap** (when `n_bootstrap > 0`) propagates per-`(path, lag)` percentile CI / p-value through the same `_bootstrap_one_target` dispatch as the per-path event-study, with the canonical NaN-on-invalid contract enforced on the new surface (PR #364 library-wide invariant). **SE inherits the cross-path cohort-sharing deviation from R** documented for `path_effects` (full-panel cohort-centered plug-in vs R's per-path re-run): tracks R within tolerance on single-path-cohort panels, diverges materially on cohort-mixed panels — the bootstrap SE is a Monte Carlo analog of the analytical SE and inherits the same deviation. R-parity confirmed at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathPlacebo` on the new `multi_path_reversible_by_path_placebo` scenario (point estimates exact match; SE within Phase-2 envelope rtol ≤ 5%); positive analytical + bootstrap invariants at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPlacebo` (and the gated `::TestBootstrap` subclass). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path ...)` → "Per-path placebos" for the full contract.
1314
- **Tutorial 19: dCDH for Marketing Pulse Campaigns** (`docs/tutorials/19_dcdh_marketing_pulse.ipynb`) — end-to-end practitioner walkthrough on a 60-market reversible-treatment panel covering the TWFE decomposition diagnostic (`twowayfeweights`), `DCDH` Phase 1 (DID_M, joiners-vs-leavers, single-lag placebo), the `L_max` multi-horizon event study with multiplier bootstrap, a stakeholder communication template, and drift guards. README listing for Tutorial 17 (Brand Awareness Survey) backfilled in the same edit. Cross-link from `docs/practitioner_decision_tree.rst` § "Reversible Treatment" added.
1415

diff_diff/chaisemartin_dhaultfoeuille.py

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -431,6 +431,24 @@ class ChaisemartinDHaultfoeuille(ChaisemartinDHaultfoeuilleBootstrapMixin):
431431
cross-path cohort-sharing deviation from R is inherited from
432432
the analytical event-study path.
433433
434+
With ``n_bootstrap > 0``, per-path joint sup-t simultaneous
435+
confidence bands are also computed across horizons
436+
``1..L_max`` within each path. A path-specific critical value
437+
``c_p`` (constructed from a fresh shared-weights multiplier-
438+
bootstrap draw per path) is surfaced at top level as
439+
``results.path_sup_t_bands[path] = {"crit_value", "alpha",
440+
"n_bootstrap", "method", "n_valid_horizons"}``, applied
441+
per-horizon as ``cband_conf_int`` on
442+
``path_effects[path]["horizons"][l]``, and rendered as
443+
``cband_lower`` / ``cband_upper`` columns on
444+
``results.to_dataframe(level="by_path")`` (mirroring the
445+
OVERALL ``level="event_study"`` schema). Bands cover joint
446+
inference WITHIN a single path across horizons; they do NOT
447+
provide simultaneous coverage across paths. Python-only
448+
library extension; R ``did_multiplegt_dyn`` provides no joint
449+
bands at any surface. See REGISTRY.md ``Note (Phase 3 by_path
450+
per-path joint sup-t bands)``.
451+
434452
SE convention: per-path IF parallels the joiners / leavers
435453
construction — the switcher-side contribution is zeroed for
436454
groups not in the selected path, and the cohort structure and
@@ -2986,6 +3004,33 @@ def fit(
29863004
path_placebos[path_key][neg_key]["conf_int"] = (np.nan, np.nan)
29873005
path_placebos[path_key][neg_key]["t_stat"] = np.nan
29883006

3007+
# Phase 3: propagate per-path sup-t critical values to per-
3008+
# horizon `cband_conf_int` entries on path_effects (by_path +
3009+
# n_bootstrap > 0). Sibling of the OVERALL event-study cband
3010+
# propagation at `:2865-2875`. For each path with a finite
3011+
# crit, write `cband_conf_int = (eff - c_p*se, eff + c_p*se)`
3012+
# into each horizon's dict whose bootstrap-replaced SE is
3013+
# finite > 0. Mirror the OVERALL absent-key pattern: non-finite
3014+
# SE horizons simply don't get the `cband_conf_int` key.
3015+
if (
3016+
bootstrap_results is not None
3017+
and bootstrap_results.path_cband_crit_values is not None
3018+
and path_effects is not None
3019+
):
3020+
for path_key, crit in bootstrap_results.path_cband_crit_values.items():
3021+
if path_key not in path_effects:
3022+
continue
3023+
if not np.isfinite(crit):
3024+
continue
3025+
for l_h, h_dict in path_effects[path_key]["horizons"].items():
3026+
se = h_dict.get("se", np.nan)
3027+
eff = h_dict.get("effect", np.nan)
3028+
if np.isfinite(se) and se > 0:
3029+
h_dict["cband_conf_int"] = (
3030+
eff - crit * se,
3031+
eff + crit * se,
3032+
)
3033+
29893034
# When L_max >= 1 and the per-group path is active, sync
29903035
# overall_* from event_study_effects[1] AFTER bootstrap propagation
29913036
# so that bootstrap SE/p/CI flow to the top-level surface.
@@ -3618,6 +3663,45 @@ def fit(
36183663
),
36193664
path_effects=path_effects,
36203665
path_placebo_event_study=path_placebos,
3666+
path_sup_t_bands=(
3667+
# When by_path + n_bootstrap > 0 is active, surface a
3668+
# dict (possibly empty) — preserving the documented
3669+
# `None` (not requested) vs `{}` (requested but empty)
3670+
# contract that mirrors `path_effects` / `path_placebo_
3671+
# event_study` empty-state behavior. The empty case
3672+
# arises in two ways:
3673+
# 1. `path_effects == {}` — no observed path has a
3674+
# complete window; the per-path bootstrap collector
3675+
# is skipped upstream and `path_cband_crit_values`
3676+
# stays `None`. We materialize `{}` here.
3677+
# 2. Bootstrap ran but no path passed both gates
3678+
# (>=2 valid horizons AND a strict majority — more
3679+
# than 50% — of finite sup-t draws);
3680+
# `path_cband_crit_values == {}` — passes through.
3681+
{
3682+
path_key: {
3683+
"crit_value": crit,
3684+
"alpha": self.alpha,
3685+
"n_bootstrap": self.n_bootstrap,
3686+
"method": "multiplier_bootstrap",
3687+
"n_valid_horizons": (
3688+
bootstrap_results.path_cband_n_valid_horizons.get(path_key, 0)
3689+
if bootstrap_results is not None
3690+
and bootstrap_results.path_cband_n_valid_horizons is not None
3691+
else 0
3692+
),
3693+
}
3694+
for path_key, crit in (
3695+
bootstrap_results.path_cband_crit_values
3696+
if bootstrap_results is not None
3697+
and bootstrap_results.path_cband_crit_values is not None
3698+
else {}
3699+
).items()
3700+
if np.isfinite(crit)
3701+
}
3702+
if (self.by_path is not None and self.n_bootstrap > 0)
3703+
else None
3704+
),
36213705
survey_metadata=survey_metadata,
36223706
_estimator_ref=self,
36233707
)

diff_diff/chaisemartin_dhaultfoeuille_bootstrap.py

Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -778,6 +778,94 @@ def _compute_dcdh_bootstrap(
778778
results.path_placebo_cis = path_pl_cis
779779
results.path_placebo_p_values = path_pl_pvals
780780

781+
# --- Phase 3: Per-path joint sup-t (by_path + n_bootstrap > 0) ---
782+
# Sibling of the OVERALL event-study sup-t at the multi-horizon
783+
# block above (`:599-614`). Per-path joint simultaneous
784+
# confidence bands across horizons 1..L_max within each path:
785+
# one shared (n_bootstrap, n_eligible) multiplier weight matrix
786+
# (using `self.bootstrap_weights` — Rademacher / Mammen / Webb)
787+
# per path is broadcast across all valid horizons of that path,
788+
# producing correlated bootstrap distributions across horizons.
789+
# The path-specific critical value
790+
# `c_p = quantile(max_l |t_l|, 1-alpha)` is the band half-width
791+
# multiplier applied to each horizon's bootstrap SE in fit().
792+
#
793+
# Note (asymmetry vs OVERALL): this draws a FRESH shared-weights
794+
# matrix per path AFTER the per-path SE block above has populated
795+
# results.path_ses via independent per-(path, horizon) draws.
796+
# Numerator: fresh shared draws; denominator: bootstrap SEs from
797+
# the earlier independent draws. Asymptotically equivalent to
798+
# OVERALL's self-consistent reuse, but NOT bit-identical. The
799+
# fresh draw is intentional: it preserves RNG-state isolation
800+
# for existing per-path SE seed-reproducibility tests.
801+
#
802+
# Gates: a path needs >=2 valid horizons (finite bootstrap SE>0)
803+
# AND a strict majority (>50%) of finite sup-t draws to receive
804+
# a band. Otherwise the path is absent from
805+
# path_cband_crit_values (mirrors OVERALL absent-key pattern at
806+
# `:605,612`; the strict-majority gate matches the OVERALL
807+
# `finite_mask.sum() > 0.5 * n_bootstrap` semantics — exactly
808+
# half finite is NOT enough).
809+
if path_bootstrap_inputs is not None and results.path_ses:
810+
path_cband_crits: Dict[Tuple[int, ...], float] = {}
811+
path_cband_n_valid: Dict[Tuple[int, ...], int] = {}
812+
813+
for path_key, horizon_inputs in path_bootstrap_inputs.items():
814+
bs_ses_for_path = results.path_ses.get(path_key, {})
815+
valid_horizons = []
816+
for l_h, (u_h, n_h, eff_h, _u_pp_h) in sorted(horizon_inputs.items()):
817+
if u_h.size == 0 or n_h <= 0:
818+
continue
819+
bs_se = bs_ses_for_path.get(l_h, np.nan)
820+
if not np.isfinite(bs_se) or bs_se <= 0:
821+
continue
822+
valid_horizons.append((l_h, u_h, n_h, eff_h, bs_se))
823+
824+
if len(valid_horizons) < 2:
825+
continue
826+
827+
# All horizons within a path use the same n_eligible
828+
# (variance-eligible group ordering enforced by
829+
# _collect_path_bootstrap_inputs's use of
830+
# eligible_mask_var for cohort-recentering); use the
831+
# first valid horizon's IF size as the shared dim.
832+
n_dim = valid_horizons[0][1].size
833+
map_path = _map_for_target(
834+
n_dim,
835+
group_id_to_psu_code,
836+
eligible_group_ids,
837+
)
838+
with np.errstate(invalid="ignore", divide="ignore"):
839+
shared_weights = _generate_psu_or_group_weights(
840+
n_bootstrap=self.n_bootstrap,
841+
n_groups_target=n_dim,
842+
weight_type=self.bootstrap_weights,
843+
rng=rng,
844+
group_to_psu_map=map_path,
845+
)
846+
es_dists_path = []
847+
for _l_h, u_h, n_h, eff_h, _bs_se in valid_horizons:
848+
deviations = (shared_weights @ u_h) / n_h
849+
es_dists_path.append(eff_h + deviations)
850+
boot_matrix = np.asarray(es_dists_path)
851+
effects_vec = np.array([v[3] for v in valid_horizons])
852+
ses_vec = np.array([v[4] for v in valid_horizons])
853+
t_stats = np.abs((boot_matrix - effects_vec[:, None]) / ses_vec[:, None])
854+
sup_t_dist = np.max(t_stats, axis=0)
855+
finite_mask = np.isfinite(sup_t_dist)
856+
if finite_mask.sum() <= 0.5 * self.n_bootstrap:
857+
continue
858+
crit_p = float(np.quantile(sup_t_dist[finite_mask], 1.0 - self.alpha))
859+
860+
if not np.isfinite(crit_p):
861+
continue
862+
863+
path_cband_crits[path_key] = crit_p
864+
path_cband_n_valid[path_key] = len(valid_horizons)
865+
866+
results.path_cband_crit_values = path_cband_crits
867+
results.path_cband_n_valid_horizons = path_cband_n_valid
868+
781869
return results
782870

783871

0 commit comments

Comments
 (0)