Skip to content

Commit 62f64db

Browse files
igerberclaude
andcommitted
Phase 4.5 C: stratified survey-design support for HAD Stute family
Lifts the NotImplementedError gate on SurveyDesign(strata=...) in stute_test and stute_joint_pretest (and by inheritance in joint_pretrends_test, joint_homogeneity_test, did_had_pretest_workflow). Implements the standard stratified clustered wild bootstrap correction (Cameron-Gelbach-Miller 2008 / Davidson-Flachaire 2008 / Djogbenou- MacKinnon-Nielsen 2019 / Kreiss-Lahiri 2012): within-stratum demean + sqrt(n_h/(n_h-1)) Bessel rescale (Wu 1986; Liu 1988) applied to the PSU multipliers BEFORE the per-obs broadcast in the wild-residual loop. New shared helper bootstrap_utils.apply_stratum_centering is called from BOTH the new Stute path (psu_axis=1 on the multiplier matrix) AND the existing HAD sup-t event-study cband bootstrap (psu_axis=0 on the PSU-aggregated influence tensor, refactored bit-exactly from had.py: 2172-2204). Locks the algebraic identity architecturally. Non-strata Stute paths now also apply the Bessel correction uniformly (single implicit stratum), mirroring HAD sup-t at had.py:2199-2204. This is a deliberate calibration improvement (~1-2% p-value shift for typical n_psu); the pre-PR path was under-corrected by exactly the sqrt(n_psu/(n_psu-1)) factor. Methodology derivation in REGISTRY § "Note (Stute stratified survey- bootstrap calibration)". Remaining deferrals: lonely_psu='adjust' + singleton-strata (pseudo-stratum centering transform) and replicate- weight designs (Rao-Wu / JKn bootstrap composition). Tests: 16 new tests across helper unit suite (bit-parity vs pre- refactor inline code at atol=1e-14), strata positive smokes, trivial-stratum reduction, calibration-shift direction pin, MC oracle consistency under stratified null (200 draws, Type I in [0, 0.10]), MC power under stratified known alternative (200 draws, rejection > 0.50). 575 tests pass across full touched-file sweep; 29 existing HAD sup-t cband tests pass bit-exactly post-refactor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent b56e232 commit 62f64db

8 files changed

Lines changed: 903 additions & 113 deletions

File tree

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
Large diffs are not rendered by default.

diff_diff/bootstrap_utils.py

Lines changed: 132 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@
1717
"generate_bootstrap_weights_batch",
1818
"generate_bootstrap_weights_batch_numpy",
1919
"generate_survey_multiplier_weights_batch",
20+
"apply_stratum_centering",
2021
"generate_rao_wu_weights",
2122
"generate_rao_wu_weights_batch",
2223
"compute_percentile_ci",
@@ -654,6 +655,137 @@ def generate_survey_multiplier_weights_batch(
654655
return weights, psu_ids
655656

656657

658+
def apply_stratum_centering(
659+
tensor: np.ndarray,
660+
resolved_survey: "ResolvedSurveyDesign",
661+
psu_ids: np.ndarray,
662+
psu_axis: int = 0,
663+
) -> np.ndarray:
664+
"""Within-stratum demean + sqrt(n_h/(n_h-1)) Bessel rescale along the
665+
PSU axis of a tensor. Mutates ``tensor`` in place AND returns it.
666+
667+
Shared by the HAD sup-t event-study bootstrap
668+
(``had._sup_t_multiplier_bootstrap``, PSU axis = 0 on the
669+
PSU-aggregated influence tensor ``Psi_psu`` of shape ``(n_psu,
670+
n_horizons)``) AND the HAD Stute survey-bootstrap family
671+
(``stute_test``, ``stute_joint_pretest``, PSU axis = 1 on the
672+
multiplier matrix ``psu_mults`` of shape ``(n_bootstrap, n_psu)``).
673+
Same algebra applied at different points in the two pipelines:
674+
675+
- HAD sup-t is a multiplier bootstrap on a precomputed influence
676+
tensor. The correction is applied to the tensor before
677+
``perturbations = psu_weights @ Psi_psu`` — see ``had.py:2151-2204``.
678+
- Stute is a wild residual bootstrap with refit-in-loop and a
679+
nonlinear functional. The correction is applied to the multipliers
680+
before the per-obs broadcast ``eta_obs = psu_mults[b,
681+
psu_col_idx]`` — see ``had_pretests.py:1988-2007``.
682+
683+
Locks the algebraic identity architecturally (so future drift in
684+
one site cannot silently diverge from the other) and makes both
685+
sites consume the same battle-tested code path.
686+
687+
The combined correction makes ``Var_xi[xi @ Psi_psu]`` match the
688+
analytical Binder-TSL stratified target
689+
``V = sum_h (1 - f_h) (n_h / (n_h - 1)) sum_j (psi_hj - psi_h_bar)²``
690+
exactly (the ``(1 - f_h)`` factor is already baked into
691+
``psu_mults`` by :func:`generate_survey_multiplier_weights_batch`;
692+
this helper bakes the remaining ``(n_h / (n_h - 1))`` factor and
693+
enforces the within-stratum-zero centering required for cluster
694+
wild bootstrap consistency under stratification).
695+
696+
See REGISTRY § HeterogeneousAdoptionDiD —
697+
"Note (Stute stratified survey-bootstrap calibration)" for the
698+
derivation.
699+
700+
Parameters
701+
----------
702+
tensor : np.ndarray
703+
Tensor with the PSU dimension on ``psu_axis``. Modified
704+
in-place.
705+
resolved_survey : ResolvedSurveyDesign
706+
Resolved survey design. Provides ``.psu`` (per-unit PSU IDs;
707+
may be None for the implicit-PSU case where each unit is its
708+
own PSU) and ``.strata`` (per-unit stratum labels; may be None
709+
for the single-implicit-stratum case).
710+
psu_ids : np.ndarray, shape ``(n_psu,)``
711+
Unique PSU identifiers aligned to the ``psu_axis`` of
712+
``tensor``. Output of
713+
:func:`generate_survey_multiplier_weights_batch`.
714+
psu_axis : int, default 0
715+
Axis of ``tensor`` along which PSUs are indexed. 0 for HAD
716+
sup-t ``Psi_psu`` (shape ``(n_psu, n_horizons)``); 1 for Stute
717+
``psu_mults`` (shape ``(n_bootstrap, n_psu)``).
718+
719+
Returns
720+
-------
721+
np.ndarray
722+
Same object as ``tensor`` (in-place mutation; returned for
723+
chaining). Singleton strata under ``lonely_psu='remove'`` /
724+
``'certainty'`` have all-zero entries along ``psu_axis``
725+
(set by :func:`generate_survey_multiplier_weights_batch`'s
726+
lonely-PSU handling); the centering here skips them to avoid
727+
a divide-by-zero on ``sqrt(n_h / 0)``.
728+
729+
Notes
730+
-----
731+
Under ``strata=None``, the correction is applied uniformly with a
732+
single implicit stratum (``n_h = n_psu``) — demean across all PSUs
733+
along ``psu_axis`` and rescale by ``sqrt(n_psu / (n_psu - 1))``.
734+
This is the standard small-sample correction for an iid cluster
735+
wild bootstrap (Wu 1986; Liu 1988) and matches the HAD sup-t
736+
convention at ``had.py:2199-2204``.
737+
738+
The Stute call site has historically NOT applied this correction
739+
(pre-PR Phase 4.5 C). Lifting the gate on stratified designs +
740+
introducing this shared helper makes the non-strata Stute path
741+
apply the correction uniformly, which is a deliberate calibration
742+
improvement (~1-2% p-value shift for typical ``n_psu``) — see
743+
REGISTRY § "Note (Stute stratified survey-bootstrap calibration)".
744+
"""
745+
n_psu = tensor.shape[psu_axis]
746+
747+
if resolved_survey.strata is not None:
748+
strata = np.asarray(resolved_survey.strata)
749+
# Build PSU -> stratum map. Constant-within-PSU by the
750+
# SurveyDesign.resolve contract — assert defensively in tests.
751+
psu_id_to_col = {int(p): c for c, p in enumerate(psu_ids)}
752+
psu_stratum = np.empty(n_psu, dtype=strata.dtype)
753+
if resolved_survey.psu is not None:
754+
unit_psu = np.asarray(resolved_survey.psu)
755+
seen = np.zeros(n_psu, dtype=bool)
756+
for i in range(len(unit_psu)):
757+
col = psu_id_to_col[int(unit_psu[i])]
758+
if not seen[col]:
759+
psu_stratum[col] = strata[i]
760+
seen[col] = True
761+
else:
762+
# Each unit is its own PSU; psu_ids = arange(n_psu).
763+
psu_stratum = strata.copy()
764+
765+
for h in np.unique(psu_stratum):
766+
mask_h = psu_stratum == h
767+
n_h = int(mask_h.sum())
768+
if n_h < 2:
769+
# Singleton / empty stratum: caller's lonely_psu logic
770+
# has already zeroed the corresponding multipliers (or
771+
# the contribution is zero by construction). Skip to
772+
# avoid a divide-by-zero on sqrt(n_h / 0).
773+
continue
774+
slc = [slice(None)] * tensor.ndim
775+
slc[psu_axis] = mask_h
776+
slc = tuple(slc)
777+
tensor[slc] -= tensor[slc].mean(axis=psu_axis, keepdims=True)
778+
tensor[slc] *= np.sqrt(n_h / (n_h - 1))
779+
else:
780+
# Single implicit stratum — demean across all PSUs along
781+
# psu_axis, rescale by sqrt(n_psu / (n_psu - 1)).
782+
if n_psu >= 2:
783+
tensor -= tensor.mean(axis=psu_axis, keepdims=True)
784+
tensor *= np.sqrt(n_psu / (n_psu - 1))
785+
786+
return tensor
787+
788+
657789
def generate_rao_wu_weights(
658790
resolved_survey: "ResolvedSurveyDesign",
659791
rng: np.random.Generator,

diff_diff/had.py

Lines changed: 8 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -2084,6 +2084,7 @@ def _sup_t_multiplier_bootstrap(
20842084
caller.
20852085
"""
20862086
from diff_diff.bootstrap_utils import (
2087+
apply_stratum_centering,
20872088
generate_bootstrap_weights_batch,
20882089
generate_survey_multiplier_weights_batch,
20892090
)
@@ -2169,39 +2170,13 @@ def _sup_t_multiplier_bootstrap(
21692170
# Each unit is its own PSU (psu_ids = np.arange(n_units)).
21702171
Psi_psu = influence_matrix.copy()
21712172

2172-
if resolved_survey.strata is not None:
2173-
strata = np.asarray(resolved_survey.strata)
2174-
# Build PSU -> stratum map (strata constant-within-PSU by
2175-
# SurveyDesign.resolve contract).
2176-
psu_stratum = np.empty(n_psu, dtype=strata.dtype)
2177-
if resolved_survey.psu is not None:
2178-
seen = np.zeros(n_psu, dtype=bool)
2179-
unit_psu = np.asarray(resolved_survey.psu)
2180-
for i in range(n_units):
2181-
col = psu_id_to_col[int(unit_psu[i])]
2182-
if not seen[col]:
2183-
psu_stratum[col] = strata[i]
2184-
seen[col] = True
2185-
else:
2186-
psu_stratum = strata.copy()
2187-
2188-
for h in np.unique(psu_stratum):
2189-
mask_h = psu_stratum == h
2190-
n_h = int(mask_h.sum())
2191-
if n_h < 2:
2192-
# Singleton / empty stratum contributes 0 variance
2193-
# regardless; the helper's lonely-PSU logic already
2194-
# zeros those multipliers. Skip centering to avoid
2195-
# a divide-by-zero on sqrt(n_h / (n_h - 1)).
2196-
continue
2197-
Psi_psu[mask_h] -= Psi_psu[mask_h].mean(axis=0, keepdims=True)
2198-
Psi_psu[mask_h] *= np.sqrt(n_h / (n_h - 1))
2199-
else:
2200-
# Single implicit stratum — demean across all PSUs, scale by
2201-
# sqrt(n_psu / (n_psu - 1)).
2202-
if n_psu >= 2:
2203-
Psi_psu -= Psi_psu.mean(axis=0, keepdims=True)
2204-
Psi_psu *= np.sqrt(n_psu / (n_psu - 1))
2173+
# Stratum centering + Bessel rescale on the PSU-aggregated
2174+
# influence tensor. Shared with the HAD Stute survey-bootstrap
2175+
# family (had_pretests.stute_test / stute_joint_pretest) which
2176+
# applies the same algebra to PSU multipliers instead — see
2177+
# ``apply_stratum_centering`` docstring and REGISTRY
2178+
# § "Note (Stute stratified survey-bootstrap calibration)".
2179+
apply_stratum_centering(Psi_psu, resolved_survey, psu_ids, psu_axis=0)
22052180

22062181
# PSU-level perturbations: (B, H) = (B, n_psu) @ (n_psu, H).
22072182
# No (1/n) prefactor — Psi_psu_scaled is already on the θ̂-scale

diff_diff/had_pretests.py

Lines changed: 46 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -76,7 +76,10 @@
7676
import pandas as pd
7777
from scipy import stats
7878

79-
from diff_diff.bootstrap_utils import generate_survey_multiplier_weights_batch
79+
from diff_diff.bootstrap_utils import (
80+
apply_stratum_centering,
81+
generate_survey_multiplier_weights_batch,
82+
)
8083
from diff_diff.had import (
8184
_aggregate_first_difference,
8285
_aggregate_unit_resolved_survey,
@@ -1912,36 +1915,19 @@ def stute_test(
19121915
# CvM recompute. Routes via synthetic trivial ResolvedSurveyDesign
19131916
# for the weights= shortcut to share the same kernel.
19141917
resolved_for_boot = survey if survey is not None else make_pweight_design(w_arr)
1915-
# R10 P1: reject stratified designs explicitly until a derived
1916-
# Stute-specific correction lands. The HAD sup-t bootstrap
1917-
# (had.py:2120+) applies a within-stratum demean +
1918-
# sqrt(n_h/(n_h-1)) small-sample correction AFTER
1919-
# generate_survey_multiplier_weights_batch returns, to make the
1920-
# bootstrap variance match the Binder-TSL stratified target.
1921-
# That same correction has NOT been derived for the Stute CvM
1922-
# functional, so applying the helper's raw multipliers directly
1923-
# to residual perturbations on stratified designs leaves the
1924-
# bootstrap p-value silently miscalibrated. Pweight-only,
1925-
# PSU-only, and FPC-only designs are still supported (the
1926-
# helper's output is appropriately scaled for those).
1927-
if resolved_for_boot.strata is not None:
1928-
raise NotImplementedError(
1929-
"stute_test: SurveyDesign(strata=...) with stratified "
1930-
"sampling is not yet supported. The Stute CvM bootstrap "
1931-
"calibration on stratified designs requires a within-"
1932-
"stratum demean + sqrt(n_h/(n_h-1)) small-sample "
1933-
"correction analogous to the HAD sup-t bootstrap, but "
1934-
"the matching derivation for the Stute functional has "
1935-
"not been completed. Pweight-only or PSU-only "
1936-
"(SurveyDesign(weights=..., psu=...)) designs are "
1937-
"supported; pre-process stratified designs to remove "
1938-
"the strata column or wait for the derivation in a "
1939-
"follow-up PR."
1940-
)
1941-
# R5 P1: reject lonely_psu='adjust' singleton-strata designs
1942-
# explicitly (now redundant with the strata guard above; kept
1943-
# for defense in depth and for residual non-stratified
1944-
# singleton-strata edge cases).
1918+
# Stratified designs are supported via the standard stratified
1919+
# clustered wild-bootstrap correction on the PSU multipliers
1920+
# (within-stratum demean + sqrt(n_h/(n_h-1)) Bessel rescale),
1921+
# applied uniformly before the per-obs broadcast eta_obs =
1922+
# psu_mults[b, psu_col_idx] below. See REGISTRY
1923+
# § "Note (Stute stratified survey-bootstrap calibration)" and
1924+
# ``apply_stratum_centering`` (bootstrap_utils.py) for the
1925+
# derivation; the same helper backs the HAD sup-t event-study
1926+
# bootstrap at had.py:2151+.
1927+
# R5 P1: reject lonely_psu='adjust' singleton-strata designs.
1928+
# This pseudo-stratum centering transform has not been derived
1929+
# for the Stute CvM (same gap as the HAD sup-t deviation at
1930+
# REGISTRY § 'Note (HAD sup-t lonely_psu="adjust") deviation').
19451931
if _has_lonely_psu_adjust_singletons(resolved_for_boot):
19461932
raise NotImplementedError(
19471933
"stute_test: SurveyDesign(lonely_psu='adjust') with "
@@ -1988,6 +1974,15 @@ def stute_test(
19881974
psu_mults, psu_ids = generate_survey_multiplier_weights_batch(
19891975
n_bootstrap, resolved_for_boot, weight_type="mammen", rng=rng
19901976
)
1977+
# Stratum centering + Bessel rescale on the PSU multipliers
1978+
# before broadcast. Same algebra as the HAD sup-t bootstrap at
1979+
# had.py:2151+ (applied to the influence tensor there), but
1980+
# applied here to ``psu_mults`` because the Stute bootstrap is a
1981+
# wild-residual / refit-in-loop bootstrap (no precomputed
1982+
# influence tensor exists). See REGISTRY § "Note (Stute
1983+
# stratified survey-bootstrap calibration)" for the derivation
1984+
# and the non-strata calibration shift it introduces.
1985+
apply_stratum_centering(psu_mults, resolved_for_boot, psu_ids, psu_axis=1)
19911986
# Build per-obs PSU-column index. When psu is None (trivial path),
19921987
# each obs is its own PSU and psu_ids = arange(G) - so psu_col_idx
19931988
# is just arange(G).
@@ -3253,25 +3248,18 @@ def stute_joint_pretest(
32533248
# vector-valued empirical-process unit-level dependence (paper
32543249
# convention) AND PSU clustering (Krieger-Pfeffermann 1997).
32553250
resolved_for_boot = survey if survey is not None else make_pweight_design(w_arr)
3256-
# R10 P1: reject stratified designs explicitly until a derived
3257-
# Stute-specific correction lands (mirrors stute_test
3258-
# single-horizon).
3259-
if resolved_for_boot.strata is not None:
3260-
raise NotImplementedError(
3261-
"stute_joint_pretest: SurveyDesign(strata=...) with "
3262-
"stratified sampling is not yet supported. The Stute "
3263-
"CvM bootstrap calibration on stratified designs "
3264-
"requires a within-stratum demean + sqrt(n_h/(n_h-1)) "
3265-
"small-sample correction analogous to the HAD sup-t "
3266-
"bootstrap, but the matching derivation for the joint "
3267-
"Stute functional has not been completed. Pweight-only "
3268-
"or PSU-only designs are supported; pre-process "
3269-
"stratified designs to remove the strata column or wait "
3270-
"for the derivation in a follow-up PR."
3271-
)
3272-
# R5 P1: reject lonely_psu='adjust' singleton-strata designs
3273-
# explicitly (now redundant with the strata guard above; kept
3274-
# for defense in depth).
3251+
# Stratified designs are supported via the standard stratified
3252+
# clustered wild-bootstrap correction on the PSU multipliers
3253+
# (within-stratum demean + sqrt(n_h/(n_h-1)) Bessel rescale),
3254+
# applied uniformly before the per-obs broadcast eta_obs =
3255+
# psu_mults[b, psu_col_idx] below. The joint variant shares the
3256+
# SAME multiplier row across horizons within each replicate, so
3257+
# the stratum correction applies once and inherits across
3258+
# horizons (preserving cross-horizon empirical-process
3259+
# dependence per Hlávka & Huškova 2020 § 3). See REGISTRY
3260+
# § "Note (Stute stratified survey-bootstrap calibration)".
3261+
# R5 P1: reject lonely_psu='adjust' singleton-strata designs.
3262+
# Same pseudo-stratum centering gap as stute_test / HAD sup-t.
32753263
if _has_lonely_psu_adjust_singletons(resolved_for_boot):
32763264
raise NotImplementedError(
32773265
"stute_joint_pretest: SurveyDesign(lonely_psu='adjust') "
@@ -3314,6 +3302,14 @@ def stute_joint_pretest(
33143302
psu_mults, psu_ids = generate_survey_multiplier_weights_batch(
33153303
n_bootstrap, resolved_for_boot, weight_type="mammen", rng=rng
33163304
)
3305+
# Stratum centering + Bessel rescale on the PSU multipliers
3306+
# before broadcast. Single application here (shared with the
3307+
# per-horizon loop below) propagates the same centered
3308+
# multipliers across all horizons in each replicate, preserving
3309+
# the joint Stute's cross-horizon empirical-process dependence.
3310+
# See REGISTRY § "Note (Stute stratified survey-bootstrap
3311+
# calibration)".
3312+
apply_stratum_centering(psu_mults, resolved_for_boot, psu_ids, psu_axis=1)
33173313
if resolved_for_boot.psu is None:
33183314
psu_col_idx = np.arange(G)
33193315
else:

0 commit comments

Comments
 (0)