Commit 4147406

Fix CI review R6: warning gate on eligible groups + docstring updates
Addresses PR #311 AI review R6 (2 × P3 cleanups).

P3 #1: Warning gate was computed from raw positive-weight groups, not the post-filter eligible-group set used to build the bootstrap PSU map. Panels where upstream dCDH filtering drops groups that share PSUs with kept groups could emit a misleading "PSU coarser than group" warning even when the effective bootstrap is one group per PSU. Fix: count PSUs and groups from `_eligible_group_ids` (the same set feeding `group_id_to_psu_code_bootstrap`), preserving the within-group-constant-PSU invariant by taking each eligible group's first positive-weight PSU label.

P3 #2: Two docstrings said only that the bootstrap is "clustered at the group level", which is incomplete now that the PSU-level survey path exists:
- `diff_diff/chaisemartin_dhaultfoeuille.py` class docstring: extended to note PSU-level Hall-Mammen wild clustering under `survey_design` with coarser PSU.
- `diff_diff/chaisemartin_dhaultfoeuille_bootstrap.py` module docstring: documents the identity-map fast path (auto-inject `psu=group`), the PSU-level broadcast when PSU is strictly coarser, and points to REGISTRY.md for the full contract.

Full regression: 318 passing.
1 parent abdfea4 commit 4147406

2 files changed

Lines changed: 44 additions & 17 deletions

diff_diff/chaisemartin_dhaultfoeuille.py

Lines changed: 32 additions & 12 deletions
@@ -307,7 +307,10 @@ class ChaisemartinDHaultfoeuille(ChaisemartinDHaultfoeuilleBootstrapMixin):
       default; gate via ``placebo=False``)
     - Analytical SE via the cohort-recentered plug-in formula from
       Web Appendix Section 3.7.3 of the dynamic paper
-    - Optional multiplier bootstrap clustered at the group level
+    - Optional multiplier bootstrap, clustered at the group level by
+      default; under ``survey_design`` with a strictly-coarser PSU, the
+      bootstrap switches to PSU-level Hall-Mammen wild clustering (see
+      REGISTRY.md ChaisemartinDHaultfoeuille Note on survey + bootstrap)
     - Optional TWFE decomposition diagnostic from Theorem 1 of AER 2020
       (per-cell weights, fraction negative, ``sigma_fe``)
 
@@ -1970,23 +1973,40 @@ def fit(
 # psu=<group_col>), groups and PSUs coincide so
 # Hall-Mammen wild PSU bootstrap equals group-level
 # multiplier bootstrap — no need to warn.
+# Count groups/PSUs on the POST-FILTER eligible set
+# (`_eligible_group_ids`) — the same set the bootstrap
+# map is built from below. Using raw positive-weight
+# groups here would emit a misleading warning on
+# panels where upstream dCDH filtering drops groups
+# that happen to share PSUs with kept groups.
 psu_arr_warn = getattr(resolved_survey, "psu", None)
 if psu_arr_warn is None or _obs_survey_info is None:
     # No PSU info — can't compare to group count.
     n_psu_eff_warn, n_groups_eff_warn = -1, -1
 else:
-    pos_mask_warn = (
-        np.asarray(
-            _obs_survey_info["weights"], dtype=np.float64
-        )
-        > 0
+    obs_gids_warn = np.asarray(_obs_survey_info["group_ids"])
+    obs_ws_warn = np.asarray(
+        _obs_survey_info["weights"], dtype=np.float64
+    )
+    pos_mask_warn = obs_ws_warn > 0
+    psu_codes_warn = np.asarray(psu_arr_warn)
+    # Collect the PSU label for each variance-eligible
+    # group (within-group-constant PSU is validated
+    # upstream, so the first positive-weight label
+    # represents the whole group).
+    eligible_psu_labels: List[Any] = []
+    for gid in _eligible_group_ids:
+        mask_g = (obs_gids_warn == gid) & pos_mask_warn
+        if mask_g.any():
+            eligible_psu_labels.append(
+                psu_codes_warn[mask_g][0]
+            )
+    n_groups_eff_warn = len(eligible_psu_labels)
+    n_psu_eff_warn = (
+        int(len(np.unique(np.asarray(eligible_psu_labels))))
+        if eligible_psu_labels
+        else -1
     )
-    psu_eff = np.asarray(psu_arr_warn)[pos_mask_warn]
-    gids_eff_warn = np.asarray(
-        _obs_survey_info["group_ids"]
-    )[pos_mask_warn]
-    n_psu_eff_warn = int(len(np.unique(psu_eff)))
-    n_groups_eff_warn = int(len(np.unique(gids_eff_warn)))
 if 0 <= n_psu_eff_warn < n_groups_eff_warn:
     warnings.warn(
         f"Bootstrap with survey_design uses Hall-Mammen "

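The effect of moving the gate onto the eligible set can be checked on a toy panel. The sketch below is illustrative only: `count_groups_and_psus` and the toy arrays are hypothetical names invented here, not part of `diff_diff`, and it assumes (as the commit states) that PSU is constant within each group.

```python
import numpy as np


def count_groups_and_psus(group_ids, psu, weights, eligible_group_ids=None):
    """Return (n_groups, n_psu) over positive-weight rows.

    Hypothetical helper for illustration. With eligible_group_ids=None it
    mimics the pre-fix count over all positive-weight groups; otherwise it
    counts only the eligible groups, taking each group's first
    positive-weight PSU label (PSU assumed constant within group).
    """
    group_ids = np.asarray(group_ids)
    psu = np.asarray(psu)
    pos = np.asarray(weights, dtype=np.float64) > 0
    if eligible_group_ids is None:
        n_groups = int(len(np.unique(group_ids[pos])))
        n_psu = int(len(np.unique(psu[pos])))
        return n_groups, n_psu
    labels = []
    for gid in eligible_group_ids:
        mask = (group_ids == gid) & pos
        if mask.any():
            labels.append(psu[mask][0])
    n_psu = int(len(np.unique(np.asarray(labels)))) if labels else -1
    return len(labels), n_psu


# Toy panel: groups 1 and 2 share PSU "A"; group 3 is alone in PSU "B".
group_ids = [1, 1, 2, 2, 3, 3]
psu = ["A", "A", "A", "A", "B", "B"]
weights = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

raw_groups, raw_psus = count_groups_and_psus(group_ids, psu, weights)

# Suppose upstream filtering drops group 2, leaving one group per PSU.
elig_groups, elig_psus = count_groups_and_psus(
    group_ids, psu, weights, eligible_group_ids=[1, 3]
)

warn_raw = 0 <= raw_psus < raw_groups          # gate on raw counts
warn_eligible = 0 <= elig_psus < elig_groups   # gate on eligible counts
```

On this panel the raw counts see 2 PSUs against 3 groups and would trigger the warning, while the eligible counts see 2 PSUs against 2 groups and stay silent, which is the case the commit message describes.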
diff_diff/chaisemartin_dhaultfoeuille_bootstrap.py

Lines changed: 12 additions & 5 deletions
@@ -4,11 +4,18 @@
 
 The dCDH papers prescribe only the analytical cohort-recentered plug-in
 variance from Web Appendix Section 3.7.3 of the dynamic companion paper.
-This module adds an opt-in multiplier bootstrap clustered at the group
-level, matching the inference convention used by ``CallawaySantAnna``,
-``ImputationDiD``, and ``TwoStageDiD``. The bootstrap is a library
-extension, not a paper requirement, and is documented as such in
-``REGISTRY.md``.
+This module adds an opt-in multiplier bootstrap, clustered at the group
+level by default (matching the inference convention used by
+``CallawaySantAnna``, ``ImputationDiD``, and ``TwoStageDiD``). Under
+``survey_design`` with an explicitly-coarser PSU, the bootstrap switches
+to PSU-level Hall-Mammen wild clustering: each PSU draws a single
+multiplier and all groups within that PSU share it
+(see ``_generate_psu_or_group_weights`` and ``_map_for_target`` below,
+plus the REGISTRY.md ``ChaisemartinDHaultfoeuille`` Note on survey +
+bootstrap). Under the default auto-inject ``psu=group`` each group is
+its own PSU and the identity-map fast path reproduces the original
+group-level behavior bit-for-bit. The bootstrap is a library extension,
+not a paper requirement, and is documented as such in ``REGISTRY.md``.
 
 The mixin operates on **pre-computed cohort-centered influence-function
 values**: the main estimator class computes per-group ``U^G_g`` values
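
The "one multiplier per PSU, shared by its groups" rule in the docstring above can be sketched in a few lines. This is a minimal illustration, not the library's `_generate_psu_or_group_weights`: the function name `draw_group_multipliers` is hypothetical, and the use of Mammen two-point weights is an assumption made here, since the diff does not show which weight distribution the bootstrap draws from.

```python
import numpy as np


def draw_group_multipliers(group_to_psu, rng):
    """Draw one wild-bootstrap multiplier per PSU and broadcast to groups.

    group_to_psu[i] is the PSU label of group i. Hypothetical helper;
    the two-point Mammen distribution below is an assumption of this
    sketch (mean 0, variance 1), not a confirmed library detail.
    """
    psu_codes = np.asarray(group_to_psu)
    unique_psus, inverse = np.unique(psu_codes, return_inverse=True)
    s5 = np.sqrt(5.0)
    values = np.array([-(s5 - 1.0) / 2.0, (s5 + 1.0) / 2.0])
    probs = np.array([(s5 + 1.0) / (2.0 * s5), (s5 - 1.0) / (2.0 * s5)])
    # One draw per PSU ...
    psu_draws = rng.choice(values, size=len(unique_psus), p=probs)
    # ... then every group picks up the draw of its own PSU.
    return psu_draws[inverse]


rng = np.random.default_rng(0)

# Strictly coarser PSU: groups 0-1 sit in PSU "A", groups 2-4 in PSU "B".
coarse = draw_group_multipliers(["A", "A", "B", "B", "B"], rng)

# Identity map (psu=group): each group is its own PSU and gets its own draw.
identity = draw_group_multipliers([0, 1, 2, 3, 4], rng)
```

With the identity map there are as many PSUs as groups, so the broadcast step changes nothing and the draw is an ordinary group-level multiplier vector; with a coarser PSU, groups in the same PSU move together in every bootstrap replicate.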
