Commit ee0cc54
Densify PSU codes over eligible subset + always populate per-cell tensor
Addresses two P0 correctness regressions in the PR-4 bootstrap PSU-map
plumbing flagged by CI review.
**P0 #1 - valid_map gate discarded the per-cell tensor too eagerly.**
When any variance-eligible group had no positive-weight cells (an
all-sentinel row in psu_codes_per_cell), the old code set
valid_map=False and left BOTH group_id_to_psu_code_bootstrap AND
psu_codes_per_cell_bootstrap as None. The bootstrap then silently
fell back to the unclustered group-level path instead of excluding
only that group's empty row. Fix: always populate
psu_codes_per_cell_bootstrap once the tensor is built; the cell-level
path already masks out -1 cells at unroll time. Also always populate
group_id_to_psu_code_bootstrap with a per-group code, using
placeholder 0 for all-sentinel rows: those groups carry no IF mass,
so the multiplier they receive is irrelevant on both the legacy and
the cell-level path.
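A minimal sketch of the placeholder rule described above, not the code from this commit. It assumes a 2-D integer tensor with one row per group and -1 as the sentinel; the helper name `per_group_psu_codes` is hypothetical, and taking the first non-sentinel cell as the group's code is an illustrative assumption.

```python
import numpy as np

SENTINEL = -1  # assumed marker for cells without positive weight


def per_group_psu_codes(psu_codes_per_cell: np.ndarray) -> np.ndarray:
    """Return one code per group row; all-sentinel rows get placeholder 0."""
    valid = psu_codes_per_cell != SENTINEL
    has_valid = valid.any(axis=1)
    # argmax gives the first True per row (0 when the row has no True at all).
    first_valid = valid.argmax(axis=1)
    rows = np.arange(psu_codes_per_cell.shape[0])
    codes = psu_codes_per_cell[rows, first_valid]
    # Rows with no valid cell carry no IF mass, so any code works; use 0.
    return np.where(has_valid, codes, 0)


cells = np.array([
    [0, 0, -1],     # group with PSU code 0
    [-1, -1, -1],   # all-sentinel row: no positive-weight cells
    [-1, 1, 1],     # group with PSU code 1
])
group_codes = per_group_psu_codes(cells)
```

The per-cell tensor itself is left untouched here, which mirrors the point that the cell-level path can mask the -1 cells on its own.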
**P0 #2 - dense PSU codes factorized over non-eligible subset.**
`np.unique(obs_psu_codes[pos_mask_boot])` previously included PSU
labels from groups that had been filtered out of _eligible_group_ids
(e.g., singleton-baseline-excluded groups). Those excluded groups'
PSUs consumed dense codes, leaving gaps in the eligible subset's map.
Downstream, `_generate_psu_or_group_weights` computes
`n_psu = max(code) + 1` and takes the identity fast path when
`n_psu >= n_groups_target`. A gapped map like `[1, 1]` or `[0, 2, 2]`
therefore silently gave independent draws to eligible groups that
should have shared a multiplier. Fix: restrict the np.unique
factorization to positive-weight obs in the eligible subset
(`elig_obs_mask = pos_mask_boot & (g_idx_arr >= 0) & (t_idx_arr >= 0)`),
so the dense code domain exactly matches the PSUs actually used by
variance-eligible groups.
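The gap problem can be reproduced with a small sketch. The array names follow the commit message; the helper `dense_codes`, the toy data, and the reading of a -1 index as "excluded" are assumptions made for illustration only.

```python
import numpy as np


def dense_codes(labels: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Factorize labels[mask] into 0..k-1; unmasked positions get -1."""
    out = np.full(labels.shape, -1, dtype=np.int64)
    _, inverse = np.unique(labels[mask], return_inverse=True)
    out[mask] = inverse.reshape(-1)
    return out


obs_psu_codes = np.array([10, 20, 20, 20])  # raw PSU label per observation
g_idx_arr = np.array([-1, 0, 1, 1])         # -1: group not eligible (assumed)
t_idx_arr = np.array([0, 0, 0, 1])
pos_mask_boot = np.ones(4, dtype=bool)      # all weights positive here

eligible = g_idx_arr >= 0
n_groups_target = 2                         # eligible groups 0 and 1

# Old behaviour: the excluded group's PSU 10 takes dense code 0.
old = dense_codes(obs_psu_codes, pos_mask_boot)
n_psu_old = old[eligible].max() + 1         # gapped map -> inflated count

# Fixed behaviour: factorize over the eligible subset only.
elig_obs_mask = pos_mask_boot & (g_idx_arr >= 0) & (t_idx_arr >= 0)
new = dense_codes(obs_psu_codes, elig_obs_mask)
n_psu_new = new[eligible].max() + 1

fast_path_old = n_psu_old >= n_groups_target
fast_path_new = n_psu_new >= n_groups_target
```

In this toy case both eligible groups share PSU 20, so only the restricted factorization keeps `n_psu` below `n_groups_target` and avoids the identity fast path.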
Tests:
- `test_bootstrap_zero_weight_group_equivalent_to_removing_it`:
fitting with and without an all-zero-weight eligible group must
produce byte-identical bootstrap SEs at the same seed. Before the
P0 #1 fix this failed because valid_map turned the PSU-aware path
off for the fit that included the zero-weight group.
- `test_bootstrap_dense_codes_under_singleton_baseline_excluded_group`:
spies on the group_id_to_psu_code dict passed to
`_compute_dcdh_bootstrap` under a fixture with an always-treated
singleton-baseline group and a strictly coarser PSU among eligible
groups. Asserts that the dict's values form a contiguous
`[0, n_unique-1]` range (no gaps from the excluded group's PSU) and
that eligible groups sharing a PSU label receive the same dense
code.
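The two properties the second test asserts could be checked along these lines. This is a hedged sketch rather than the test in the repository: the helper name `assert_dense_and_consistent` and both example dicts are hypothetical.

```python
def assert_dense_and_consistent(group_to_code: dict, group_to_label: dict) -> None:
    """Check contiguity of dense codes and label-to-code consistency."""
    codes = sorted(set(group_to_code.values()))
    # Contiguous range 0..n_unique-1, i.e. no gaps left by excluded PSUs.
    assert codes == list(range(len(codes))), f"gapped codes: {codes}"

    # Groups that share a raw PSU label must share a dense code.
    label_to_code = {}
    for group, code in group_to_code.items():
        label = group_to_label[group]
        assert label_to_code.setdefault(label, code) == code, (
            f"label {label!r} maps to more than one dense code"
        )


# Passes: groups "a" and "b" share PSU 20 and both get code 0.
assert_dense_and_consistent(
    {"a": 0, "b": 0, "c": 1},
    {"a": 20, "b": 20, "c": 30},
)
```

A gapped map such as `{"a": 1, "b": 1}` would fail the first assertion, which is the regression P0 #2 describes.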
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 77ce297 commit ee0cc54
2 files changed
Lines changed: 213 additions & 56 deletions