Commit def6503

igerber and claude committed
Round-4 CI: document terminal-missingness carve-out + end-to-end regression
Addresses the CI round-3 P1 (docs overclaiming support relative to the newly-added `_unroll_target_to_cells` guard) and P2 (no end-to-end fit() regression for terminal missingness + varying PSU + bootstrap).

**P1 (docs contract consistency):** The cell-level bootstrap now hard-fails on the terminal-missingness mass-leak regime, but the surrounding docs still advertised full support. Added a "Bootstrap + terminal-missingness scope note" to:

- `REGISTRY.md` "terminal missingness retained" Note: describes the cohort-recentering leakage mechanism and directs users to `n_bootstrap=0` for affected panels.
- `REGISTRY.md` Survey + bootstrap contract Note: same carve-out; also clarifies that PSU-within-group-constant regimes are unaffected (dispatcher routes to the legacy path).
- `CHANGELOG.md` (PR-4 entry): explicit scope note after the "closes the last NotImplementedError gate" claim.
- `fit()` docstring `survey_design` paragraph: scope note directs users to `n_bootstrap=0` as the documented workaround.

**P2 (end-to-end fit() regression):** Added `test_bootstrap_fit_raises_on_terminal_missingness_with_varying_psu` in TestBootstrapCellPeriod. Fixture: 10 groups with joiners cohort (at period 3), leavers cohort (at period 4), and never-treated controls; group 2 is terminally missing at periods 4-5. At period 4 the other joiners serve as stable_1 controls for the leavers, producing non-zero cohort mean in cohort A: `_cohort_recenter_per_period` leaks `-col_mean` onto group 2's missing cell. Varying PSU (period parity per group) routes the bootstrap to the cell-level path. Test asserts:

- `fit(..., n_bootstrap=50)` raises `ValueError` with the documented "no positive-weight observations" message.
- `fit(..., n_bootstrap=0)` succeeds on the same panel; analytical TSL supports this regime (the contract the scope note preserves).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 9ebb682 commit def6503
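The mass leak described in the commit message can be illustrated with a toy recentering step. This is a minimal sketch in plain Python and not the library's `_cohort_recenter_per_period`: the names `cohort_recenter` and `leaked_cells`, the list-of-lists layout, and all numbers are hypothetical, and it assumes a missing cell carries zero influence-function (IF) mass and zero weight before centering.

```python
def cohort_recenter(if_cells, cohort_of_group):
    """Subtract each cohort's per-period mean from every cell of that cohort.

    if_cells[g][t] is the IF mass of group g at period t (toy layout).
    """
    n_periods = len(if_cells[0])
    centered = [row[:] for row in if_cells]
    for cohort in set(cohort_of_group):
        members = [g for g, c in enumerate(cohort_of_group) if c == cohort]
        for t in range(n_periods):
            col_mean = sum(if_cells[g][t] for g in members) / len(members)
            for g in members:
                # Applied to every member, including cells with zero weight.
                centered[g][t] -= col_mean
    return centered


def leaked_cells(centered, weights, atol=1e-12):
    """Return (group, period) cells holding non-zero mass but no positive weight."""
    return [
        (g, t)
        for g, row in enumerate(centered)
        for t, value in enumerate(row)
        if abs(value) > atol and weights[g][t] <= 0
    ]


# One cohort of three groups over three periods; the group in row 1 is
# terminally missing in the last period (zero weight, zero mass).
if_cells = [[0.0, 0.0, 0.6],
            [0.0, 0.0, 0.0],
            [0.0, 0.0, 0.3]]
weights = [[1.0, 1.0, 1.0],
           [1.0, 1.0, 0.0],
           [1.0, 1.0, 1.0]]
cohort_of_group = [0, 0, 0]

centered = cohort_recenter(if_cells, cohort_of_group)
leaks = leaked_cells(centered, weights)
print(leaks)
```

In this toy panel the last-period cohort mean is non-zero, so the missing cell ends up holding minus that mean even though no observation backs it. A cell-level allocator has no PSU to assign that mass to, which is the regime the new guard rejects.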

4 files changed

Lines changed: 92 additions & 8 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Changed
- Add Zenodo DOI badge to README; upgrade the BibTeX citation block with the concept DOI (`10.5281/zenodo.19646175`) and list author as Isaac Gerber (matching `CITATION.cff`). Add `doi:` and `identifiers:` entries (concept + versioned) to `CITATION.cff`. DOI was minted by Zenodo when v3.1.3 was released.
- **`ChaisemartinDHaultfoeuille` heterogeneity + within-group-varying PSU/strata now supported under Binder TSL** - `fit(heterogeneity=..., survey_design=...)` no longer raises `NotImplementedError` when the resolved design's PSU or strata vary across the cells of a group. On the **Binder TSL** branch (`compute_survey_if_variance`), the heterogeneity WLS coefficient IF is expanded to observation level via the cell-period allocator `ψ_i = ψ_g * (w_i / W_{g, out_idx})` on the post-period cell — the DID_l post-period single-cell convention shipped in v3.1.x. Under PSU=group the PSU-level Binder TSL variance is byte-identical to the previous release (PSU-level aggregate telescopes to `ψ_g`); under within-group-varying PSU, mass lands in the post-period PSU of the transition. The **Rao-Wu replicate-weight** branch (`compute_replicate_if_variance`) retains the legacy group-level allocator `ψ_i = ψ_g * (w_i / W_g)`: replicate variance computes `θ_r = sum_i ratio_ir * ψ_i` at observation level and is therefore not PSU-telescoping, so the cell-period allocator would silently change the replicate SE whenever a replicate column's ratios vary within group (e.g., per-row replicate matrices). Replicate + heterogeneity fits therefore produce byte-identical SE to the previous release, and the newly-unblocked `heterogeneity=` + within-group-varying PSU combination is unreachable under replicate designs by construction (`SurveyDesign` rejects `replicate_weights` combined with explicit `strata/psu/fpc`).
-- **`ChaisemartinDHaultfoeuille.fit(survey_design=..., n_bootstrap > 0)` now supports within-group-varying PSU** — the PSU-level Hall-Mammen wild multiplier bootstrap has been extended from a group-level PSU map (one multiplier per group) to a cell-level PSU map (one multiplier per `(g, t)` cell's PSU). A dispatcher in `_compute_dcdh_bootstrap` detects PSU-within-group-constant regimes (including PSU=group auto-inject and strictly-coarser PSU with within-group constancy) and routes them through the legacy group-level path so the bootstrap SE is bit-identical to the previous release (guarded by the new `test_bootstrap_se_matches_pre_pr4_baseline` and the pre-existing `test_auto_inject_bit_identical_to_group_level`). Under within-group-varying PSU, a group contributing cells to multiple PSUs receives independent multiplier draws per PSU — the correct Hall-Mammen wild PSU clustering at cell granularity. Multi-horizon bootstraps draw a single shared `(n_bootstrap, n_psu)` PSU-level weight matrix per block and broadcast per-horizon via each horizon's cell-to-PSU map, so the sup-t simultaneous confidence band remains a valid joint distribution. Closes the last `NotImplementedError` gate in the dCDH survey contract; replicate-weight variance and `n_bootstrap > 0` remain mutually exclusive by construction.
+- **`ChaisemartinDHaultfoeuille.fit(survey_design=..., n_bootstrap > 0)` now supports within-group-varying PSU** — the PSU-level Hall-Mammen wild multiplier bootstrap has been extended from a group-level PSU map (one multiplier per group) to a cell-level PSU map (one multiplier per `(g, t)` cell's PSU). A dispatcher in `_compute_dcdh_bootstrap` detects PSU-within-group-constant regimes (including PSU=group auto-inject and strictly-coarser PSU with within-group constancy) and routes them through the legacy group-level path so the bootstrap SE is bit-identical to the previous release (guarded by the new `test_bootstrap_se_matches_pre_pr4_baseline` and the pre-existing `test_auto_inject_bit_identical_to_group_level`). Under within-group-varying PSU, a group contributing cells to multiple PSUs receives independent multiplier draws per PSU — the correct Hall-Mammen wild PSU clustering at cell granularity. Multi-horizon bootstraps draw a single shared `(n_bootstrap, n_psu)` PSU-level weight matrix per block and broadcast per-horizon via each horizon's cell-to-PSU map, so the sup-t simultaneous confidence band remains a valid joint distribution. Closes the last `NotImplementedError` gate in the dCDH survey contract; replicate-weight variance and `n_bootstrap > 0` remain mutually exclusive by construction. **Scope note:** when a panel has *terminal missingness* (groups observed only through an early period) combined with within-group-varying PSU, the cell-level bootstrap raises a targeted `ValueError` — cohort-recentering leaks centered IF mass onto cells with no positive-weight observations, which the cell-level bootstrap cannot allocate to any PSU. Use `n_bootstrap=0` (analytical TSL variance, which supports that regime) on such panels. PSU-within-group-constant regimes (including PSU=group auto-inject) are unaffected.

## [3.1.3] - 2026-04-18
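
The dispatcher and cell-level multiplier draw described in the changelog entry can be sketched as follows. This is a toy under stated assumptions, not the code in `_compute_dcdh_bootstrap`: the function names, the dict-based cell maps, and the unnormalized SE are hypothetical. Rademacher (plus or minus one) multipliers are swapped in for simplicity and may differ from the weights the library draws.

```python
import random


def psu_constant_within_group(cell_psu):
    """cell_psu maps (group, period) -> PSU label.

    Returns True when every group's cells share a single PSU.
    """
    first_psu = {}
    for (group, _period), psu in cell_psu.items():
        if first_psu.setdefault(group, psu) != psu:
            return False
    return True


def cell_level_bootstrap_se(cell_if, cell_psu, n_bootstrap, seed=0):
    """Toy wild PSU bootstrap: one multiplier per PSU, shared by its cells."""
    if n_bootstrap < 2:
        raise ValueError("need at least two bootstrap draws")
    # Guard: IF mass on a cell that belongs to no PSU cannot be allocated.
    for cell, value in cell_if.items():
        if value != 0.0 and cell not in cell_psu:
            raise ValueError(
                f"cell {cell} carries IF mass but has no positive-weight observations"
            )
    psus = sorted(set(cell_psu.values()))
    psu_total = {psu: 0.0 for psu in psus}
    for cell, value in cell_if.items():
        if cell in cell_psu:
            psu_total[cell_psu[cell]] += value
    rng = random.Random(seed)
    draws = []
    for _ in range(n_bootstrap):
        multiplier = {psu: rng.choice((-1.0, 1.0)) for psu in psus}
        draws.append(sum(multiplier[psu] * psu_total[psu] for psu in psus))
    mean = sum(draws) / n_bootstrap
    return (sum((d - mean) ** 2 for d in draws) / (n_bootstrap - 1)) ** 0.5


# Group 0 contributes cells to PSUs "a" and "b"; group 1 stays in "a".
cell_psu = {(0, 1): "a", (0, 2): "b", (1, 1): "a", (1, 2): "a"}
cell_if = {(0, 1): 0.4, (0, 2): -0.1, (1, 1): 0.2, (1, 2): -0.5}

path = "group-level" if psu_constant_within_group(cell_psu) else "cell-level"
se = cell_level_bootstrap_se(cell_if, cell_psu, n_bootstrap=200, seed=1)
print(path, se)
```

Because group 0 spans two PSUs, its two cells receive independent multipliers, which is the behaviour the entry describes for within-group-varying PSU. When every group maps to one PSU, the check returns True and a real dispatcher would route to the group-level path instead.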

diff_diff/chaisemartin_dhaultfoeuille.py

Lines changed: 12 additions & 5 deletions
@@ -666,11 +666,18 @@ def fit(
bootstrap uses a cell-level wild PSU allocator — a group
contributing cells to multiple PSUs receives independent
multiplier draws per PSU (see the Survey + bootstrap
-contract Note in REGISTRY.md). **Replicate weights with
-``n_bootstrap > 0`` raises ``NotImplementedError``**
-(replicate variance is closed-form; bootstrap would
-double-count variance). See REGISTRY.md
-``ChaisemartinDHaultfoeuille`` Notes for the full contract.
+contract Note in REGISTRY.md). **Scope note (terminal
+missingness):** on panels with terminally-missing groups
+(early exit / right-censoring) combined with within-group-
+varying PSU, the cell-level bootstrap raises
+``ValueError`` because cohort-recentering leaks centered
+IF mass onto cells with no positive-weight obs. Use
+``n_bootstrap=0`` for analytical TSL variance on those
+panels. **Replicate weights with ``n_bootstrap > 0``
+raises ``NotImplementedError``** (replicate variance is
+closed-form; bootstrap would double-count variance). See
+REGISTRY.md ``ChaisemartinDHaultfoeuille`` Notes for the
+full contract.

Returns
-------
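
The documented workaround (retry with `n_bootstrap=0` when the bootstrap rejects a terminally-missing panel) could be wrapped as below. This is a hedged sketch: `fit_with_bootstrap_fallback` and `fake_fit` are hypothetical names, `fake_fit` is a stand-in rather than the real estimator, and matching on the "no positive-weight observations" substring assumes the error message keeps that wording.

```python
def fit_with_bootstrap_fallback(fit_fn, *, n_bootstrap, **fit_kwargs):
    """Try a bootstrap fit; fall back to analytical variance on the targeted error.

    Returns (result, n_bootstrap_actually_used).
    """
    try:
        return fit_fn(n_bootstrap=n_bootstrap, **fit_kwargs), n_bootstrap
    except ValueError as exc:
        if "no positive-weight observations" not in str(exc):
            raise  # unrelated validation error: do not mask it
        return fit_fn(n_bootstrap=0, **fit_kwargs), 0


def fake_fit(*, n_bootstrap, has_terminal_missingness):
    """Stand-in estimator that mimics the documented contract."""
    if n_bootstrap > 0 and has_terminal_missingness:
        raise ValueError("cell has no positive-weight observations")
    return {"variance": "bootstrap" if n_bootstrap > 0 else "analytical TSL"}


result, used = fit_with_bootstrap_fallback(
    fake_fit, n_bootstrap=50, has_terminal_missingness=True
)
print(result, used)
```

Returning the number of draws actually used keeps the fallback visible to the caller. Substring matching is brittle, so a real wrapper might prefer to log a warning or check the panel for terminal missingness up front.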
