igerber
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 1 addition & 1 deletion
diff --git a/diff_diff/chaisemartin_dhaultfoeuille.py b/diff_diff/chaisemartin_dhaultfoeuille.py
Lines changed: 43 additions & 7 deletions
@@ -10,7 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Changed
 - Add Zenodo DOI badge to README; upgrade the BibTeX citation block with the concept DOI (`10.5281/zenodo.19646175`) and list author as Isaac Gerber (matching `CITATION.cff`). Add `doi:` and `identifiers:` entries (concept + versioned) to `CITATION.cff`. DOI was minted by Zenodo when v3.1.3 was released.
 - **`ChaisemartinDHaultfoeuille` heterogeneity + within-group-varying PSU/strata now supported under Binder TSL** - `fit(heterogeneity=..., survey_design=...)` no longer raises `NotImplementedError` when the resolved design's PSU or strata vary across the cells of a group. On the **Binder TSL** branch (`compute_survey_if_variance`), the heterogeneity WLS coefficient IF is expanded to observation level via the cell-period allocator `ψ_i = ψ_g * (w_i / W_{g, out_idx})` on the post-period cell — the DID_l post-period single-cell convention shipped in v3.1.x. Under PSU=group the PSU-level Binder TSL variance is byte-identical to the previous release (PSU-level aggregate telescopes to `ψ_g`); under within-group-varying PSU, mass lands in the post-period PSU of the transition. The **Rao-Wu replicate-weight** branch (`compute_replicate_if_variance`) retains the legacy group-level allocator `ψ_i = ψ_g * (w_i / W_g)`: replicate variance computes `θ_r = sum_i ratio_ir * ψ_i` at observation level and is therefore not PSU-telescoping, so the cell-period allocator would silently change the replicate SE whenever a replicate column's ratios vary within group (e.g., per-row replicate matrices). Replicate + heterogeneity fits therefore produce byte-identical SE to the previous release, and the newly-unblocked `heterogeneity=` + within-group-varying PSU combination is unreachable under replicate designs by construction (`SurveyDesign` rejects `replicate_weights` combined with explicit `strata/psu/fpc`).
-- **`ChaisemartinDHaultfoeuille.fit(survey_design=..., n_bootstrap > 0)` now supports within-group-varying PSU** — the PSU-level Hall-Mammen wild multiplier bootstrap has been extended from a group-level PSU map (one multiplier per group) to a cell-level PSU map (one multiplier per `(g, t)` cell's PSU). A dispatcher in `_compute_dcdh_bootstrap` detects PSU-within-group-constant regimes (including PSU=group auto-inject and strictly-coarser PSU with within-group constancy) and routes them through the legacy group-level path so the bootstrap SE is bit-identical to the previous release (guarded by the new `test_bootstrap_se_matches_pre_pr4_baseline` and the pre-existing `test_auto_inject_bit_identical_to_group_level`). Under within-group-varying PSU, a group contributing cells to multiple PSUs receives independent multiplier draws per PSU — the correct Hall-Mammen wild PSU clustering at cell granularity. Multi-horizon bootstraps draw a single shared `(n_bootstrap, n_psu)` PSU-level weight matrix per block and broadcast per-horizon via each horizon's cell-to-PSU map, so the sup-t simultaneous confidence band remains a valid joint distribution. Closes the last `NotImplementedError` gate in the dCDH survey contract; replicate-weight variance and `n_bootstrap > 0` remain mutually exclusive by construction. **Scope note:** when a panel has *terminal missingness* (groups observed only through an early period) combined with within-group-varying PSU, the cell-level bootstrap raises a targeted `ValueError` — cohort-recentering leaks centered IF mass onto cells with no positive-weight observations, which the cell-level bootstrap cannot allocate to any PSU. Use `n_bootstrap=0` (analytical TSL variance, which supports that regime) on such panels. PSU-within-group-constant regimes (including PSU=group auto-inject) are unaffected.
+- **`ChaisemartinDHaultfoeuille.fit(survey_design=..., n_bootstrap > 0)` now supports within-group-varying PSU** — the PSU-level Hall-Mammen wild multiplier bootstrap has been extended from a group-level PSU map (one multiplier per group) to a cell-level PSU map (one multiplier per `(g, t)` cell's PSU). A dispatcher in `_compute_dcdh_bootstrap` detects PSU-within-group-constant regimes (including PSU=group auto-inject and strictly-coarser PSU with within-group constancy) and routes them through the legacy group-level path so the bootstrap SE is bit-identical to the previous release (guarded by the new `test_bootstrap_se_matches_pre_pr4_baseline` and the pre-existing `test_auto_inject_bit_identical_to_group_level`). Under within-group-varying PSU, a group contributing cells to multiple PSUs receives independent multiplier draws per PSU — the correct Hall-Mammen wild PSU clustering at cell granularity. Multi-horizon bootstraps draw a single shared `(n_bootstrap, n_psu)` PSU-level weight matrix per block and broadcast per-horizon via each horizon's cell-to-PSU map, so the sup-t simultaneous confidence band remains a valid joint distribution. Closes the last `NotImplementedError` gate in the dCDH survey contract; replicate-weight variance and `n_bootstrap > 0` remain mutually exclusive by construction. **Scope note:** under survey designs with within-group-varying PSU, panels with *terminal missingness* (groups observed only through an early period) where the terminally-missing group is in a cohort whose other groups still contribute at the missing period now raise a targeted `ValueError` on **both** the cell-level bootstrap and the analytical TSL path. Cohort-recentering leaks centered IF mass onto cells with no positive-weight observations, and both paths share the cell-period allocator that cannot allocate that mass. The analytical guard is new in this release and closes a silent mass-drop bug introduced by the cell-period allocator in v3.1.x; pre-processing the panel (drop late-exit groups or trim to a balanced sub-panel) or using an explicit `psu=<group_col>` so the dispatcher routes through the legacy group-level path is the documented workaround. PSU-within-group-constant regimes (including PSU=group auto-inject) are unaffected.
 
 ## [3.1.3] - 2026-04-18
 
 
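The two allocators named in the changelog entries above, `ψ_i = ψ_g * (w_i / W_g)` (legacy group-level) and `ψ_i = ψ_g * (w_i / W_{g, out_idx})` (cell-period), can be compared on toy data. The sketch below is illustrative only: the function names, the weights, the cell labels, and the PSU codes are all hypothetical and are not taken from the library. It shows that both allocators sum to `ψ_g` over the group, which is why the PSU-level aggregate is unchanged when PSU=group, and that they diverge only once PSU varies within the group.

```python
import numpy as np


def group_level_allocator(psi_g, w):
    """Legacy allocator: spread psi_g over all of the group's observations."""
    return psi_g * w / w.sum()


def cell_period_allocator(psi_g, w, cell, out_idx):
    """Cell-period allocator: put all of psi_g on the post-period cell."""
    in_cell = cell == out_idx
    psi_i = np.zeros_like(w, dtype=float)
    psi_i[in_cell] = psi_g * w[in_cell] / w[in_cell].sum()
    return psi_i


# Hypothetical toy group: four observations, two per (g, t) cell.
psi_g = 0.8
w = np.array([1.0, 2.0, 3.0, 4.0])
cell = np.array([0, 0, 1, 1])  # 1 plays the role of out_idx (post-period)

legacy = group_level_allocator(psi_g, w)
cellwise = cell_period_allocator(psi_g, w, cell, out_idx=1)

# PSU = group: a single PSU, so both PSU totals telescope to psi_g.
psu_const = np.zeros(4, dtype=int)
legacy_const = np.bincount(psu_const, weights=legacy)
cellwise_const = np.bincount(psu_const, weights=cellwise)

# Within-group-varying PSU: each cell sits in its own PSU.
psu_vary = np.array([0, 0, 1, 1])
legacy_vary = np.bincount(psu_vary, weights=legacy)      # mass split across PSUs
cellwise_vary = np.bincount(psu_vary, weights=cellwise)  # mass in post-period PSU
```

In this toy case the legacy allocator splits the mass across both PSUs in proportion to their weight shares, while the cell-period allocator places all of it in the PSU of the post-period cell.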
@@ -667,13 +667,18 @@ def fit(
             contributing cells to multiple PSUs receives independent
             multiplier draws per PSU (see the Survey + bootstrap
             contract Note in REGISTRY.md). **Scope note (terminal
-            missingness):** on panels with terminally-missing groups
-            (early exit / right-censoring) combined with within-group-
-            varying PSU, the cell-level bootstrap raises
-            ``ValueError`` because cohort-recentering leaks centered
-            IF mass onto cells with no positive-weight obs. Use
-            ``n_bootstrap=0`` for analytical TSL variance on those
-            panels. **Replicate weights with ``n_bootstrap > 0``
+            missingness + within-group-varying PSU):** on panels
+            where a terminally-missing group is in a cohort whose
+            other groups still contribute at the missing period,
+            **both** the cell-level bootstrap and the analytical TSL
+            path raise a targeted ``ValueError``. Cohort-recentering
+            leaks centered IF mass onto cells with no positive-
+            weight obs, which the cell-period allocator cannot
+            allocate to any observation or PSU. Pre-process the
+            panel (drop late-exit groups or trim to a balanced
+            sub-panel), or use an explicit ``psu=<group_col>`` so
+            the dispatcher routes through the legacy group-level
+            path. **Replicate weights with ``n_bootstrap > 0``
             raises ``NotImplementedError``** (replicate variance is
             closed-form; bootstrap would double-count variance). See
             REGISTRY.md ``ChaisemartinDHaultfoeuille`` Notes for the
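The terminal-missingness scope note describes how cohort recentering can place centered IF mass on a cell that has no observations. A minimal sketch of that mechanism follows, under stated assumptions: `cohort_recenter` is a simplified stand-in written for this example (the real `_cohort_recenter_per_period` is not shown in this diff), `leaked_mass` is a hypothetical helper, and the matrices are made-up toy values. Only the `W_cell == 0` mask and the `1e-12` tolerance are taken from the guard in this change.

```python
import numpy as np


def cohort_recenter(U, cohort):
    """Simplified stand-in: subtract each cohort's per-period column mean."""
    U_c = U.astype(float).copy()
    for c in np.unique(cohort):
        rows = cohort == c
        U_c[rows] -= U_c[rows].mean(axis=0, keepdims=True)
    return U_c


def leaked_mass(U_centered, W_cell, tol=1e-12):
    """Return centered mass sitting on cells with zero total weight."""
    leaked = U_centered[W_cell == 0]
    return leaked[np.abs(leaked) > tol]


# Toy panel: two groups in one cohort, three periods.
# Group 0 exits after period 1, so its period-2 cell has no observations.
U = np.array([[1.0, 2.0, 0.0],
              [3.0, 4.0, 6.0]])
cohort = np.array([0, 0])
W_cell = np.array([[5.0, 5.0, 0.0],
                   [5.0, 5.0, 5.0]])

U_c = cohort_recenter(U, cohort)
leak = leaked_mass(U_c, W_cell)

# Balanced counterpart: every cell has positive weight, so nothing leaks.
no_leak = leaked_mass(U_c, np.full_like(W_cell, 5.0))
```

Here the period-2 column mean is 3.0, so the empty cell of group 0 ends up with centered mass of -3.0 even though its raw value was zero. A cell-level expansion of the form `U[g, t] * (w_i / W_{g, t})` has no observation to carry that mass, which is the situation the `ValueError` is meant to flag.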
@@ -5885,6 +5890,37 @@ def _survey_se_from_group_if(
             (elig_idx_eff[valid_cell], col_idx_eff[valid_cell]),
             w_eff[valid_cell],
         )
+        # Sentinel-mass guard (mirror of `_unroll_target_to_cells` on
+        # the bootstrap path). Under terminal missingness,
+        # `_cohort_recenter_per_period` subtracts cohort column means
+        # across the full period grid, so a group with no observation
+        # at period t can acquire non-zero centered mass at that cell.
+        # The cell-level expansion `psi_i = U[g,t] * (w_i / W_{g,t})`
+        # has no observation to attach that mass to (W_{g,t} = 0), so
+        # silently dropping it would understate the SE. Raise a
+        # targeted ValueError instead (consistent with the cell-level
+        # bootstrap's `_unroll_target_to_cells` guard).
+        missing_cell_mask = W_cell == 0
+        if missing_cell_mask.any():
+            leaked = U_centered_per_period[missing_cell_mask]
+            if leaked.size > 0 and bool(
+                np.any(np.abs(leaked) > 1e-12)
+            ):
+                raise ValueError(
+                    "Analytical survey SE cannot be computed on this "
+                    "panel: cohort-recentered IF mass landed on (g, t) "
+                    "cells with no positive-weight observations "
+                    "(W_{g, t} = 0). This typically occurs when "
+                    "terminal missingness combines with within-group-"
+                    "varying PSU: _cohort_recenter_per_period subtracts "
+                    "column means across the full period grid, so a "
+                    "group with no observation at period t acquires "
+                    "non-zero centered mass there, which the cell-level "
+                    "analytical expansion cannot allocate to any "
+                    "observation. Pre-process the panel to remove "
+                    "terminal missingness (drop late-exit groups or "
+                    "trim to a balanced sub-panel) before fitting."
+                )
         # Lookup U_centered_per_period and W_cell per row.
         u_obs_cell = np.zeros(w_eff.shape[0], dtype=np.float64)
         u_obs_cell[valid_cell] = U_centered_per_period[