Commit ffd2e50

igerber and claude committed
Address PR #365 R5 P1 + P3: zero-variance vs NaN; lonely_psu contract; REGISTRY docs
P1 (Methodology — zero computed variance conflated with undefined):

``_jackknife_se_survey`` previously collapsed ``total_variance <= 0.0`` into ``SE=NaN`` with an "every stratum was skipped" warning. That is correct for the "no stratum contributed" branch (undefined per Rust & Rao) but wrong for legitimate zero-variance outcomes: full-census FPC (``fpc[h] == n_h`` → ``f_h = 1`` → ``(1 - f_h) = 0`` zeros every stratum contribution even when within-stratum dispersion is non-zero) and exact-zero within-stratum dispersion both give ``total_variance = 0`` by construction, not by "undefined".

Fix: split the terminal branch. Return ``SE=NaN`` only when no stratum contributed; otherwise return ``SE = sqrt(max(total_variance, 0.0))``. The ``max(..., 0.0)`` protects against sub-FP-epsilon negatives and preserves the legitimate zero case at bit precision.

New regression ``test_jackknife_full_design_full_census_fpc_returns_zero_se``: fits on ``sdid_survey_data_jk_well_formed`` with ``fpc=3`` (n_h=3 per stratum → f_h=1 → zero SE by design). Asserts ``result.se == 0.0`` (not NaN).

P1 (Methodology — lonely_psu silently ignored on jackknife path):

The full-design jackknife always skipped singleton strata (``n_h < 2``) unconditionally, regardless of the user's ``SurveyDesign(lonely_psu=...)`` choice. ``"certainty"`` and ``"adjust"`` were silently degraded to ``"remove"``, which understates SE when the user intended ``"certainty"`` (equivalent to skip on jackknife) or flips what should be a zero-variance certainty case into NaN otherwise.

Fix: validate ``resolved_survey_unit.lonely_psu`` at fit-time on the survey jackknife path. ``"remove"`` and ``"certainty"`` are both accepted (they produce the same SE on this path — singleton strata contribute 0 variance under both, matching canonical Rust & Rao / ``survey::svyjkn`` behavior for JKn). ``"adjust"`` (R's overall-mean fallback for singleton strata) is rejected with ``NotImplementedError`` and a targeted message pointing to bootstrap as the unconstrained alternative.

Two regressions:
* ``test_jackknife_full_design_lonely_psu_adjust_raises`` — verifies the rejection message.
* ``test_jackknife_full_design_lonely_psu_certainty_equivalent_to_remove`` — asserts ``SE_remove == SE_certainty`` at ``rel=1e-14`` on the well-formed fixture.

P3 (Documentation — REGISTRY lag):
* Placebo feasibility Notes documented Cases B and C but missed Case D (the exact-count degeneracy guard added in R4). Split the "Fit-time feasibility guards" paragraph into an explicit 3-case enumeration (B: zero-control-stratum; C: undersupplied stratum; D: all-exact-count strata → single allocation).
* ``get_loo_effects_df()`` description still said "Requires variance_method='jackknife'; raises ValueError otherwise." after R2 taught it to also raise ``NotImplementedError`` on PSU-level survey jackknife. Rewrote to distinguish unit-level (available) vs PSU-level (blocked, with pointer to ``result.placebo_effects``).
* Added a Zero-variance-vs-undefined distinction paragraph and a "lonely_psu contract" paragraph to the jackknife survey Note, matching the shipped behavior from the two P1 fixes above.

Verification: 93 passed (3 new regressions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent f039e2f commit ffd2e50

3 files changed

Lines changed: 158 additions & 5 deletions

diff_diff/synthetic_did.py

Lines changed: 35 additions & 2 deletions
@@ -935,6 +935,31 @@ def fit( # type: ignore[override]
             if _jackknife_use_survey_path:
                 # PSU-level LOO + stratum aggregation (Rust & Rao 1996).
                 assert w_control is not None and w_treated is not None
+                # R5 P1 fix: validate ``lonely_psu`` mode. The survey
+                # jackknife currently skips singleton strata (n_h < 2)
+                # unconditionally — equivalent to R ``survey::svyjkn``'s
+                # ``"remove"`` and ``"certainty"`` modes (both zero-
+                # contribution for singleton strata). ``"adjust"`` (use
+                # overall mean for singleton strata) is not implemented
+                # for SDID jackknife; reject upfront rather than silently
+                # treating it as ``"remove"``.
+                _lonely_psu_mode = getattr(
+                    resolved_survey_unit, "lonely_psu", "remove"
+                )
+                if _lonely_psu_mode not in ("remove", "certainty"):
+                    raise NotImplementedError(
+                        f"SurveyDesign(lonely_psu={_lonely_psu_mode!r}) is "
+                        "not supported on the SDID jackknife survey path. "
+                        "'remove' and 'certainty' are equivalent here "
+                        "(both contribute 0 variance for singleton strata, "
+                        "which is the canonical Rust & Rao 1996 behavior). "
+                        "'adjust' requires an overall-mean fallback per "
+                        "stratum that is not yet implemented for SDID "
+                        "jackknife; use variance_method='bootstrap' (which "
+                        "supports all three ``lonely_psu`` modes via the "
+                        "weighted-FW + Rao-Wu path) or switch the design "
+                        "to lonely_psu='remove'."
+                    )
                 # Unstratified designs use the synthesized single stratum
                 # (``_strata_*_eff``) so the loop reduces to classical
                 # JK1 (single-stratum PSU-LOO).
@@ -2431,7 +2456,7 @@ def _jackknife_se_survey(
                 stacklevel=3,
             )
             return np.nan, tau_loo_arr
-        if not any_stratum_contributed or total_variance <= 0.0:
+        if not any_stratum_contributed:
             warnings.warn(
                 "Jackknife survey SE is undefined because every stratum "
                 "was skipped (insufficient PSUs per stratum for variance "
@@ -2443,7 +2468,15 @@ def _jackknife_se_survey(
             )
             return np.nan, tau_loo_arr

-        return float(np.sqrt(total_variance)), tau_loo_arr
+        # R5 P1 fix: legitimate zero variance (e.g., full-census FPC with
+        # f_h = 1 for every contributing stratum → (1 - f_h) = 0 factor
+        # zeros the contribution even when within-stratum dispersion is
+        # non-zero; or exact-zero within-stratum dispersion when all
+        # LOOs produce identical τ̂). Rust & Rao gives V_J = 0, not
+        # undefined. Reserve NaN for the "all strata skipped" /
+        # undefined-replicate cases above; compute SE = 0 otherwise.
+        variance_nonneg = max(total_variance, 0.0)
+        return float(np.sqrt(variance_nonneg)), tau_loo_arr

     def get_params(self) -> Dict[str, Any]:
         """Get estimator parameters."""

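The zero-versus-undefined split in the hunk above can be illustrated with a small standalone sketch. This is not the code from `_jackknife_se_survey`: the function name, the dict-based inputs, and the choice to center each stratum's replicates on their stratum mean are assumptions made for illustration. Only the `(1 - f_h)` and `(n_h - 1)/n_h` factors, the singleton skip, and the terminal branch follow what this commit describes.

```python
import math
from typing import Dict, List, Optional


def stratified_jackknife_se(
    tau_loo_by_stratum: Dict[str, List[float]],
    fpc_by_stratum: Optional[Dict[str, float]] = None,
) -> float:
    """Hypothetical helper: stratified delete-one jackknife SE.

    tau_loo_by_stratum maps a stratum label to its PSU-level
    leave-one-out estimates; fpc_by_stratum maps a stratum label to
    its population PSU count, so f_h = n_h / fpc[h].
    """
    total_variance = 0.0
    any_stratum_contributed = False
    for h, tau_loo in tau_loo_by_stratum.items():
        n_h = len(tau_loo)
        if n_h < 2:
            continue  # singleton stratum: contributes nothing
        any_stratum_contributed = True
        f_h = 0.0 if fpc_by_stratum is None else n_h / fpc_by_stratum[h]
        # Assumption: replicates are centered on the stratum mean.
        mean_h = sum(tau_loo) / n_h
        ss_h = sum((t - mean_h) ** 2 for t in tau_loo)
        total_variance += (1.0 - f_h) * (n_h - 1) / n_h * ss_h
    if not any_stratum_contributed:
        return math.nan  # undefined: no stratum could contribute
    # Clamp tiny negative round-off; a true zero stays exactly zero.
    return math.sqrt(max(total_variance, 0.0))
```

With three PSUs per stratum and `fpc = 3`, every `(1 - f_h)` factor is exactly zero, so the sketch returns `0.0` rather than NaN. NaN appears only when every stratum is a singleton.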
docs/methodology/REGISTRY.md

Lines changed: 12 additions & 3 deletions
@@ -1585,11 +1585,16 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
 3. Weighted Frank-Wolfe re-estimates ω and λ on the pseudo-panel using `compute_sdid_unit_weights_survey(rw_control=w_control[pseudo_control_idx], ...)` and `compute_time_weights_survey(...)`. Post-optimization composition `ω_eff = rw·ω/Σ(rw·ω)` with zero-mass retry.
 4. SDID estimator on the pseudo-panel; Algorithm 4 SE `sqrt((r-1)/r)·std(placebo_estimates, ddof=1)`.

-**Fit-time feasibility guards** (per `feedback_front_door_over_retry_swallow.md`): for each stratum `h` containing treated units, require `n_controls_h >= n_treated_h`. Case B (`n_controls_h == 0`) and Case C (`0 < n_controls_h < n_treated_h`) both raise `ValueError` with distinct targeted messages *before* entering the retry loop. Partial-permutation fallback is rejected — it would silently change the null distribution and produce an incoherent test.
+**Fit-time feasibility guards** (per `feedback_front_door_over_retry_swallow.md`): three distinct failure cases are rejected *before* entering the retry loop, each with a targeted `ValueError`:
+* **Case B** (`n_controls_h == 0` for some treated-containing stratum): the stratum has treated units but no controls — no pseudo-treated set can be drawn.
+* **Case C** (`0 < n_controls_h < n_treated_h`): the stratum has fewer controls than treated units, so exact-count without-replacement sampling is impossible.
+* **Case D** (`n_controls_h == n_treated_h` for *every* treated stratum): the permutation support is `∏_h C(n_c_h, n_t_h) = 1` — only one allocation is possible, every placebo draw reproduces the same pseudo-treated set, and the null distribution collapses to a single point (SE = FP noise ~1e-16). At least one treated stratum must satisfy `n_c_h > n_t_h` for the test to have ≥2 distinct allocations.
+
+Partial-permutation fallback is rejected for all three cases — it would silently change the null distribution and produce an incoherent test.

 **Scope note — what is NOT randomized:** the stratum marginal is preserved exactly by construction (each draw pulls the same count per treated stratum). The PSU axis is not randomized (permutation is unit-level within strata). This is conservative under clustering (ignores within-stratum PSU correlation in the null) but aligns with the classical stratified permutation test literature. See Pesarin (2001) *Multivariate Permutation Tests*, Ch. 3-4; Pesarin & Salmaso (2010) *Permutation Tests for Complex Data*.

-**Validation:** no external R/Julia parity anchor (neither package defines survey-weighted SDID placebo). Correctness rests on: (a) stratum-membership contract enforced by construction + monkeypatch regression test, (b) Case B/C front-door guards with targeted-message regression tests, (c) SE-differs-from-pweight-only cross-surface sanity, (d) deterministic-dispatch regression.
+**Validation:** no external R/Julia parity anchor (neither package defines survey-weighted SDID placebo). Correctness rests on: (a) stratum-membership contract enforced by construction + monkeypatch regression test, (b) Case B / Case C / Case D front-door guards with targeted-message regression tests, (c) SE-differs-from-pweight-only cross-surface sanity, (d) deterministic-dispatch regression.

 - **Note (survey + jackknife composition):** PSU-level leave-one-out with stratum aggregation (Rust & Rao 1996). For a design with strata `h = 1..H` and PSUs `j = 1..n_h` within each stratum:

@@ -1603,6 +1608,10 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi

 **Undefined-replicate handling** (return NaN, do NOT silently skip): the Rust & Rao formula requires `τ̂_{(h,j)}` be defined for every PSU `j` in every contributing stratum. If any single LOO in a contributing stratum (`n_h ≥ 2`) is not computable — (a) deletion removes all treated units (e.g., all treated in one PSU), (b) `ω_eff_kept.sum() ≤ 0` after composition, (c) `w_treated_kept.sum() ≤ 0`, (d) the SDID estimator raises or returns non-finite τ̂ — the overall SE is **undefined** and the method returns `SE=NaN` with a targeted `UserWarning` naming the stratum / PSU / reason. Silently skipping the missing LOO while still applying the `(n_h-1)/n_h` factor would systematically under-scale variance (silently wrong SE). Users needing a variance estimator that accommodates PSU-deletion infeasibility should use `variance_method="bootstrap"`, whose pairs-bootstrap has no per-LOO feasibility constraint.

+**Zero-variance vs undefined distinction:** when every stratum contributes but `total_variance == 0.0` by legitimate design — full-census FPC (`f_h = 1` → `(1 - f_h) = 0` zeros the contribution even when within-stratum dispersion is non-zero) or exact-zero within-stratum dispersion — the jackknife SE is **zero**, not undefined. `_jackknife_se_survey` returns `SE = 0.0` in that case. `SE = NaN` is reserved for the truly-undefined cases documented above (all strata skipped; any undefined delete-one replicate).
+
+**`lonely_psu` contract:** `SurveyDesign(lonely_psu="remove")` (default) and `"certainty"` are both accepted — each treats singleton strata (`n_h < 2`) as contributing 0 to the total variance, matching the canonical Rust & Rao (1996) / R `survey::svyjkn` behavior for single-PSU strata. `lonely_psu="adjust"` (R's overall-mean fallback) is **not yet supported** on the SDID jackknife path and raises `NotImplementedError` at fit-time; users needing that semantic should pick `variance_method="bootstrap"` (which supports all three modes via the weighted-FW + Rao-Wu path) or switch the design to `"remove"` / `"certainty"`.
+
 **Stratum-skip handling** (silent, documented): strata with `n_h < 2` are silently skipped (stratum-level variance unidentified — the `lonely-PSU` case in R `survey::svyjkn`). If every stratum is skipped, returns `SE=NaN` with a separate `UserWarning`. PSU-None designs: each unit is treated as its own PSU within its stratum (matches the implicit-PSU convention established in PR #355 R8 P1). Unstratified single-PSU short-circuits to `SE=NaN`.

 **Scope note — what is NOT randomized:** stratum membership and PSU composition are fixed by design. The formula only captures within-stratum variation; between-stratum variance is absorbed into the analytical-TSL / design assumption. This is canonical survey-jackknife behavior (Rust & Rao 1996) and matches R's `survey::svyjkn` under stratified designs.
@@ -1644,7 +1653,7 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
 *Validation diagnostics (post-fit methods on `SyntheticDiDResults`):*

 - **Trajectories** (`synthetic_pre_trajectory`, `synthetic_post_trajectory`, `treated_pre_trajectory`, `treated_post_trajectory`): retained on results to support plotting and custom fit metrics. `synthetic_pre_trajectory = Y_pre_control @ ω_eff`; `treated_pre_trajectory` is the survey-weighted treated mean (matches the Frank-Wolfe target). `pre_treatment_fit` is recoverable as `RMSE(treated_pre_trajectory, synthetic_pre_trajectory)`.
-- **`get_loo_effects_df()`**: user-facing join of the jackknife leave-one-out pseudo-values (stored in `placebo_effects`) to the underlying unit identities. First `n_control` positions map to `control_unit_ids`, next `n_treated` to `treated_unit_ids` — positional ordering that mirrors `_jackknife_se`. `att_loo` is NaN when the zero-sum composed-weight guard fired for that unit; `delta_from_full = att_loo - att`. Requires `variance_method='jackknife'`; raises `ValueError` otherwise.
+- **`get_loo_effects_df()`**: user-facing join of the jackknife leave-one-out pseudo-values (stored in `placebo_effects`) to the underlying unit identities. **Unit-level LOO only** — available on the non-survey and pweight-only jackknife paths (classical Algorithm 3: one LOO per unit, first `n_control` positions map to `control_unit_ids`, next `n_treated` to `treated_unit_ids`; `att_loo` is NaN when the zero-sum composed-weight guard fired for that unit; `delta_from_full = att_loo - att`). Under the full-design survey jackknife path (PSU-level LOO with stratum aggregation, Rust & Rao 1996), the underlying replicates are PSU-level rather than unit-level — the accessor raises `NotImplementedError` pointing to `result.placebo_effects` for the raw PSU-level replicate array. Dispatch is gated by an explicit `_loo_granularity` flag set at fit-time (`"unit"` vs `"psu"`). Requires `variance_method='jackknife'`; raises `ValueError` otherwise.
 - **`get_weight_concentration(top_k=5)`**: returns `effective_n = 1/Σω²` (inverse Herfindahl), `herfindahl`, `top_k_share`, `top_k`. Operates on `self.unit_weights` which stores the composed `ω_eff`; for survey-weighted fits the metrics reflect the population-weighted concentration, not the raw Frank-Wolfe solution.
 - **`in_time_placebo(fake_treatment_periods=None, zeta_omega_override=None, zeta_lambda_override=None)`**: re-slices the pre-window at each fake treatment period and re-fits both ω and λ via Frank-Wolfe. Default sweeps every feasible pre-period (position index `i ≥ 2` so ≥2 pre-fake periods remain for weight estimation, `i ≤ n_pre - 1` so ≥1 post-fake period exists). Credible designs produce near-zero placebo ATTs; departures indicate pre-treatment dynamics the estimator is picking up.
 - **Note:** Regularization reuses `self.zeta_omega` / `self.zeta_lambda` from the original fit (matches R `synthdid` convention of treating regularization as a property of the fit). `*_override` re-fits with new values.
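
The three placebo feasibility cases documented in the REGISTRY hunk above reduce to a count check per treated stratum plus the size of the permutation support. The sketch below is a hedged illustration only: `classify_placebo_feasibility` and `permutation_support` are hypothetical names, the input is a plain dict of `(n_controls_h, n_treated_h)` pairs, and it returns a label where the library raises a targeted `ValueError`.

```python
from math import comb
from typing import Dict, Tuple

Counts = Dict[str, Tuple[int, int]]  # stratum -> (n_controls_h, n_treated_h)


def permutation_support(counts: Counts) -> int:
    """Number of distinct allocations: product over strata of C(n_c_h, n_t_h)."""
    support = 1
    for n_c, n_t in counts.values():
        support *= comb(n_c, n_t)
    return support


def classify_placebo_feasibility(counts: Counts) -> str:
    """Hypothetical classifier for treated-containing strata (n_t >= 1)."""
    for n_c, n_t in counts.values():
        if n_c == 0:
            return "B"  # treated units but no controls in the stratum
        if n_c < n_t:
            return "C"  # too few controls for exact-count sampling
    if permutation_support(counts) == 1:
        return "D"  # every stratum has n_c == n_t: a single allocation
    return "ok"
```

For example, strata with counts `(2, 2)` and `(3, 3)` give a support of 1 (Case D), while changing the second stratum to `(4, 3)` gives a support of 4 and passes.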

tests/test_survey_phase5.py

Lines changed: 111 additions & 0 deletions
@@ -1161,6 +1161,117 @@ def test_get_loo_effects_df_works_on_pweight_only_jackknife(
         assert set(df.columns) == {"unit", "role", "att_loo", "delta_from_full"}
         assert set(df["role"].unique()) <= {"control", "treated"}

+    def test_jackknife_full_design_full_census_fpc_returns_zero_se(
+        self, sdid_survey_data_jk_well_formed
+    ):
+        """R5 P1 fix: full-census FPC → SE=0, not NaN.
+
+        Rust & Rao's stratified jackknife formula has an explicit
+        ``(1 - f_h)`` factor. When ``fpc[h] == n_h`` for every
+        contributing stratum, ``f_h = 1``, ``(1 - f_h) = 0``, and every
+        stratum contribution is zero → ``total_variance = 0`` by
+        legitimate design, not by "every stratum skipped". The correct
+        jackknife SE in that case is **zero** (full census: no sampling
+        variance), not NaN. Reserve NaN for the truly-undefined cases
+        (all strata skipped, undefined PSU-LOO replicate).
+        """
+        df = sdid_survey_data_jk_well_formed.copy()
+        # Each stratum has n_h=3 PSUs. Setting fpc=3 gives f_h=1 and
+        # (1 - f_h) = 0 — the formula collapses the stratum contribution
+        # to zero for legitimate design reasons.
+        df["fpc_full_census"] = 3.0
+
+        sd = SurveyDesign(
+            weights="weight",
+            strata="stratum",
+            psu="psu",
+            fpc="fpc_full_census",
+        )
+        est = SyntheticDiD(variance_method="jackknife", seed=42)
+        result = est.fit(
+            df,
+            outcome="outcome",
+            treatment="treated",
+            unit="unit",
+            time="time",
+            post_periods=[6, 7, 8, 9],
+            survey_design=sd,
+        )
+        # SE must be exactly zero (legitimate full-census no-sampling
+        # variance), not NaN (undefined) and not a tiny positive number.
+        assert np.isfinite(result.se)
+        assert result.se == 0.0
+
+    def test_jackknife_full_design_lonely_psu_adjust_raises(
+        self, sdid_survey_data_jk_well_formed
+    ):
+        """R5 P1 fix: ``SurveyDesign(lonely_psu='adjust')`` on the jackknife
+        survey path raises NotImplementedError rather than silently being
+        treated as ``"remove"``.
+
+        ``"remove"`` and ``"certainty"`` both contribute 0 variance for
+        singleton strata on the jackknife path, matching canonical R
+        ``survey::svyjkn`` behavior. ``"adjust"`` requires an overall-
+        mean fallback per stratum that is not yet implemented; rejecting
+        upfront prevents silent variance miscomputation.
+        """
+        sd = SurveyDesign(
+            weights="weight",
+            strata="stratum",
+            psu="psu",
+            lonely_psu="adjust",
+        )
+        est = SyntheticDiD(variance_method="jackknife", seed=42)
+        with pytest.raises(
+            NotImplementedError,
+            match=r"lonely_psu='adjust'.*not supported on the SDID jackknife",
+        ):
+            est.fit(
+                sdid_survey_data_jk_well_formed,
+                outcome="outcome",
+                treatment="treated",
+                unit="unit",
+                time="time",
+                post_periods=[6, 7, 8, 9],
+                survey_design=sd,
+            )
+
+    def test_jackknife_full_design_lonely_psu_certainty_equivalent_to_remove(
+        self, sdid_survey_data_jk_well_formed
+    ):
+        """``lonely_psu='certainty'`` is accepted and produces the same SE
+        as ``lonely_psu='remove'`` (both contribute 0 for singleton
+        strata on the jackknife path).
+        """
+        sd_remove = SurveyDesign(
+            weights="weight", strata="stratum", psu="psu", lonely_psu="remove"
+        )
+        sd_certainty = SurveyDesign(
+            weights="weight", strata="stratum", psu="psu", lonely_psu="certainty"
+        )
+
+        est1 = SyntheticDiD(variance_method="jackknife", seed=42)
+        result_remove = est1.fit(
+            sdid_survey_data_jk_well_formed,
+            outcome="outcome",
+            treatment="treated",
+            unit="unit",
+            time="time",
+            post_periods=[6, 7, 8, 9],
+            survey_design=sd_remove,
+        )
+        est2 = SyntheticDiD(variance_method="jackknife", seed=42)
+        result_certainty = est2.fit(
+            sdid_survey_data_jk_well_formed,
+            outcome="outcome",
+            treatment="treated",
+            unit="unit",
+            time="time",
+            post_periods=[6, 7, 8, 9],
+            survey_design=sd_certainty,
+        )
+        assert result_remove.se == pytest.approx(result_certainty.se, rel=1e-14)
+
     def test_jackknife_full_design_undefined_replicate_returns_nan(
         self, sdid_survey_data_full_design
     ):
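
The `lonely_psu` contract exercised by the two regressions above can be reproduced in isolation without the estimator. The sketch below is an assumption-laden stand-in: `DesignStub` and `validate_lonely_psu_for_jackknife` are hypothetical names, and only the accepted modes, the `"remove"` default, and the `NotImplementedError` on `"adjust"` come from the diff.

```python
from dataclasses import dataclass


@dataclass
class DesignStub:
    """Hypothetical stand-in for a resolved survey design."""

    lonely_psu: str = "remove"


def validate_lonely_psu_for_jackknife(design: object) -> str:
    """Accept 'remove' and 'certainty'; reject anything else up front."""
    # Fall back to "remove" when the attribute is absent, as the diff does.
    mode = getattr(design, "lonely_psu", "remove")
    if mode not in ("remove", "certainty"):
        raise NotImplementedError(
            f"lonely_psu={mode!r} is not supported on the jackknife "
            "survey path in this sketch; use 'remove' or 'certainty'."
        )
    return mode
```

Rejecting the unsupported mode before any replicate is computed means a user who asks for `"adjust"` gets an explicit error instead of an SE computed under different semantics.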
