
Commit a17c8a0

igerber and claude committed
Address PR #365 R11 P1: drop FPC pre-resolve on placebo + Case D effective-support guard
P1 #1 (FPC validator in ``SurveyDesign.resolve`` fires on placebo with explicit psu):

The R10 fix gated the in-fit implicit-PSU FPC validator on bootstrap/jackknife only, but ``SurveyDesign.resolve()`` itself enforces ``FPC >= n_PSU`` design-validity (survey.py:349-368) before ``synthetic_did.fit()`` even sees the resolved object. So a placebo fit with explicit ``psu`` and low ``fpc`` would still raise: the same parameter-interaction problem, one layer earlier, in resolution.

Fix: when ``variance_method == "placebo"`` and ``survey_design.fpc is not None``, construct an FPC-stripped copy of the SurveyDesign (``dataclasses.replace(survey_design, fpc=None)``) BEFORE calling ``_resolve_survey_for_fit``. Emit the FPC no-op ``UserWarning`` at the same time. The original ``survey_design`` object is preserved (caller's reference unchanged); the resolved unit-level survey design carries no FPC on placebo, so the in-fit validators (and the downstream FPC-related dispatch flags) all correctly skip FPC handling. The duplicate downstream FPC no-op warning (added in R8, keyed on ``resolved_survey_unit.fpc``) becomes unreachable on placebo and is removed.

New regression ``test_placebo_low_fpc_with_explicit_psu_skips_resolve_validator`` asserts:
(a) placebo with explicit psu + ``fpc < n_PSU`` succeeds and emits the no-op warning;
(b) SE matches the no-FPC fit at ``rel=1e-12``;
(c) bootstrap on the same low-FPC design still raises ``"FPC (2.0) is less than the number of PSUs"`` from ``SurveyDesign.resolve()``, so the validator-skip is correctly variance-method-gated.

P1 #2 (Case D missed effective single-support):

The Case D guard for placebo degeneracy keyed on raw control counts (``n_c_h > n_t_h`` for at least one stratum). It missed the case where ``n_c_h_positive < 2`` for every treated stratum: rows allow multiple subsets, but every successful pseudo-treated mean reduces to the unique positive-weight control's outcome (zero-weight cohabitants contribute 0 to numerator and denominator, R11 P1). The placebo null collapses to a single point and SE = FP noise.

Fix: extend the non-degeneracy invariant to require **both** ``n_c_h > n_t_h`` AND ``n_c_h_positive >= 2`` for at least one treated stratum. The classical Case D shape (raw exact-count ``n_c_h == n_t_h``) and the new "effective single-support" shape (positive-weight controls < 2 even with extra zero-weight rows) both trigger Case D. Updated the Case D error message to enumerate ``n_c_positive`` alongside ``n_c`` / ``n_t`` per stratum.

New regression ``test_placebo_full_design_raises_on_effective_single_support``: constructs a fixture with 1 treated unit + 1 positive-weight control + 9 zero-weight controls in stratum 0; raw guards (B/C/E) pass but Case D fires with the new "single distinct positive-mass pseudo-treated mean" message. Updated the existing ``test_placebo_full_design_raises_on_exact_count_stratum`` regex to match the new message (same Case D path, slightly different wording).

REGISTRY §SyntheticDiD Case enumeration updated: Case D now documents both the classical (``n_c == n_t``) and effective single-support (``n_c_positive < 2``) shapes, with the combined non-degeneracy invariant.

Verification: 98 passed (2 new regressions; existing Case B/C/E/D-classical guards still fire on their fixtures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
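The pre-resolve FPC strip from P1 #1 can be sketched in isolation. This is a minimal illustration, assuming the survey design is a dataclass with an ``fpc`` field; ``DesignSketch`` and ``design_for_resolve`` are hypothetical names, not the library's API. Only ``dataclasses.replace`` and ``warnings.warn`` are real standard-library calls.

```python
import dataclasses
import warnings
from typing import Optional


@dataclasses.dataclass(frozen=True)
class DesignSketch:
    """Hypothetical stand-in for SurveyDesign; field names mirror the diff."""

    weights: Optional[str] = None
    strata: Optional[str] = None
    psu: Optional[str] = None
    fpc: Optional[str] = None


def design_for_resolve(design, variance_method):
    """Return the design to pass to resolution, dropping FPC on placebo only."""
    if (
        variance_method == "placebo"
        and design is not None
        and design.fpc is not None
    ):
        warnings.warn(
            "fpc is a no-op on variance_method='placebo'",
            UserWarning,
            stacklevel=2,
        )
        # replace() builds a new object; the caller's design is untouched.
        return dataclasses.replace(design, fpc=None)
    return design


original = DesignSketch(weights="weight", strata="stratum", psu="psu", fpc="fpc_low")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    stripped = design_for_resolve(original, "placebo")
```

Because the copy is made before resolution, any resolve-time check on ``fpc`` never sees the column on the placebo path, while the bootstrap and jackknife paths receive the original object unchanged.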
1 parent 312f78f commit a17c8a0

3 files changed

Lines changed: 209 additions & 45 deletions


diff_diff/synthetic_did.py

Lines changed: 84 additions & 43 deletions
Original file line number | Diff line number | Diff line change
@@ -330,8 +330,41 @@ def fit( # type: ignore[override]
330330
_validate_unit_constant_survey,
331331
)
332332

333+
# R11 P1 fix: FPC is a documented no-op on placebo (Pesarin 2001
334+
# §1.5 — permutation tests condition on the observed sample), but
335+
# ``SurveyDesign.resolve()`` itself enforces ``FPC >= n_PSU``
336+
# design-validity constraints (survey.py:349-368). On placebo,
337+
# those constraints would block legitimate fits for a design
338+
# element that doesn't enter the placebo math. Drop FPC from a
339+
# copy of the survey design before resolution so placebo
340+
# bypasses the validator entirely; emit the FPC no-op warning
341+
# at the same time. The original survey_design object is
342+
# preserved (caller's reference unchanged).
343+
survey_design_for_resolve = survey_design
344+
if (
345+
self.variance_method == "placebo"
346+
and survey_design is not None
347+
and getattr(survey_design, "fpc", None) is not None
348+
):
349+
import dataclasses as _dc
350+
warnings.warn(
351+
"SurveyDesign(fpc=...) is a no-op on "
352+
"variance_method='placebo': permutation tests are "
353+
"conditional on the observed sample (Pesarin 2001 §1.5), "
354+
"so the sampling fraction does not enter Algorithm 4 or "
355+
"its stratified-permutation survey extension. The FPC "
356+
"column is dropped from the resolved survey design for "
357+
"the placebo fit (this also bypasses the FPC >= n_PSU "
358+
"design-validity check in SurveyDesign.resolve()). Use "
359+
"variance_method='bootstrap' or 'jackknife' if you need "
360+
"FPC to participate in the variance computation.",
361+
UserWarning,
362+
stacklevel=2,
363+
)
364+
survey_design_for_resolve = _dc.replace(survey_design, fpc=None)
365+
333366
resolved_survey, survey_weights, survey_weight_type, survey_metadata = (
334-
_resolve_survey_for_fit(survey_design, data, "analytical")
367+
_resolve_survey_for_fit(survey_design_for_resolve, data, "analytical")
335368
)
336369
# Reject replicate-weight designs — SyntheticDiD has no replicate-
337370
# weight variance path. Analytical (pweight / strata / PSU / FPC)
@@ -822,25 +855,11 @@ def fit( # type: ignore[override]
822855
or resolved_survey_unit.psu is not None
823856
)
824857
)
825-
if (
826-
self.variance_method == "placebo"
827-
and resolved_survey_unit is not None
828-
and resolved_survey_unit.fpc is not None
829-
):
830-
warnings.warn(
831-
"SurveyDesign(fpc=...) is a no-op on "
832-
"variance_method='placebo': permutation tests are "
833-
"conditional on the observed sample (Pesarin 2001 §1.5), "
834-
"so the sampling fraction does not enter Algorithm 4 or "
835-
"its stratified-permutation survey extension. The FPC "
836-
"column is preserved in the design metadata for other "
837-
"purposes but the placebo SE is computed as if FPC were "
838-
"absent. Use variance_method='bootstrap' or 'jackknife' "
839-
"if you need FPC to participate in the variance "
840-
"computation.",
841-
UserWarning,
842-
stacklevel=2,
843-
)
858+
# NOTE: the FPC no-op warning for placebo is emitted earlier
859+
# (before ``_resolve_survey_for_fit``); ``resolved_survey_unit.fpc``
860+
# is already None on the placebo path because the FPC column is
861+
# dropped from a copy of the survey design pre-resolve. No
862+
# duplicate warning here.
844863

845864
# Jackknife routes to the survey allocator whenever PSU or FPC or
846865
# strata is declared. PSU-without-strata is treated as a single
@@ -929,38 +948,60 @@ def fit( # type: ignore[override]
929948
"same full survey design via weighted-FW + Rao-Wu "
930949
"without a per-draw positive-mass constraint)."
931950
)
932-
if n_c_h > int(n_t_h):
951+
# Non-degenerate iff this stratum yields ≥2 distinct
952+
# positive-mass pseudo-treated draws. Two necessary
953+
# conditions, both required:
954+
# * ``n_c_h > n_t_h`` — raw without-replacement count
955+
# allows multiple subsets (otherwise only the
956+
# "all-controls-as-pseudo-treated" subset exists,
957+
# regardless of weights — Case D classical shape).
958+
# * ``n_c_h_positive >= 2`` — at least 2 distinct
959+
# positive-mass means are reachable. With only 1
960+
# positive-weight control, every successful pick
961+
# reduces to that single control's mean (zero-
962+
# weight cohabitants contribute 0 to numerator and
963+
# denominator), regardless of how many subsets the
964+
# raw allocator can construct (Case D effective
965+
# single-support shape, R11 P1).
966+
if n_c_h > int(n_t_h) and n_c_h_positive >= 2:
933967
has_nondegenerate_stratum = True
934-
# Case D: every treated stratum is exact-count
935-
# (``n_c_h == n_t_h``). The stratified permutation support
936-
# collapses to a single allocation — every placebo draw
937-
# reproduces the same pseudo-treated set, giving a degenerate
938-
# null (SE ≈ 0 up to FP noise, no meaningful sampling
939-
# distribution). Reject at fit-time rather than silently
940-
# reporting a near-zero SE; the overall permutation support is
941-
# ``∏_h C(n_c_h, n_t_h)``, so at least one treated stratum must
942-
# satisfy ``n_c_h > n_t_h`` for the test to have ≥2 distinct
943-
# allocations.
968+
# Case D: every treated stratum is effectively single-
969+
# support, so the placebo null collapses to a single
970+
# positive-mass allocation. Two paths into this:
971+
# * Raw exact-count (``n_c_h == n_t_h`` for every treated
972+
# stratum, R4 P1): the without-replacement permutation
973+
# yields a single subset, every draw is identical.
974+
# * Effective single-support (``n_c_h_positive < 2`` for
975+
# every treated stratum, R11 P1): positive-mass picks
976+
# reduce to a single distinct mean even when raw
977+
# counts are larger, because zero-weight controls
978+
# contribute 0 to numerator and denominator. Successful
979+
# draws all collapse to the unique positive-weight
980+
# subset.
981+
# Both shapes produce SE = FP noise (~1e-16) — reject up
982+
# front rather than silently reporting a near-zero SE.
944983
if not has_nondegenerate_stratum:
945984
detail = ", ".join(
946985
f"stratum {h}: n_c={int(np.sum(_strata_control_eff == h))}, "
986+
f"n_c_positive={int(np.sum(w_control[_strata_control_eff == h] > 0))}, "
947987
f"n_t={int(n_t_h)}"
948988
for h, n_t_h in zip(unique_treated_strata, treated_counts)
949989
)
950990
raise ValueError(
951991
"Stratified-permutation placebo support is degenerate: "
952-
"every treated-containing stratum has exactly "
953-
"n_controls == n_treated, so the within-stratum "
954-
"permutation yields a single allocation across all "
955-
f"draws ({detail}). The resulting placebo distribution "
956-
"collapses to one point and SE is not a meaningful "
957-
"null estimate. At least one treated stratum must "
958-
"have n_controls > n_treated for the permutation to "
959-
"have ≥2 distinct allocations. Either rebalance the "
960-
"panel, or use variance_method='bootstrap' (which "
961-
"supports the same full survey design via weighted-FW "
962-
"+ Rao-Wu without a permutation-feasibility "
963-
"constraint)."
992+
"every treated-containing stratum has n_c == n_t or "
993+
"fewer than 2 positive-weight controls, so within-stratum "
994+
"permutation yields a single distinct positive-mass "
995+
f"pseudo-treated mean across all draws ({detail}). "
996+
"The resulting placebo distribution collapses to one "
997+
"point and SE is not a meaningful null estimate. At "
998+
"least one treated stratum must have both n_c > n_t "
999+
"and ≥2 positive-weight controls for the test to "
1000+
"have ≥2 distinct allocations. Either "
1001+
"rebalance the panel, or use "
1002+
"variance_method='bootstrap' (which supports the "
1003+
"same full survey design via weighted-FW + Rao-Wu "
1004+
"without a permutation-feasibility constraint)."
9641005
)
9651006

9661007
# Compute standard errors on normalized Y, rescale to original units.
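
The combined non-degeneracy invariant from the hunk above can be checked on its own. This is a sketch under assumptions: ``has_nondegenerate_stratum`` is written here as a standalone hypothetical function, and its arguments (control stratum labels, control weights, treated stratum labels, treated counts) only mirror the variable names visible in the diff.

```python
import numpy as np


def has_nondegenerate_stratum(strata_control, w_control, treated_strata, treated_counts):
    """True if at least one treated stratum supports >= 2 distinct placebo means."""
    for h, n_t_h in zip(treated_strata, treated_counts):
        in_h = strata_control == h
        n_c_h = int(np.sum(in_h))                          # raw control count
        n_c_h_positive = int(np.sum(w_control[in_h] > 0))  # positive-weight controls
        # Both conditions are required: more controls than treated units,
        # and at least two controls that carry positive weight.
        if n_c_h > int(n_t_h) and n_c_h_positive >= 2:
            return True
    return False


# D-classical: 2 controls, 2 treated, all weights positive.
classical = has_nondegenerate_stratum(
    np.array([0, 0]), np.array([1.0, 1.0]), [0], [2]
)
# D-effective: 10 controls, 1 treated, only one control has positive weight.
effective = has_nondegenerate_stratum(
    np.zeros(10, dtype=int), np.array([1.0] + [0.0] * 9), [0], [1]
)
# Healthy: 3 positive-weight controls, 1 treated.
healthy = has_nondegenerate_stratum(
    np.zeros(3, dtype=int), np.ones(3), [0], [1]
)
```

The first two calls return ``False`` (both Case D shapes), and the third returns ``True``.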

docs/methodology/REGISTRY.md

Lines changed: 1 addition & 1 deletion
@@ -1589,7 +1589,7 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
15891589
* **Case B** (`n_controls_h == 0` for some treated-containing stratum): the stratum has treated units but no controls — no pseudo-treated set can be drawn.
15901590
* **Case C** (`0 < n_controls_h < n_treated_h`): the stratum has fewer controls than treated units, so exact-count without-replacement sampling is impossible.
15911591
* **Case E** (row-count guards passed but `n_positive_weight_controls_h < n_treated_h`): the stratum has enough raw controls but too few have positive survey weight. Since the pseudo-treated mean uses `np.average(Y, weights=w_control[idx])`, draws can pick all-zero-weight subsets (ZeroDivisionError on np.average) and the retry loop would swallow them as a generic ``n_successful=0`` warning + ``SE=0.0``.
1592-
* **Case D** (`n_controls_h == n_treated_h` for *every* treated stratum): the permutation support is `∏_h C(n_c_h, n_t_h) = 1`, so only one allocation is possible, every placebo draw reproduces the same pseudo-treated set, and the null distribution collapses to a single point (SE = FP noise ~1e-16). At least one treated stratum must satisfy `n_c_h > n_t_h` for the test to have ≥2 distinct allocations.
1592+
* **Case D** (effective single-support — *every* treated stratum collapses to one positive-mass mean): two shapes trigger this. **(D-classical)** `n_controls_h == n_treated_h` so the without-replacement permutation has only one subset. **(D-effective)** `n_c_h > n_t_h` (raw count allows multiple subsets) but `n_positive_weight_controls_h < 2` — every successful pseudo-treated mean reduces to the unique positive-weight control's outcome (zero-weight cohabitants contribute 0 to numerator and denominator). Both shapes give a degenerate null (SE = FP noise ~1e-16). Non-degeneracy requires **both** `n_c_h > n_t_h` AND `n_positive_weight_controls_h >= 2` for at least one treated stratum.
15931593

15941594
Partial-permutation fallback is rejected for all four cases — it would silently change the null distribution and produce an incoherent test.
15951595
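
The two Case D shapes can be reproduced numerically. The sketch below is illustrative only: ``permutation_support`` and ``successful_means`` are hypothetical helpers, and the outcome and weight values are made up. It relies on two real behaviours: `math.comb` for the `∏_h C(n_c_h, n_t_h)` count, and `np.average` raising `ZeroDivisionError` when the weights sum to zero.

```python
import math
from itertools import combinations

import numpy as np


def permutation_support(n_controls, n_treated):
    """Number of distinct stratified allocations: product of C(n_c_h, n_t_h)."""
    return math.prod(math.comb(n_c, n_t) for n_c, n_t in zip(n_controls, n_treated))


def successful_means(y, w, n_t):
    """Weighted means over every size-n_t control subset with positive mass."""
    means = []
    for idx in combinations(range(len(y)), n_t):
        idx = list(idx)
        try:
            means.append(float(np.average(y[idx], weights=w[idx])))
        except ZeroDivisionError:
            # All-zero-weight subset: no positive mass, so the draw fails.
            continue
    return means


# D-classical: n_c == n_t in the only treated stratum, so the support is 1.
support_classical = permutation_support([2], [2])

# D-effective: four controls, but only the first has positive weight.
y = np.array([3.0, 10.0, -4.0, 7.5])
w = np.array([2.0, 0.0, 0.0, 0.0])
means_one = successful_means(y, w, 1)   # [3.0]
means_two = successful_means(y, w, 2)   # [3.0, 3.0, 3.0]
```

In the D-effective fixture the raw support is larger than 1, yet every successful draw returns the positive-weight control's outcome, which is why the guard must look at positive-weight counts as well as raw counts.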

tests/test_survey_phase5.py

Lines changed: 124 additions & 1 deletion
@@ -841,6 +841,53 @@ def test_placebo_full_design_raises_on_zero_control_stratum(
841841
survey_design=sd,
842842
)
843843

844+
def test_placebo_full_design_raises_on_effective_single_support(
845+
self, sdid_survey_data_full_design
846+
):
847+
"""R11 P1 fix: Case D effective single-support — n_t_h == 1 with
848+
only one positive-weight control + zero-weight cohabitants.
849+
850+
Row-count guards pass (``n_c_h > n_t_h``) and Case E passes
851+
(``n_c_h_positive == n_t_h``), but every successful pseudo-
852+
treated draw collapses to the single positive-weight control's
853+
outcome (zero-weight cohabitants contribute 0 to numerator and
854+
denominator). The placebo distribution is degenerate: SE ≈ 0
855+
from FP noise across identical means, not a meaningful null.
856+
857+
Without this guard, the previous code marked the stratum as
858+
non-degenerate based on ``n_c_h > n_t_h`` (raw count), and the
859+
retry loop would silently succeed on any positive-mass pick
860+
with the same effective mean → ``SE = 0.0``.
861+
"""
862+
# Build a fixture where stratum 0 has 1 treated + 1 positive-
863+
# weight control + multiple zero-weight controls; stratum 1
864+
# has only controls. This sets up effective single-support
865+
# in stratum 0 even though raw n_c_h > n_t_h.
866+
# Reuse sdid_survey_data_full_design but trim to 1 treated and
867+
# zero out most stratum-0 controls' weights.
868+
df = sdid_survey_data_full_design.copy()
869+
# Drop treated units 1-4, keep unit 0 as the sole treated.
870+
df = df[~df["unit"].isin([1, 2, 3, 4])].copy()
871+
# Stratum 0 controls (5-14): keep unit 5 with positive weight,
872+
# zero out 6-14.
873+
df.loc[df["unit"].isin(range(6, 15)), "weight"] = 0.0
874+
875+
sd = SurveyDesign(weights="weight", strata="stratum", psu="psu")
876+
est = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
877+
with pytest.raises(
878+
ValueError,
879+
match=r"single distinct positive-mass pseudo-treated mean",
880+
):
881+
est.fit(
882+
df,
883+
outcome="outcome",
884+
treatment="treated",
885+
unit="unit",
886+
time="time",
887+
post_periods=[6, 7, 8, 9],
888+
survey_design=sd,
889+
)
890+
844891
def test_placebo_full_design_raises_on_zero_weight_controls_in_stratum(
845892
self, sdid_survey_data_full_design
846893
):
@@ -897,7 +944,7 @@ def test_placebo_full_design_raises_on_exact_count_stratum(
897944
est = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
898945
with pytest.raises(
899946
ValueError,
900-
match=r"permutation yields a single allocation across all draws",
947+
match=r"single distinct positive-mass pseudo-treated mean across all draws",
901948
):
902949
est.fit(
903950
sdid_survey_data,
@@ -1031,6 +1078,82 @@ def test_placebo_fpc_alone_no_op_warns_and_matches_pweight_only(
10311078
assert r_fpc.se == pytest.approx(r_pw.se, rel=1e-12)
10321079
assert r_fpc.att == pytest.approx(r_pw.att, abs=1e-12)
10331080

1081+
def test_placebo_low_fpc_with_explicit_psu_skips_resolve_validator(
1082+
self, sdid_survey_data_full_design
1083+
):
1084+
"""R11 P1 fix: ``SurveyDesign.resolve()`` itself enforces
1085+
``FPC >= n_PSU`` design-validity, but FPC is a placebo no-op
1086+
(Pesarin 2001 §1.5). On the placebo path, FPC is dropped from
1087+
a copy of the SurveyDesign before resolution so the
1088+
resolve-time validator never fires; the user sees the
1089+
documented FPC no-op warning and the fit succeeds.
1090+
1091+
Test: explicit ``psu`` + low ``fpc`` (below the per-stratum
1092+
``n_PSU`` threshold) — would normally raise inside
1093+
``SurveyDesign.resolve()`` with "FPC (2.0) is less than the number of PSUs". On
1094+
placebo, it succeeds with the no-op warning. Bootstrap on the
1095+
same design still raises (validator-skip is variance-method-
1096+
gated).
1097+
"""
1098+
df = sdid_survey_data_full_design.copy()
1099+
# sdid_survey_data_full_design has 3 PSUs per stratum (as in the
1100+
# well-formed-jackknife layout): stratum 0 has PSUs {0,1,2} and
1101+
# stratum 1 has PSUs {3,4,5}. fpc=2 is below the 3-PSU threshold
1102+
# per stratum.
1103+
df["fpc_low"] = 2.0
1104+
1105+
sd_low_fpc_psu = SurveyDesign(
1106+
weights="weight", strata="stratum", psu="psu", fpc="fpc_low"
1107+
)
1108+
sd_no_fpc = SurveyDesign(weights="weight", strata="stratum", psu="psu")
1109+
1110+
est_fpc = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
1111+
with pytest.warns(
1112+
UserWarning,
1113+
match=r"SurveyDesign\(fpc=\.\.\.\) is a no-op on variance_method='placebo'",
1114+
):
1115+
r_fpc = est_fpc.fit(
1116+
df,
1117+
outcome="outcome",
1118+
treatment="treated",
1119+
unit="unit",
1120+
time="time",
1121+
post_periods=[6, 7, 8, 9],
1122+
survey_design=sd_low_fpc_psu,
1123+
)
1124+
1125+
est_no_fpc = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
1126+
r_no_fpc = est_no_fpc.fit(
1127+
df,
1128+
outcome="outcome",
1129+
treatment="treated",
1130+
unit="unit",
1131+
time="time",
1132+
post_periods=[6, 7, 8, 9],
1133+
survey_design=sd_no_fpc,
1134+
)
1135+
1136+
# FPC truly is a no-op for placebo even with explicit psu: SE
1137+
# matches the no-FPC fit at machine precision.
1138+
assert r_fpc.se == pytest.approx(r_no_fpc.se, rel=1e-12)
1139+
assert r_fpc.att == pytest.approx(r_no_fpc.att, abs=1e-12)
1140+
# Bootstrap on the same low-FPC design still raises the resolve-
1141+
# time validator error (validator-skip stays placebo-only).
1142+
est_boot = SyntheticDiD(variance_method="bootstrap", n_bootstrap=20, seed=42)
1143+
with pytest.raises(
1144+
ValueError,
1145+
match=r"FPC \(2\.0\) is less than the number of PSUs",
1146+
):
1147+
est_boot.fit(
1148+
df,
1149+
outcome="outcome",
1150+
treatment="treated",
1151+
unit="unit",
1152+
time="time",
1153+
post_periods=[6, 7, 8, 9],
1154+
survey_design=sd_low_fpc_psu,
1155+
)
1156+
10341157
def test_placebo_low_fpc_no_psu_warns_no_validator_block(
10351158
self, sdid_survey_data_full_design
10361159
):
