Address PR #365 R11 P1: drop FPC pre-resolve on placebo + Case D effective-support guard

igerber · claude · igerber · commit a17c8a04524a · 2026-04-24T20:35:04.000-04:00
P1 #1 (FPC validator in SurveyDesign.resolve fires on placebo with explicit psu):

The R10 fix gated the in-fit implicit-PSU FPC validator on bootstrap/jackknife only, but ``SurveyDesign.resolve()`` itself enforces ``FPC >= n_PSU`` design-validity (survey.py:349-368) before ``synthetic_did.fit()`` even sees the resolved object. So a placebo fit with explicit ``psu`` and low ``fpc`` would still raise: the same parameter-interaction problem, one layer earlier in resolution.

Fix: when ``variance_method == "placebo"`` and ``survey_design.fpc is not None``, construct an FPC-stripped copy of the SurveyDesign (``dataclasses.replace(survey_design, fpc=None)``) BEFORE calling ``_resolve_survey_for_fit``. Emit the FPC no-op ``UserWarning`` at the same time. The original ``survey_design`` object is preserved (caller's reference unchanged); the resolved unit-level survey design carries no FPC on placebo, so the in-fit validators (and the downstream FPC-related dispatch flags) all correctly skip FPC handling. The duplicate downstream FPC no-op warning (added in R8 keyed on ``resolved_survey_unit.fpc``) becomes unreachable on placebo and is removed.

New regression ``test_placebo_low_fpc_with_explicit_psu_skips_resolve_validator`` asserts:
(a) placebo with explicit psu + ``fpc < n_PSU`` succeeds + emits no-op warning,
(b) SE matches the no-FPC fit at ``rel=1e-12``,
(c) bootstrap on the same low-FPC design still raises ``"FPC (2.0) is less than the number of PSUs"`` from ``SurveyDesign.resolve()``; the validator skip is correctly variance-method-gated.

P1 #2 (Case D missed effective single-support):

The Case D guard for placebo degeneracy keyed on raw control counts (``n_c_h > n_t_h`` for at least one stratum). It missed the case where ``n_c_h_positive < 2`` for every treated stratum: rows allow multiple subsets, but every successful pseudo-treated mean reduces to the unique positive-weight control's outcome (zero-weight cohabitants contribute 0 to numerator and denominator, R11 P1). The placebo null collapses to a single point and SE = FP noise.

Fix: extend the non-degeneracy invariant to require **both** ``n_c_h > n_t_h`` AND ``n_c_h_positive >= 2`` for at least one treated stratum. The classical Case D shape (raw exact-count ``n_c_h == n_t_h``) and the new "effective single-support" shape (positive-weight controls < 2 even with extra zero-weight rows) both trigger Case D. Updated the Case D error message to enumerate ``n_c_positive`` alongside ``n_c`` / ``n_t`` per stratum.

New regression ``test_placebo_full_design_raises_on_effective_single_support``: constructs a fixture with 1 treated unit + 1 positive-weight control + 9 zero-weight controls in stratum 0; raw guards (B/C/E) pass but Case D fires with the new "single distinct positive-mass pseudo-treated mean" message. Updated existing ``test_placebo_full_design_raises_on_exact_count_stratum`` regex to match the new message (same Case D path, slightly different wording).

REGISTRY §SyntheticDiD Case enumeration updated: Case D now documents both the classical (``n_c == n_t``) and effective single-support (``n_c_positive < 2``) shapes, with the combined non-degeneracy invariant.

Verification: 98 passed (2 new regressions; existing Case B/C/E/D-classical guards still fire on their fixtures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
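
Illustrative sketch of the two fixes described above, not the code in this commit. ``ToyDesign``, ``strip_fpc_for_placebo``, ``distinct_placebo_means`` and ``stratum_is_nondegenerate`` are hypothetical names invented for this example; the real logic lives in ``synthetic_did.fit()`` and ``SurveyDesign``. The first part shows the copy-then-resolve pattern with ``dataclasses.replace``; the second enumerates pseudo-treated subsets to show why a stratum with a single positive-weight control yields only one distinct weighted mean, assuming the pseudo-treated mean is an ``np.average`` over the drawn controls.

```python
import dataclasses
from itertools import combinations
from typing import Optional

import numpy as np


@dataclasses.dataclass(frozen=True)
class ToyDesign:
    """Hypothetical stand-in for SurveyDesign; only the fields used here."""

    weights: str
    psu: Optional[str] = None
    fpc: Optional[str] = None


def strip_fpc_for_placebo(design, variance_method):
    """Return an FPC-free copy on placebo; otherwise return the design as-is."""
    if (
        variance_method == "placebo"
        and design is not None
        and design.fpc is not None
    ):
        # replace() builds a new object, so the caller's design is untouched.
        return dataclasses.replace(design, fpc=None)
    return design


def distinct_placebo_means(y, w, n_t):
    """Collect distinct weighted means over all size-n_t control subsets.

    Subsets whose total weight is zero are skipped (no positive mass).
    Means are rounded so floating-point noise does not create duplicates.
    """
    means = set()
    for subset in combinations(range(len(y)), n_t):
        idx = list(subset)
        if w[idx].sum() > 0:
            means.add(round(float(np.average(y[idx], weights=w[idx])), 12))
    return means


def stratum_is_nondegenerate(n_c, n_c_positive, n_t):
    """Combined invariant from the commit message: both conditions required."""
    return n_c > n_t and n_c_positive >= 2


if __name__ == "__main__":
    design = ToyDesign(weights="weight", psu="psu", fpc="fpc_low")
    print(strip_fpc_for_placebo(design, "placebo"))
    print(strip_fpc_for_placebo(design, "bootstrap"))

    y = np.array([1.0, 2.0, 3.0, 4.0])
    one_positive = np.array([5.0, 0.0, 0.0, 0.0])
    print(distinct_placebo_means(y, one_positive, n_t=1))
    print(stratum_is_nondegenerate(n_c=4, n_c_positive=1, n_t=1))
```

With one treated unit and four controls of which only one has positive weight, the raw count check ``n_c > n_t`` passes, yet the enumeration finds a single reachable mean, which is the shape the extended Case D guard rejects.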
diff --git a/diff_diff/synthetic_did.py b/diff_diff/synthetic_did.py
@@ -330,8 +330,41 @@ def fit(  # type: ignore[override]
             _validate_unit_constant_survey,
         )
 
+        # R11 P1 fix: FPC is a documented no-op on placebo (Pesarin 2001
+        # §1.5 — permutation tests condition on the observed sample), but
+        # ``SurveyDesign.resolve()`` itself enforces ``FPC >= n_PSU``
+        # design-validity constraints (survey.py:349-368). On placebo,
+        # those constraints would block legitimate fits for a design
+        # element that doesn't enter the placebo math. Drop FPC from a
+        # copy of the survey design before resolution so placebo
+        # bypasses the validator entirely; emit the FPC no-op warning
+        # at the same time. The original survey_design object is
+        # preserved (caller's reference unchanged).
+        survey_design_for_resolve = survey_design
+        if (
+            self.variance_method == "placebo"
+            and survey_design is not None
+            and getattr(survey_design, "fpc", None) is not None
+        ):
+            import dataclasses as _dc
+            warnings.warn(
+                "SurveyDesign(fpc=...) is a no-op on "
+                "variance_method='placebo': permutation tests are "
+                "conditional on the observed sample (Pesarin 2001 §1.5), "
+                "so the sampling fraction does not enter Algorithm 4 or "
+                "its stratified-permutation survey extension. The FPC "
+                "column is dropped from the resolved survey design for "
+                "the placebo fit (this also bypasses the FPC >= n_PSU "
+                "design-validity check in SurveyDesign.resolve()). Use "
+                "variance_method='bootstrap' or 'jackknife' if you need "
+                "FPC to participate in the variance computation.",
+                UserWarning,
+                stacklevel=2,
+            )
+            survey_design_for_resolve = _dc.replace(survey_design, fpc=None)
+
         resolved_survey, survey_weights, survey_weight_type, survey_metadata = (
-            _resolve_survey_for_fit(survey_design, data, "analytical")
+            _resolve_survey_for_fit(survey_design_for_resolve, data, "analytical")
         )
         # Reject replicate-weight designs — SyntheticDiD has no replicate-
         # weight variance path. Analytical (pweight / strata / PSU / FPC)
@@ -822,25 +855,11 @@ def fit(  # type: ignore[override]
                 or resolved_survey_unit.psu is not None
             )
         )
-        if (
-            self.variance_method == "placebo"
-            and resolved_survey_unit is not None
-            and resolved_survey_unit.fpc is not None
-        ):
-            warnings.warn(
-                "SurveyDesign(fpc=...) is a no-op on "
-                "variance_method='placebo': permutation tests are "
-                "conditional on the observed sample (Pesarin 2001 §1.5), "
-                "so the sampling fraction does not enter Algorithm 4 or "
-                "its stratified-permutation survey extension. The FPC "
-                "column is preserved in the design metadata for other "
-                "purposes but the placebo SE is computed as if FPC were "
-                "absent. Use variance_method='bootstrap' or 'jackknife' "
-                "if you need FPC to participate in the variance "
-                "computation.",
-                UserWarning,
-                stacklevel=2,
-            )
+        # NOTE: the FPC no-op warning for placebo is emitted earlier
+        # (before ``_resolve_survey_for_fit``); ``resolved_survey_unit.fpc``
+        # is already None on the placebo path because the FPC column is
+        # dropped from a copy of the survey design pre-resolve. No
+        # duplicate warning here.
 
         # Jackknife routes to the survey allocator whenever PSU or FPC or
         # strata is declared. PSU-without-strata is treated as a single
@@ -929,38 +948,60 @@ def fit(  # type: ignore[override]
                         "same full survey design via weighted-FW + Rao-Wu "
                         "without a per-draw positive-mass constraint)."
                     )
-                if n_c_h > int(n_t_h):
+                # Non-degenerate iff this stratum yields ≥2 distinct
+                # positive-mass pseudo-treated draws. Two necessary
+                # conditions, both required:
+                #   * ``n_c_h > n_t_h`` — raw without-replacement count
+                #     allows multiple subsets (otherwise only the
+                #     "all-controls-as-pseudo-treated" subset exists,
+                #     regardless of weights — Case D classical shape).
+                #   * ``n_c_h_positive >= 2`` — at least 2 distinct
+                #     positive-mass means are reachable. With only 1
+                #     positive-weight control, every successful pick
+                #     reduces to that single control's mean (zero-
+                #     weight cohabitants contribute 0 to numerator and
+                #     denominator), regardless of how many subsets the
+                #     raw allocator can construct (Case D effective
+                #     single-support shape, R11 P1).
+                if n_c_h > int(n_t_h) and n_c_h_positive >= 2:
                     has_nondegenerate_stratum = True
-            # Case D: every treated stratum is exact-count
-            # (``n_c_h == n_t_h``). The stratified permutation support
-            # collapses to a single allocation — every placebo draw
-            # reproduces the same pseudo-treated set, giving a degenerate
-            # null (SE ≈ 0 up to FP noise, no meaningful sampling
-            # distribution). Reject at fit-time rather than silently
-            # reporting a near-zero SE; the overall permutation support is
-            # ``∏_h C(n_c_h, n_t_h)``, so at least one treated stratum must
-            # satisfy ``n_c_h > n_t_h`` for the test to have ≥2 distinct
-            # allocations.
+            # Case D: every treated stratum is effectively single-
+            # support, so the placebo null collapses to a single
+            # positive-mass allocation. Two paths into this:
+            #   * Raw exact-count (``n_c_h == n_t_h`` for every treated
+            #     stratum, R4 P1): the without-replacement permutation
+            #     yields a single subset, every draw is identical.
+            #   * Effective single-support (``n_c_h_positive < 2`` for
+            #     every treated stratum, R11 P1): positive-mass picks
+            #     reduce to a single distinct mean even when raw
+            #     counts are larger, because zero-weight controls
+            #     contribute 0 to numerator and denominator. Successful
+            #     draws all collapse to the unique positive-weight
+            #     subset.
+            # Both shapes produce SE = FP noise (~1e-16) — reject up
+            # front rather than silently reporting a near-zero SE.
             if not has_nondegenerate_stratum:
                 detail = ", ".join(
                     f"stratum {h}: n_c={int(np.sum(_strata_control_eff == h))}, "
+                    f"n_c_positive={int(np.sum(w_control[_strata_control_eff == h] > 0))}, "
                     f"n_t={int(n_t_h)}"
                     for h, n_t_h in zip(unique_treated_strata, treated_counts)
                 )
                 raise ValueError(
                     "Stratified-permutation placebo support is degenerate: "
-                    "every treated-containing stratum has exactly "
-                    "n_controls == n_treated, so the within-stratum "
-                    "permutation yields a single allocation across all "
-                    f"draws ({detail}). The resulting placebo distribution "
-                    "collapses to one point and SE is not a meaningful "
-                    "null estimate. At least one treated stratum must "
-                    "have n_controls > n_treated for the permutation to "
-                    "have ≥2 distinct allocations. Either rebalance the "
-                    "panel, or use variance_method='bootstrap' (which "
-                    "supports the same full survey design via weighted-FW "
-                    "+ Rao-Wu without a permutation-feasibility "
-                    "constraint)."
+                    "every treated-containing stratum has fewer than 2 "
+                    "positive-weight controls, so within-stratum "
+                    "permutation yields a single distinct positive-mass "
+                    f"pseudo-treated mean across all draws ({detail}). "
+                    "The resulting placebo distribution collapses to one "
+                    "point and SE is not a meaningful null estimate. At "
+                    "least one treated stratum must have n_c > n_t and "
+                    "≥2 positive-weight controls for the test to have "
+                    "≥2 distinct positive-mass means. Either "
+                    "rebalance the panel, or use "
+                    "variance_method='bootstrap' (which supports the "
+                    "same full survey design via weighted-FW + Rao-Wu "
+                    "without a permutation-feasibility constraint)."
                 )
 
         # Compute standard errors on normalized Y, rescale to original units.
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -1589,7 +1589,7 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
   * **Case B** (`n_controls_h == 0` for some treated-containing stratum): the stratum has treated units but no controls — no pseudo-treated set can be drawn.
   * **Case C** (`0 < n_controls_h < n_treated_h`): the stratum has fewer controls than treated units, so exact-count without-replacement sampling is impossible.
   * **Case E** (row-count guards passed but `n_positive_weight_controls_h < n_treated_h`): the stratum has enough raw controls but too few have positive survey weight. Since the pseudo-treated mean uses `np.average(Y, weights=w_control[idx])`, draws can pick all-zero-weight subsets (ZeroDivisionError on np.average) and the retry loop would swallow them as a generic ``n_successful=0`` warning + ``SE=0.0``.
-  * **Case D** (`n_controls_h == n_treated_h` for *every* treated stratum): the permutation support is `∏_h C(n_c_h, n_t_h) = 1` — only one allocation is possible, every placebo draw reproduces the same pseudo-treated set, and the null distribution collapses to a single point (SE = FP noise ~1e-16). At least one treated stratum must satisfy `n_c_h > n_t_h` for the test to have ≥2 distinct allocations.
+  * **Case D** (effective single-support — *every* treated stratum collapses to one positive-mass mean): two shapes trigger this. **(D-classical)** `n_controls_h == n_treated_h` so the without-replacement permutation has only one subset. **(D-effective)** `n_c_h > n_t_h` (raw count allows multiple subsets) but `n_positive_weight_controls_h < 2` — every successful pseudo-treated mean reduces to the unique positive-weight control's outcome (zero-weight cohabitants contribute 0 to numerator and denominator). Both shapes give a degenerate null (SE = FP noise ~1e-16). Non-degeneracy requires **both** `n_c_h > n_t_h` AND `n_positive_weight_controls_h >= 2` for at least one treated stratum.
 
   Partial-permutation fallback is rejected for all four cases — it would silently change the null distribution and produce an incoherent test.
 
diff --git a/tests/test_survey_phase5.py b/tests/test_survey_phase5.py
@@ -841,6 +841,53 @@ def test_placebo_full_design_raises_on_zero_control_stratum(
                 survey_design=sd,
             )
 
+    def test_placebo_full_design_raises_on_effective_single_support(
+        self, sdid_survey_data_full_design
+    ):
+        """R11 P1 fix: Case D effective single-support — n_t_h == 1 with
+        only one positive-weight control + zero-weight cohabitants.
+
+        Row-count guards pass (``n_c_h > n_t_h``) and Case E passes
+        (``n_c_h_positive == n_t_h``), but every successful pseudo-
+        treated draw collapses to the single positive-weight control's
+        outcome (zero-weight cohabitants contribute 0 to numerator and
+        denominator). The placebo distribution is degenerate: SE ≈ 0
+        from FP noise across identical means, not a meaningful null.
+
+        Without this guard, the previous code marked the stratum as
+        non-degenerate based on ``n_c_h > n_t_h`` (raw count), and the
+        retry loop would silently succeed on any positive-mass pick
+        with the same effective mean → ``SE = 0.0``.
+        """
+        # Build a fixture where stratum 0 has 1 treated + 1 positive-
+        # weight control + multiple zero-weight controls; stratum 1
+        # has only controls. This sets up effective single-support
+        # in stratum 0 even though raw n_c_h > n_t_h.
+        # Reuse sdid_survey_data_full_design but trim to 1 treated and
+        # zero out most stratum-0 controls' weights.
+        df = sdid_survey_data_full_design.copy()
+        # Drop treated units 1-4, keep unit 0 as the sole treated.
+        df = df[~df["unit"].isin([1, 2, 3, 4])].copy()
+        # Stratum 0 controls (5-14): keep unit 5 with positive weight,
+        # zero out 6-14.
+        df.loc[df["unit"].isin(range(6, 15)), "weight"] = 0.0
+
+        sd = SurveyDesign(weights="weight", strata="stratum", psu="psu")
+        est = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
+        with pytest.raises(
+            ValueError,
+            match=r"single distinct positive-mass pseudo-treated mean",
+        ):
+            est.fit(
+                df,
+                outcome="outcome",
+                treatment="treated",
+                unit="unit",
+                time="time",
+                post_periods=[6, 7, 8, 9],
+                survey_design=sd,
+            )
+
     def test_placebo_full_design_raises_on_zero_weight_controls_in_stratum(
         self, sdid_survey_data_full_design
     ):
@@ -897,7 +944,7 @@ def test_placebo_full_design_raises_on_exact_count_stratum(
         est = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
         with pytest.raises(
             ValueError,
-            match=r"permutation yields a single allocation across all draws",
+            match=r"single distinct positive-mass pseudo-treated mean across all draws",
         ):
             est.fit(
                 sdid_survey_data,
@@ -1031,6 +1078,82 @@ def test_placebo_fpc_alone_no_op_warns_and_matches_pweight_only(
         assert r_fpc.se == pytest.approx(r_pw.se, rel=1e-12)
         assert r_fpc.att == pytest.approx(r_pw.att, abs=1e-12)
 
+    def test_placebo_low_fpc_with_explicit_psu_skips_resolve_validator(
+        self, sdid_survey_data_full_design
+    ):
+        """R11 P1 fix: ``SurveyDesign.resolve()`` itself enforces
+        ``FPC >= n_PSU`` design-validity, but FPC is a placebo no-op
+        (Pesarin 2001 §1.5). On the placebo path, FPC is dropped from
+        a copy of the SurveyDesign before resolution so the
+        resolve-time validator never fires; the user sees the
+        documented FPC no-op warning and the fit succeeds.
+
+        Test: explicit ``psu`` + low ``fpc`` (below the per-stratum
+        ``n_PSU`` threshold), which would normally raise "FPC (2.0) is
+        less than the number of PSUs" in ``SurveyDesign.resolve()``. On
+        placebo, it succeeds with the no-op warning. Bootstrap on the
+        same design still raises (validator-skip is variance-method-
+        gated).
+        """
+        df = sdid_survey_data_full_design.copy()
+        # sdid_survey_data_full_design has stratum 0 with PSUs {0,1,2}
+        # and stratum 1 with PSUs {3,4,5}, i.e. 3 PSUs per stratum
+        # (the same count as the well-formed-jackknife layout), so
+        # fpc=2 is below the 3-PSU threshold in every stratum.
+        df["fpc_low"] = 2.0
+
+        sd_low_fpc_psu = SurveyDesign(
+            weights="weight", strata="stratum", psu="psu", fpc="fpc_low"
+        )
+        sd_no_fpc = SurveyDesign(weights="weight", strata="stratum", psu="psu")
+
+        est_fpc = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
+        with pytest.warns(
+            UserWarning,
+            match=r"SurveyDesign\(fpc=\.\.\.\) is a no-op on variance_method='placebo'",
+        ):
+            r_fpc = est_fpc.fit(
+                df,
+                outcome="outcome",
+                treatment="treated",
+                unit="unit",
+                time="time",
+                post_periods=[6, 7, 8, 9],
+                survey_design=sd_low_fpc_psu,
+            )
+
+        est_no_fpc = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
+        r_no_fpc = est_no_fpc.fit(
+            df,
+            outcome="outcome",
+            treatment="treated",
+            unit="unit",
+            time="time",
+            post_periods=[6, 7, 8, 9],
+            survey_design=sd_no_fpc,
+        )
+
+        # FPC truly is a no-op for placebo even with explicit psu: SE
+        # matches the no-FPC fit at machine precision.
+        assert r_fpc.se == pytest.approx(r_no_fpc.se, rel=1e-12)
+        assert r_fpc.att == pytest.approx(r_no_fpc.att, abs=1e-12)
+        # Bootstrap on the same low-FPC design still raises the resolve-
+        # time validator error (validator-skip stays placebo-only).
+        est_boot = SyntheticDiD(variance_method="bootstrap", n_bootstrap=20, seed=42)
+        with pytest.raises(
+            ValueError,
+            match=r"FPC \(2\.0\) is less than the number of PSUs",
+        ):
+            est_boot.fit(
+                df,
+                outcome="outcome",
+                treatment="treated",
+                unit="unit",
+                time="time",
+                post_periods=[6, 7, 8, 9],
+                survey_design=sd_low_fpc_psu,
+            )
+
     def test_placebo_low_fpc_no_psu_warns_no_validator_block(
         self, sdid_survey_data_full_design
     ):