Commit cdb42fe

igerber and claude committed
Address PR #365 R8 P1: drop FPC from placebo dispatch + document FPC no-op contract
P1 (Methodology): placebo dispatch flipped on FPC alone, but FPC plays no role in placebo math.

The dispatcher gated placebo's survey-path routing on ``_full_design_survey = strata is not None OR psu is not None OR fpc is not None``. Adding an ``fpc=`` column to a SurveyDesign therefore silently switched dispatch from the non-survey placebo path (unweighted-FW + post-hoc ω composition) to the weighted-FW survey placebo path, which produces different numerics, even though permutation tests are conditional on the observed sample (Pesarin 2001 §1.5) and the sampling fraction never enters Algorithm 4 or its stratified-permutation survey extension. The reviewer correctly flagged this as an undocumented methodology mismatch on a public variance method.

Fix:

* Gate ``_placebo_use_survey_path`` on ``strata is not None OR psu is not None`` (FPC dropped from the trigger). FPC alone now keeps placebo on the non-survey path with no numerical drift relative to the no-FPC fit.
* Emit a ``UserWarning`` whenever ``fpc`` is set with ``variance_method="placebo"``, regardless of whether ``strata`` or ``psu`` are also set, so users get an explicit signal that the FPC column is preserved in design metadata but does not enter placebo math. The warning recommends ``variance_method="bootstrap"`` or ``"jackknife"`` for FPC participation.
* REGISTRY §SyntheticDiD "Note (survey support matrix)" placebo bullet rewritten to spell out the contract: "for designs with explicit ``strata`` and/or ``psu`` … FPC is a documented no-op for placebo — permutation tests are conditional on the observed sample (Pesarin 2001 §1.5)."
* survey-theory.md placebo bullet picks up the same FPC no-op language plus the Case B/C/D guard enumeration from R5.

New regression ``test_placebo_fpc_alone_no_op_warns_and_matches_pweight_only`` asserts both contracts: (a) ``UserWarning`` fires when fpc is set on placebo, (b) SE under ``SurveyDesign(weights, fpc)`` matches SE under ``SurveyDesign(weights)`` at ``rel=1e-12`` (true no-op, not a silent dispatch flip introducing weighted-FW drift).

Bootstrap and jackknife paths are unchanged: they use FPC legitimately (Rao-Wu rescaling for bootstrap, ``(1 - f_h)`` factor in the Rust & Rao 1996 jackknife formula). Only placebo's contract narrows.

Verification: 95 passed (1 new FPC no-op regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
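
The dispatch contract described in this message can be sketched in isolation. This is a minimal illustration under stated assumptions, not the library's code: `DesignSketch` and `placebo_dispatch` are hypothetical stand-ins for `SurveyDesign` and the gating logic inside `fit()`, and the returned path labels are invented for readability.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class DesignSketch:
    """Hypothetical stand-in for SurveyDesign; each field holds a column name."""
    weights: Optional[str] = None
    strata: Optional[str] = None
    psu: Optional[str] = None
    fpc: Optional[str] = None


def placebo_dispatch(design: Optional[DesignSketch], variance_method: str) -> str:
    """Return a label for the placebo path a design would take (assumed labels)."""
    if variance_method != "placebo":
        return "not-placebo"
    # FPC never enters the placebo math, so it only triggers a warning.
    if design is not None and design.fpc is not None:
        warnings.warn(
            "fpc is a no-op for variance_method='placebo'",
            UserWarning,
            stacklevel=2,
        )
    # Only strata and/or PSU select the survey (weighted-FW) placebo path.
    use_survey_path = design is not None and (
        design.strata is not None or design.psu is not None
    )
    return "survey" if use_survey_path else "non-survey"
```

Under this sketch, `DesignSketch(weights="w", fpc="f")` stays on the non-survey path and warns once, while adding `strata` or `psu` switches the path and still warns when `fpc` is present.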
1 parent 0bcda79 commit cdb42fe

4 files changed

Lines changed: 120 additions & 21 deletions

diff_diff/synthetic_did.py

Lines changed: 41 additions & 7 deletions
@@ -789,15 +789,49 @@ def fit( # type: ignore[override]
         _fpc_control = None
         _fpc_treated = None
 
-        # Placebo routes to the survey allocator whenever strata or PSU
-        # or FPC is declared. For PSU/FPC-without-strata designs, the
-        # whole panel is synthesized as a single stratum (stratified
-        # permutation degenerates to global within-stratum permutation,
-        # still dispatched through the weighted-FW path for methodology
-        # consistency with the documented full-design contract).
+        # Placebo routes to the survey allocator whenever **strata or
+        # PSU** is declared (FPC alone does NOT flip dispatch). For
+        # PSU-without-strata designs, the whole panel is synthesized
+        # as a single stratum (stratified permutation degenerates to
+        # global within-stratum permutation, still dispatched through
+        # the weighted-FW path).
+        #
+        # FPC handling on placebo (R8 P1 fix): permutation tests are
+        # conditional on the observed sample (Pesarin 2001 §1.5), so
+        # the sampling fraction does not enter Algorithm 4 or its
+        # stratified-permutation extension. Including FPC in the
+        # dispatch trigger would silently switch numerics (weighted-FW
+        # vs unweighted-FW + post-hoc composition) on a survey design
+        # element that has no place in the placebo math. Drop FPC from
+        # the dispatch condition; emit a ``UserWarning`` below if FPC
+        # is set with placebo to surface the no-op contract.
         _placebo_use_survey_path = (
-            _full_design_survey and self.variance_method == "placebo"
+            self.variance_method == "placebo"
+            and resolved_survey_unit is not None
+            and (
+                resolved_survey_unit.strata is not None
+                or resolved_survey_unit.psu is not None
+            )
         )
+        if (
+            self.variance_method == "placebo"
+            and resolved_survey_unit is not None
+            and resolved_survey_unit.fpc is not None
+        ):
+            warnings.warn(
+                "SurveyDesign(fpc=...) is a no-op on "
+                "variance_method='placebo': permutation tests are "
+                "conditional on the observed sample (Pesarin 2001 §1.5), "
+                "so the sampling fraction does not enter Algorithm 4 or "
+                "its stratified-permutation survey extension. The FPC "
+                "column is preserved in the design metadata for other "
+                "purposes but the placebo SE is computed as if FPC were "
+                "absent. Use variance_method='bootstrap' or 'jackknife' "
+                "if you need FPC to participate in the variance "
+                "computation.",
+                UserWarning,
+                stacklevel=2,
+            )
 
         # Jackknife routes to the survey allocator whenever PSU or FPC or
         # strata is declared. PSU-without-strata is treated as a single

docs/methodology/REGISTRY.md

Lines changed: 1 addition & 1 deletion
@@ -1561,7 +1561,7 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
 
 **Bootstrap survey path** (PR #355): for pweight-only the per-draw FW uses constant `rw = w_control`; for full design (strata/PSU/FPC) the per-draw `rw = generate_rao_wu_weights(resolved_survey, rng)` rescaling is composed with the same weighted-FW kernel. See "Note (survey + bootstrap composition)" below for the full objective and the argmin-set caveat.
 
-**Placebo survey path**: for pweight-only the existing Algorithm 4 flow applies with survey-weighted pseudo-treated means + post-hoc ω_eff composition. For full design (strata/PSU/FPC) the allocator switches to **stratified permutation** (Pesarin 2001): pseudo-treated indices are drawn within each stratum containing actual treated units; weighted-FW re-estimates ω and λ per draw with per-control survey weights threaded into both loss and regularization. See "Note (survey + placebo composition)" below.
+**Placebo survey path**: for pweight-only the existing Algorithm 4 flow applies with survey-weighted pseudo-treated means + post-hoc ω_eff composition. For designs with explicit `strata` and/or `psu` the allocator switches to **stratified permutation** (Pesarin 2001): pseudo-treated indices are drawn within each stratum containing actual treated units; weighted-FW re-estimates ω and λ per draw with per-control survey weights threaded into both loss and regularization. See "Note (survey + placebo composition)" below. **FPC is a documented no-op for placebo** — permutation tests are conditional on the observed sample (Pesarin 2001 §1.5), so the sampling fraction does not enter Algorithm 4 or its survey extension; an `fpc=` column on a placebo fit emits a `UserWarning` and is preserved in the design metadata but never enters the variance computation. Routing is gated on `strata` / `psu` only — FPC alone does not flip dispatch from the non-survey to the survey placebo path.
 
 **Jackknife survey path**: for pweight-only the existing Algorithm 3 flow applies (unit-level LOO with subset + rw-composed-renormalized ω; λ fixed). For full design the allocator switches to **PSU-level LOO with stratum aggregation** (Rust & Rao 1996): leave out one PSU at a time within each stratum, aggregate as `SE² = Σ_h (1-f_h)·(n_h-1)/n_h·Σ_{j∈h}(τ̂_{(h,j)} - τ̄_h)²`. See "Note (survey + jackknife composition)" below.
 
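
The stratum-aggregated jackknife formula quoted in the REGISTRY context above is where FPC legitimately enters, through the `(1 - f_h)` factor. A minimal sketch of that aggregation follows. The function name and input layout are hypothetical, and it assumes `τ̄_h` is the mean of the leave-one-out estimates within stratum `h`; it is not the library's implementation.

```python
import math
from typing import Mapping, Sequence


def stratified_jackknife_se(
    loo_estimates: Mapping[str, Sequence[float]],
    sampling_fraction: Mapping[str, float],
) -> float:
    """SE from PSU-level leave-one-out estimates grouped by stratum.

    loo_estimates[h] holds one estimate per left-out PSU in stratum h.
    sampling_fraction[h] is f_h; a missing stratum is treated as f_h = 0.
    """
    variance = 0.0
    for h, taus in loo_estimates.items():
        n_h = len(taus)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least 2 PSUs")
        # Assumption: tau_bar_h is the within-stratum mean of the LOO estimates.
        tau_bar = sum(taus) / n_h
        sum_sq = sum((t - tau_bar) ** 2 for t in taus)
        f_h = sampling_fraction.get(h, 0.0)
        variance += (1.0 - f_h) * (n_h - 1) / n_h * sum_sq
    return math.sqrt(variance)
```

With a single stratum holding estimates `[1.0, 3.0]` and `f_h = 0`, the sum of squares is 2 and the factor `(n_h - 1) / n_h` is 1/2, so the SE is 1.0; setting `f_h = 0.75` shrinks it to 0.5. No comparable term exists in the placebo formula, which is why FPC is a no-op there.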

docs/methodology/survey-theory.md

Lines changed: 20 additions & 13 deletions
@@ -749,20 +749,27 @@ Two bootstrap strategies interact with survey designs:
   for the full objective and the argmin-set caveat.
 
 - **Stratified permutation placebo** (SyntheticDiD): SDID's full-design
-  placebo variance allocator. For each placebo draw, pseudo-treated
-  indices are sampled uniformly without replacement from controls
-  *within each stratum containing actual treated units* (classical
-  stratified permutation test — Pesarin 2001). Pseudo-treated means
-  are survey-weighted; weighted-FW re-estimates ω and λ per draw with
-  ``rw_control`` threaded into both loss and regularization. Post-
-  optimization composition ``ω_eff = rw · ω / Σ(rw · ω)`` with zero-
-  mass retry. SE follows Arkhangelsky Algorithm 4:
+  placebo variance allocator (triggered when ``strata`` and/or ``psu``
+  is declared on the ``SurveyDesign``). For each placebo draw,
+  pseudo-treated indices are sampled uniformly without replacement
+  from controls *within each stratum containing actual treated units*
+  (classical stratified permutation test — Pesarin 2001).
+  Pseudo-treated means are survey-weighted; weighted-FW re-estimates
+  ω and λ per draw with ``rw_control`` threaded into both loss and
+  regularization. Post-optimization composition
+  ``ω_eff = rw · ω / Σ(rw · ω)`` with zero-mass retry. SE follows
+  Arkhangelsky Algorithm 4:
   ``sqrt((r-1)/r) · std(placebo_estimates, ddof=1)``. Fit-time
-  feasibility guards raise ``ValueError`` when a treated-containing
-  stratum has 0 controls or fewer controls than treated units (the
-  permutation allocator requires ``n_controls_h ≥ n_treated_h`` by
-  construction). See REGISTRY.md §SyntheticDiD ``Note (survey +
-  placebo composition)``.
+  feasibility guards raise ``ValueError`` on three failure cases:
+  Case B (treated stratum has 0 controls), Case C (fewer controls
+  than treated in a treated stratum), and Case D (every treated
+  stratum is exact-count ``n_c == n_t`` → permutation support = 1).
+  ``SurveyDesign(fpc=...)`` is a documented no-op for placebo —
+  permutation tests are conditional on the observed sample (Pesarin
+  2001 §1.5), so the sampling fraction does not enter Algorithm 4 or
+  its survey extension. An ``fpc=`` column emits a ``UserWarning`` and
+  is not part of the placebo dispatch trigger. See REGISTRY.md
+  §SyntheticDiD ``Note (survey + placebo composition)``.
 
 - **PSU-level leave-one-out with stratum aggregation** (SyntheticDiD):
   SDID's full-design jackknife variance allocator, matching the
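
The feasibility guards, the within-stratum draw, and the placebo SE formula from the stratified-permutation bullet can be sketched with the standard library alone. All function names and the dict-based inputs below are hypothetical, chosen for illustration; the weighted-FW re-estimation step is omitted, and none of this is taken from the library's source.

```python
import math
import random
import statistics
from typing import Dict, List, Sequence


def check_feasibility(n_treated: Dict[str, int], n_controls: Dict[str, int]) -> None:
    """Raise ValueError for the Case B, C, and D failures described above."""
    treated_strata = [h for h, n_t in n_treated.items() if n_t > 0]
    for h in treated_strata:
        n_c = n_controls.get(h, 0)
        if n_c == 0:
            raise ValueError(f"Case B: treated stratum {h!r} has 0 controls")
        if n_c < n_treated[h]:
            raise ValueError(f"Case C: stratum {h!r} has fewer controls than treated")
    if treated_strata and all(n_controls[h] == n_treated[h] for h in treated_strata):
        raise ValueError("Case D: every treated stratum has n_c == n_t (support = 1)")


def draw_pseudo_treated(
    control_strata: Sequence[str], n_treated: Dict[str, int], rng: random.Random
) -> List[int]:
    """Sample control indices without replacement within each treated stratum."""
    picks: List[int] = []
    for h, n_t in n_treated.items():
        if n_t == 0:
            continue
        pool = [i for i, s in enumerate(control_strata) if s == h]
        picks.extend(rng.sample(pool, n_t))
    return sorted(picks)


def placebo_se(placebo_estimates: Sequence[float]) -> float:
    """sqrt((r-1)/r) * std(placebo_estimates, ddof=1)."""
    r = len(placebo_estimates)
    # statistics.stdev is the sample standard deviation (ddof=1).
    return math.sqrt((r - 1) / r) * statistics.stdev(placebo_estimates)
```

Nothing in `draw_pseudo_treated` or `placebo_se` takes a sampling fraction as input, which is consistent with the no-op contract for `fpc=` stated above.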

tests/test_survey_phase5.py

Lines changed: 58 additions & 0 deletions
@@ -894,6 +894,64 @@ def test_placebo_full_design_se_differs_from_pweight_only(
         assert result_pw.att == pytest.approx(result_full.att, abs=1e-10)
         assert result_pw.se != pytest.approx(result_full.se, abs=1e-6)
 
+    def test_placebo_fpc_alone_no_op_warns_and_matches_pweight_only(
+        self, sdid_survey_data_full_design
+    ):
+        """R8 P1 fix: ``fpc=`` alone does not flip placebo dispatch.
+
+        Permutation tests condition on the observed sample (Pesarin 2001
+        §1.5), so FPC's sampling-fraction adjustment doesn't enter
+        Algorithm 4 or its stratified-permutation survey extension. The
+        previous dispatcher routed any ``fpc is not None`` design through
+        ``_placebo_variance_se_survey`` (weighted-FW per draw), silently
+        changing numerics relative to the no-FPC fit even though FPC
+        played no role in the math.
+
+        The fix gates placebo's survey-path dispatch on
+        ``strata is not None OR psu is not None`` only, and emits a
+        ``UserWarning`` whenever FPC is set on a placebo fit. This test
+        asserts both: (a) the warning fires and (b) ``SE`` matches the
+        pweight-only-no-FPC fit at ``rel=1e-12`` (FPC truly is a no-op).
+        """
+        df = sdid_survey_data_full_design.copy()
+        df["fpc_col"] = 1000.0  # any positive value — no-op on placebo
+
+        sd_fpc_only = SurveyDesign(weights="weight", fpc="fpc_col")
+        sd_pweight_only = SurveyDesign(weights="weight")
+
+        est_fpc = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
+        with pytest.warns(
+            UserWarning,
+            match=r"SurveyDesign\(fpc=\.\.\.\) is a no-op on variance_method='placebo'",
+        ):
+            r_fpc = est_fpc.fit(
+                df,
+                outcome="outcome",
+                treatment="treated",
+                unit="unit",
+                time="time",
+                post_periods=[6, 7, 8, 9],
+                survey_design=sd_fpc_only,
+            )
+
+        est_pw = SyntheticDiD(variance_method="placebo", n_bootstrap=50, seed=42)
+        r_pw = est_pw.fit(
+            df,
+            outcome="outcome",
+            treatment="treated",
+            unit="unit",
+            time="time",
+            post_periods=[6, 7, 8, 9],
+            survey_design=sd_pweight_only,
+        )
+
+        # FPC is documented as no-op for placebo: the SE under FPC must
+        # exactly match the SE without FPC (same dispatch path, no
+        # numerical drift from the routing flip the dispatcher used to
+        # introduce on `fpc is not None`).
+        assert r_fpc.se == pytest.approx(r_pw.se, rel=1e-12)
+        assert r_fpc.att == pytest.approx(r_pw.att, abs=1e-12)
+
     def test_placebo_full_design_psu_only_routes_through_survey_path(
         self, sdid_survey_data_jk_well_formed
     ):
