Commit 5515fbe

igerber and claude committed
Address PR #355 R7 P1 + P3: fit-time positive-mass guard + doc wording fix
R7 P1: the per-draw zero-mass retry in ``_bootstrap_se`` (PR #355 R2 P0) only covers bootstrap draws, not the fit-time ATT. Survey weights are non-negative post-resolve(), but all-zero mass on either arm is a valid input that encodes an unidentified target population. Without a fit-time guard the downstream ``np.average(Y, weights=w_treated)`` and ``omega_eff = unit_weights * w_control`` normalizations would hit 0/0 and silently propagate NaN through the bootstrap / placebo / jackknife dispatchers.

Front-door the case: after ``w_control`` / ``w_treated`` are sourced from the resolved unit-level design, raise ``ValueError`` if either arm's total mass is <= 0. Covers both pweight-only and strata/PSU/FPC paths.

Three regression tests added: ``test_fit_raises_on_zero_total_treated_survey_mass``, ``test_fit_raises_on_zero_total_control_survey_mass``, and ``test_fit_raises_on_zero_treated_mass_under_full_design``.

R7 P3: the SDID row in ``docs/choosing_estimator.rst`` said "pweight only (placebo / jackknife); full (bootstrap)" in the **Weights** column, conflating weight-type support (fweight / aweight / pweight) with design-element support (strata / PSU / FPC). The code still hard-rejects non-pweight survey designs on every variance method. Narrow the wording to "pweight only" and leave "Via bootstrap" in the Strata/PSU/FPC column to describe design-element support.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 6f7eb8e commit 5515fbe

3 files changed

Lines changed: 96 additions & 1 deletion

diff_diff/synthetic_did.py

Lines changed: 28 additions & 0 deletions
@@ -456,6 +456,34 @@ def fit( # type: ignore[override]
             w_treated = resolved_survey_unit.weights[n_control_for_split:].astype(
                 np.float64
             )
+            # Front-door positive-mass guard (PR #355 R7 P1). Survey weights
+            # are non-negative post-resolve() (survey.py L171-L176 rejects
+            # negatives), but all-zero mass on either arm is reachable — the
+            # user can assign unit survey weights of 0 to every treated or
+            # every control unit, which encodes an unidentified target
+            # population. The fit-time ATT formulas downstream
+            # (``np.average(..., weights=w_treated)`` around L551-L582 and
+            # ``omega_eff = unit_weights * w_control`` in the bootstrap /
+            # placebo / jackknife dispatchers) would otherwise hit 0/0
+            # normalization or propagate NaNs silently. The bootstrap loop
+            # already has per-draw zero-mass retries for degenerate resamples
+            # (PR #355 R2 P0); this guard is the fit-time analogue.
+            if w_control.sum() <= 0:
+                raise ValueError(
+                    "Survey-weighted control arm has zero total mass "
+                    f"(sum of w_control = {w_control.sum():.3g}). "
+                    "Every control unit has survey weight 0, so the target "
+                    "population is unidentified. Drop units with zero weight, "
+                    "or omit survey_design if unweighted estimation is intended."
+                )
+            if w_treated.sum() <= 0:
+                raise ValueError(
+                    "Survey-weighted treated arm has zero total mass "
+                    f"(sum of w_treated = {w_treated.sum():.3g}). "
+                    "Every treated unit has survey weight 0, so the target "
+                    "population is unidentified. Drop units with zero weight, "
+                    "or omit survey_design if unweighted estimation is intended."
+                )
         else:
             w_treated = None
             w_control = None
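
The failure mode this guard targets can be reproduced outside the library. The following is a minimal sketch, not the code in ``synthetic_did.py``: ``check_positive_mass``, ``effective_weights``, and the array values are hypothetical stand-ins, and only the ``omega_eff`` expression is taken from the commit's comments. Like the commit, it checks only each arm's total survey mass.

```python
import numpy as np


def check_positive_mass(w, arm):
    """Hypothetical stand-in for the fit-time guard: reject an arm with no survey mass."""
    total = float(np.sum(w))
    if total <= 0:
        raise ValueError(
            f"Survey-weighted {arm} arm has zero total mass (sum = {total:.3g})."
        )
    return total


def effective_weights(unit_weights, w_control):
    """Compose synthetic-control weights with survey weights, then renormalize."""
    check_positive_mass(w_control, "control")
    omega = unit_weights * w_control
    return omega / omega.sum()


unit_weights = np.array([0.5, 0.3, 0.2])

# Without a guard, all-zero survey weights turn the normalization into 0/0.
zeros = np.zeros(3)
with np.errstate(invalid="ignore"):
    unguarded = (unit_weights * zeros) / (unit_weights * zeros).sum()
print(np.isnan(unguarded).all())

# With the guard, positive-mass input passes through and is renormalized.
omega_eff = effective_weights(unit_weights, np.array([2.0, 1.0, 1.0]))
print(omega_eff.sum())
```

Raising before any normalization means the caller sees one actionable error instead of NaN estimates surfacing later in a variance routine.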

docs/choosing_estimator.rst

Lines changed: 1 addition & 1 deletion
@@ -783,7 +783,7 @@ estimation. The depth of support varies by estimator:
      - Full (analytical)
      - Multiplier at PSU
    * - ``SyntheticDiD``
-     - pweight only (placebo / jackknife); full (bootstrap)
+     - pweight only
      - Via bootstrap
      - --
      - Hybrid pairs-bootstrap + Rao-Wu rescaled (bootstrap only)

tests/test_methodology_sdid.py

Lines changed: 67 additions & 0 deletions
@@ -842,6 +842,73 @@ def capturing_helper(Y_pre_c, Y_pre_t_mean, rw, *args, **kwargs):
             ),
         )
 
+    def test_fit_raises_on_zero_total_treated_survey_mass(self):
+        """Fit-time positive-mass guard: zero treated survey mass raises.
+
+        ``SurveyDesign.resolve()`` accepts non-negative unit weights
+        (``survey.py`` L171-L176), so a user can legitimately assign unit
+        survey weights of 0 to every treated unit — encoding an
+        unidentified target population. Without the front-door guard, the
+        fit-time survey-weighted ATT (``np.average(Y, weights=w_treated)``)
+        would hit ``0/0`` and silently propagate NaN into the bootstrap
+        loop, defeating the per-draw zero-mass retry (PR #355 R2 P0).
+        Regression against PR #355 R7 P1: the guard must fire before the
+        bootstrap is even dispatched.
+        """
+        from diff_diff.survey import SurveyDesign
+
+        df = _make_panel(n_control=10, n_treated=3, seed=42)
+        # Every treated unit gets weight 0; controls keep positive weight.
+        df["wt"] = np.where(df["treated"] == 1, 0.0, 1.0)
+        with pytest.raises(ValueError, match=r"treated arm has zero total mass"):
+            SyntheticDiD(variance_method="bootstrap", n_bootstrap=20, seed=1).fit(
+                df, outcome="outcome", treatment="treated",
+                unit="unit", time="period",
+                post_periods=[5, 6, 7],
+                survey_design=SurveyDesign(weights="wt"),
+            )
+
+    def test_fit_raises_on_zero_total_control_survey_mass(self):
+        """Fit-time positive-mass guard: zero control survey mass raises.
+
+        Mirror of the treated-arm case (PR #355 R7 P1). Downstream
+        ``omega_eff = unit_weights * w_control / (unit_weights * w_control).sum()``
+        would hit 0/0; the guard front-doors.
+        """
+        from diff_diff.survey import SurveyDesign
+
+        df = _make_panel(n_control=10, n_treated=3, seed=42)
+        df["wt"] = np.where(df["treated"] == 0, 0.0, 1.0)
+        with pytest.raises(ValueError, match=r"control arm has zero total mass"):
+            SyntheticDiD(variance_method="bootstrap", n_bootstrap=20, seed=1).fit(
+                df, outcome="outcome", treatment="treated",
+                unit="unit", time="period",
+                post_periods=[5, 6, 7],
+                survey_design=SurveyDesign(weights="wt"),
+            )
+
+    def test_fit_raises_on_zero_treated_mass_under_full_design(self):
+        """Fit-time positive-mass guard fires under full strata/PSU/FPC too.
+
+        The guard sources w_control / w_treated from the **resolved
+        unit-level** design (PR #355 R4 P0), so zero total treated mass
+        under a strata/PSU/FPC configuration must fire the same front-door
+        ValueError as the pweight-only case (PR #355 R7 P1).
+        """
+        from diff_diff.survey import SurveyDesign
+
+        df = _make_panel(n_control=10, n_treated=3, seed=42)
+        df["wt"] = np.where(df["treated"] == 1, 0.0, 1.0)
+        df["stratum"] = df["unit"] % 2
+        df["psu"] = df["unit"]
+        with pytest.raises(ValueError, match=r"treated arm has zero total mass"):
+            SyntheticDiD(variance_method="bootstrap", n_bootstrap=20, seed=1).fit(
+                df, outcome="outcome", treatment="treated",
+                unit="unit", time="period",
+                post_periods=[5, 6, 7],
+                survey_design=SurveyDesign(weights="wt", strata="stratum", psu="psu"),
+            )
+
     def test_bootstrap_scale_invariance_under_pweight_rescaling(self):
         """Survey-bootstrap SE / p / CI are invariant to a global pweight rescaling.
 
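
The docstrings above say the guard must fire before the bootstrap is dispatched. One way to see why a per-draw retry alone cannot cover the fit-time case is the sketch below. It is an illustration under stated assumptions: ``draw_with_retry`` is a hypothetical helper, not the library's ``_bootstrap_se``, and the retry cap of 50 is arbitrary. When only some units have zero weight, a degenerate resample can be redrawn successfully; when every unit in an arm has zero weight, every resample is degenerate and the retries are simply exhausted.

```python
import numpy as np


def draw_with_retry(weights, rng, max_retries=50):
    """Resample units with replacement, retrying while the draw has zero total weight.

    Hypothetical helper. Returns (indices, attempts), or (None, max_retries)
    if every attempt produced a zero-mass draw.
    """
    n = len(weights)
    for attempt in range(1, max_retries + 1):
        idx = rng.integers(0, n, size=n)
        if weights[idx].sum() > 0:
            return idx, attempt
    return None, max_retries


rng = np.random.default_rng(0)

# Some zero weights: a single draw may be degenerate, but a retry can succeed.
mixed = np.array([0.0, 0.0, 1.0])
idx, attempts = draw_with_retry(mixed, rng)

# All-zero weights: no resample can have positive mass, so retries run out.
all_zero = np.zeros(3)
idx_zero, attempts_zero = draw_with_retry(all_zero, rng)

print(idx is not None, idx_zero is None, attempts_zero)
```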
