
Commit e622d60

igerber and claude committed
Address PR #346 CI review round 4: P1 d_lower > 0 on Design 1 paths
**P1 (Methodology): Design 1 paths must reject d_lower = 0**

Paper Section 3.2 partitions HAD by regime: `d_lower = 0` is Design 1' (`continuous_at_zero`); `d_lower > 0` is Design 1 (`continuous_near_d_lower` or `mass_point`). The auto-detect rule already respects this partition, but explicit overrides previously allowed `design="mass_point", d_lower=0` and `design="continuous_near_d_lower"` on a `d.min()==0` sample to run silently, returning a paper-incompatible estimand (2SLS with degenerate single-unit mass for mass_point; Design 1' algebra relabeled as `WAS_d_lower` with a spurious Assumption 5/6 warning for continuous_near_d_lower).

Fix: add a fit-time guard that raises `ValueError` when `resolved_design in ("mass_point", "continuous_near_d_lower")` and the resolved `d_lower_val` is within float tolerance of zero (same tolerance family as `_detect_design`'s d.min()==0 tie-break). The error message points users to `continuous_at_zero` or `auto` for samples with support infimum at zero.

**Docstring + test updates:**

- Rewrote the `design` parameter docstring to document the regime-partition contract precisely: each explicit override is now described with its d_lower precondition and mass-point compatibility.
- Rewrote the `d_lower` parameter docstring to note the Design-1-requires-positive contract.
- Inverted the prior `test_force_mass_point_on_continuous_data_at_support_infimum` test (which incorrectly codified the unsupported behavior) into three rejection regressions: `test_force_mass_point_on_d_lower_zero_sample_raises`, `test_force_continuous_near_d_lower_on_d_lower_zero_sample_raises`, `test_force_mass_point_d_lower_none_on_zero_sample_raises`.

Targeted regression: 135 HAD tests + 514 total across Phase 1 and adjacent surfaces, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent ddc09e4 commit e622d60

2 files changed

Lines changed: 75 additions & 13 deletions


diff_diff/had.py

Lines changed: 50 additions & 4 deletions
@@ -828,13 +828,36 @@ class HeterogeneousAdoptionDiD:
     design : {"auto", "continuous_at_zero", "continuous_near_d_lower", "mass_point"}
         Design-dispatch strategy. Defaults to ``"auto"`` which resolves
         via the REGISTRY auto-detect rule on the fitted dose data
-        (see :func:`_detect_design`). Explicit overrides run the chosen
-        path without auto-reject (e.g., forcing Design 1' on mass-point
-        data runs the nonparametric fit even though the paper would
-        counsel the 2SLS path).
+        (see :func:`_detect_design`).
+
+        Explicit overrides are checked against the paper's
+        regime-partition contract (Section 3.2) at fit time:
+
+        - ``"continuous_at_zero"`` (Design 1'): paper requires the
+          support infimum ``d_lower = 0``. Phase 1c's
+          ``_validate_had_inputs`` rejects mass-point samples passed
+          to this path.
+        - ``"continuous_near_d_lower"`` (Design 1, continuous density
+          near ``d_lower``): requires ``d_lower > 0`` and a
+          non-mass-point sample (modal fraction at ``d.min()`` must be
+          <= 2%). ``d_lower`` must equal ``float(d.min())`` within
+          float tolerance; non-support-infimum thresholds are off-
+          support and raise.
+        - ``"mass_point"`` (Design 1 mass-point): requires
+          ``d_lower > 0`` and ``d_lower == float(d.min())`` within
+          float tolerance. Forcing this design on a ``d_lower = 0``
+          sample raises (use ``"continuous_at_zero"`` or ``"auto"``).
+
+        Mismatched overrides raise ``ValueError`` pointing at the
+        correct design rather than silently identifying a different
+        estimand.
     d_lower : float or None
         Support infimum ``d_lower``. ``None`` means use ``0.0`` on the
         Design 1' path and ``float(d.min())`` on the other two paths.
+        On Design 1 paths (``continuous_near_d_lower`` and
+        ``mass_point``), an explicit ``d_lower`` must equal
+        ``float(d.min())`` within float tolerance AND must be strictly
+        positive; zero-valued or mismatched thresholds raise.
     kernel : {"epanechnikov", "triangular", "uniform"}
         Forwarded to :func:`bias_corrected_local_linear` on the continuous
         paths. Ignored on the mass-point path.
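Reviewer note: the docstring contract in this hunk can be read as a small lookup plus a default-resolution rule. The sketch below is only an illustration of that reading. `REGIME`, `resolve_d_lower`, and `regime_ok` are hypothetical names that do not appear in `diff_diff/had.py`, and the exact comparisons stand in for the float-tolerance checks the library is described as using.

```python
# Hypothetical summary of the documented contract; not the library's code.
REGIME = {
    "continuous_at_zero": "d_lower == 0",      # Design 1'
    "continuous_near_d_lower": "d_lower > 0",  # Design 1, continuous density
    "mass_point": "d_lower > 0",               # Design 1, mass point
}


def resolve_d_lower(design, d_lower, d_min):
    """Apply the documented default: None -> 0.0 on Design 1', else d.min()."""
    if d_lower is not None:
        return float(d_lower)
    return 0.0 if design == "continuous_at_zero" else float(d_min)


def regime_ok(design, d_lower, d_min):
    """Return True when the resolved d_lower sits in the design's regime.

    Exact comparisons are a simplification; the real guard is tolerance-based.
    """
    value = resolve_d_lower(design, d_lower, d_min)
    if REGIME[design] == "d_lower > 0":
        return value > 0.0
    return value == 0.0


# d_lower=None on a sample whose minimum dose is 0 resolves to 0.0,
# which is outside the Design 1 regime.
print(regime_ok("mass_point", None, 0.0))          # False
print(regime_ok("mass_point", None, 0.5))          # True
print(regime_ok("continuous_at_zero", None, 0.0))  # True
```

The first call mirrors the `d_lower=None` case that the third new regression test covers: leaving `d_lower` unset does not bypass the positivity requirement.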
@@ -1070,6 +1093,29 @@ def fit(
         else:
             d_lower_val = float(d_lower_arg)

+        # ---- Regime partition: d_lower > 0 for Design 1 paths ----
+        # Paper Section 3.2 partitions HAD into d_lower = 0 (Design 1',
+        # continuous_at_zero) and d_lower > 0 (Design 1, continuous_near
+        # _d_lower or mass_point). The auto-detect rule already enforces
+        # this partition; explicit overrides must respect it too, otherwise
+        # `design="mass_point", d_lower=0` returns a finite but
+        # paper-incompatible 2SLS result and `design="continuous_near_d_lower"`
+        # with d_lower=0 reduces to Design 1' algebra while mislabeling the
+        # estimand as `WAS_d_lower` and emitting the wrong Assumption 5/6
+        # warning. Use the same float-tolerance family as _detect_design's
+        # d.min()==0 tie-break.
+        if resolved_design in ("mass_point", "continuous_near_d_lower"):
+            scale = max(1.0, float(np.max(np.abs(d_arr))))
+            if abs(d_lower_val) <= 1e-12 * scale:
+                raise ValueError(
+                    f"design={resolved_design!r} requires d_lower > 0 (paper "
+                    f"Section 3.2 reserves the d_lower=0 regime for Design 1' "
+                    f"/ `continuous_at_zero`). Got d_lower={d_lower_val!r}. "
+                    f"Use design='continuous_at_zero' (explicit) or "
+                    f"design='auto' (auto-detect) for samples with support "
+                    f"infimum at zero."
+                )
+
         # ---- Original-scale mass-point check before the regressor shift ----
         # When a user explicitly forces design="continuous_near_d_lower"
         # on a sample that is actually a mass-point sample (modal fraction
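Reviewer note: the zero test in this hunk is scale-relative, so the cutoff grows with the largest absolute dose and never drops below `1e-12`. A minimal standalone sketch of that comparison follows. `is_effectively_zero` is a hypothetical helper name, and plain Python lists replace the NumPy array so the snippet has no dependencies.

```python
def is_effectively_zero(d_lower_val, doses, rel_tol=1e-12):
    """Scale-relative zero test modeled on the guard in the hunk above.

    `doses` is any non-empty sequence of floats; the hunk itself uses
    np.max(np.abs(d_arr)) on a NumPy array instead of builtins.
    """
    scale = max(1.0, max(abs(x) for x in doses))
    return abs(d_lower_val) <= rel_tol * scale


large_doses = [0.0, 0.4, 2.5e6]                    # max |d| = 2.5e6
unit_doses = [x / 2.5e6 for x in large_doses]      # max |d| = 1.0

# Threshold on large_doses is 1e-12 * 2.5e6 = 2.5e-6.
print(is_effectively_zero(0.0, large_doses))   # True
print(is_effectively_zero(1e-7, large_doses))  # True: 1e-7 <= 2.5e-6
# Threshold on unit_doses is 1e-12 * 1.0.
print(is_effectively_zero(1e-7, unit_doses))   # False: 1e-7 > 1e-12
```

One consequence worth keeping in mind when reviewing: the same `d_lower` can be treated as zero on a wide dose scale and as strictly positive after rescaling, which is the intended behavior of a relative tolerance.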

tests/test_had.py

Lines changed: 25 additions & 9 deletions
@@ -1108,20 +1108,36 @@ def test_force_continuous_at_zero_on_mass_point_data(self):
         with pytest.raises(NotImplementedError, match="mass-point"):
             est.fit(panel, "outcome", "dose", "period", "unit")

-    def test_force_mass_point_on_continuous_data_at_support_infimum(self):
-        """Forcing mass-point on continuous data with d_lower=d.min() runs.
+    def test_force_mass_point_on_d_lower_zero_sample_raises(self):
+        """Review P1 round 4: Design 1 paths require d_lower > 0.

-        d.min()==0 exactly in this DGP, so d_lower=0.0 is the paper-consistent
-        support-infimum threshold. The resulting mass=1 case exercises the
-        degenerate-mass boundary (only unit 0 is "at d_lower"; rest are above).
+        Paper Section 3.2 reserves the d_lower=0 regime for Design 1'
+        (continuous_at_zero). Forcing `mass_point` on a sample with
+        d.min()==0 must raise, pointing the user to continuous_at_zero
+        or auto.
         """
         d, dy = _dgp_continuous_at_zero(500, seed=0)
         panel = _make_panel(d, dy)
-        # d.min() == 0 exactly (d[0]=0 by construction); d_lower must match.
         est = HeterogeneousAdoptionDiD(design="mass_point", d_lower=0.0)
-        r = est.fit(panel, "outcome", "dose", "period", "unit")
-        assert r.design == "mass_point"
-        assert r.d_lower == 0.0
+        with pytest.raises(ValueError, match=r"d_lower > 0|Design 1'"):
+            est.fit(panel, "outcome", "dose", "period", "unit")
+
+    def test_force_continuous_near_d_lower_on_d_lower_zero_sample_raises(self):
+        """Parallel: continuous_near_d_lower must also reject d_lower=0."""
+        d, dy = _dgp_continuous_at_zero(500, seed=0)
+        panel = _make_panel(d, dy)
+        est = HeterogeneousAdoptionDiD(design="continuous_near_d_lower")
+        # d_lower auto-resolves to float(d.min()) == 0.0 on this DGP.
+        with pytest.raises(ValueError, match=r"d_lower > 0|Design 1'"):
+            est.fit(panel, "outcome", "dose", "period", "unit")
+
+    def test_force_mass_point_d_lower_none_on_zero_sample_raises(self):
+        """d_lower=None on a d.min()==0 sample resolves to 0; must still raise."""
+        d, dy = _dgp_continuous_at_zero(500, seed=0)
+        panel = _make_panel(d, dy)
+        est = HeterogeneousAdoptionDiD(design="mass_point", d_lower=None)
+        with pytest.raises(ValueError, match=r"d_lower > 0"):
+            est.fit(panel, "outcome", "dose", "period", "unit")



 # =============================================================================
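Reviewer note: `pytest.raises(..., match=...)` applies `re.search` to the string form of the exception, so the alternation `d_lower > 0|Design 1'` passes if either phrase appears. The sketch below rebuilds the message from the f-string in the fit-time guard and compares the two patterns used in these tests. The `reworded` string is a hypothetical message invented only to show where the patterns differ.

```python
import re

# Message text assembled from the f-string in the fit-time guard.
design, d_lower_val = "mass_point", 0.0
message = (
    f"design={design!r} requires d_lower > 0 (paper "
    f"Section 3.2 reserves the d_lower=0 regime for Design 1' "
    f"/ `continuous_at_zero`). Got d_lower={d_lower_val!r}. "
    f"Use design='continuous_at_zero' (explicit) or "
    f"design='auto' (auto-detect) for samples with support "
    f"infimum at zero."
)

loose = r"d_lower > 0|Design 1'"   # used by the first two tests
strict = r"d_lower > 0"            # used by the d_lower=None test

matches_loose = re.search(loose, message) is not None
matches_strict = re.search(strict, message) is not None

# Hypothetical reworded message that drops the "d_lower > 0" phrase.
reworded = "Design 1' owns the zero-infimum regime; pick continuous_at_zero."
reworded_loose = re.search(loose, reworded) is not None
reworded_strict = re.search(strict, reworded) is not None

print(matches_loose, matches_strict)    # True True
print(reworded_loose, reworded_strict)  # True False
```

With the current message both patterns match. The difference only matters if the wording changes later: the strict pattern pins the `d_lower > 0` phrase, while the alternation would keep passing on any message that still mentions Design 1'.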
