Skip to content

Commit f30e1f7

Browse files
igerberclaude
andcommitted
Tighten SyntheticDiD set_params and TWFE Conley + wild_bootstrap guards
Closes two no-silent-failure gaps surfaced after the initial Phase 1 commit: 1. SyntheticDiD.set_params() now mirrors __init__'s Conley rejection contract. The constructor correctly raises TypeError on vcov_type="conley" / conley_*=... but set_params() previously only checked hasattr(self, key) and silently accepted those kwargs because SyntheticDiD inherits the conley_* attributes from DifferenceInDifferences. A user calling est.set_params(vcov_type="conley", conley_cutoff_km=100.0) followed by est.fit(...) would have gotten the bootstrap/jackknife/placebo variance silently with no Conley computation -- forbidden by feedback_no_silent_failures. The new set_params() rejects non-None vcov_type and any non-None conley_* kwarg with TypeError before mutation; None values for these keys are passthrough no-ops so get_params() -> set_params() round-trips cleanly. SyntheticDiD.get_params() now surfaces the inherited conley_* keys with None values for sklearn-style API consistency. 2. TwoWayFixedEffects(vcov_type="conley", inference="wild_bootstrap") now raises NotImplementedError. Conley analytical spatial-HAC and wild cluster bootstrap are different inference paths; combining them would route the bootstrap branch with cluster_ids=None (TWFE auto-cluster is disabled under Conley) and fail with a non-targeted error inside wild_bootstrap_se. Use inference='analytical' for Conley spatial HAC, or vcov_type='hc1' with inference='wild_bootstrap'. 3. DifferenceInDifferences and MultiPeriodDiD class docstrings now list vcov_type="conley" in the enum and document the four conley_* params (previously the Conley path was documented in REGISTRY/CHANGELOG/llms.txt but the in-code class docs still listed only {classical, hc1, hc2, hc2_bm}). 4. New tests: - SyntheticDiD().set_params(vcov_type="conley") raises TypeError + state unchanged - SyntheticDiD().set_params(conley_cutoff_km=100.0) raises + state unchanged - SyntheticDiD().get_params() includes conley_* keys with None values; round-trip set_params(**get_params()) is a no-op - TwoWayFixedEffects(vcov_type="conley", inference="wild_bootstrap") raises 5. TODO.md entries added for deferred follow-ups: callable-conley_metric shape/finiteness/symmetry validation, common Conley estimator-validator helper extraction, and a stronger TWFE Conley FWL invariance test that actually compares TWFE-within Conley to a full-dummy FE design (the current test only asserts finite SEs). 74 Conley tests pass (70 prior + 4 new); 261 tests pass on the targeted regression surface (test_conley_vcov, test_estimators_vcov_type, test_linalg, test_linalg_hc2_bm). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 9bfe95b commit f30e1f7

5 files changed

Lines changed: 165 additions & 2 deletions

File tree

TODO.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -117,6 +117,9 @@ Deferred items from PR reviews that were not addressed before merge.
117117
| Conley + survey weights / `survey_design`. Score-reweighted meat `s_i = w_i · X_i · ε_i` is mechanical, but PSU clustering interaction with the spatial kernel and replicate-weights variance under spatial correlation are non-trivial (Bertanha-Imbens 2014 covers cluster-sample but not the explicit Conley case). Phase 5 of the spillover-conley initiative; paper review prerequisite. Currently raises `NotImplementedError`. | `linalg.py::_validate_vcov_args`, `twfe.py`, `estimators.py` | Phase 5 (spillover-conley) | Medium |
118118
| Conley + `absorb=` (arbitrary FE projection beyond TWFE's two-FE within-transformation). FWL composability is proven analytically for TWFE's fixed two-FE design but not formally verified for arbitrary `absorb` dimensions; conservatively rejected at fit-time with a redirect to `fixed_effects=` dummies. Lift after empirical verification on multi-FE within-transformations. | `estimators.py::DifferenceInDifferences.fit`, `MultiPeriodDiD.fit` | follow-up (spillover-conley) | Low |
119119
| `SyntheticDiD(vcov_type="conley")` support. Currently raises `TypeError` at `__init__` because SyntheticDiD uses `variance_method ∈ {bootstrap, jackknife, placebo}` rather than the analytical sandwich that Conley plugs into. Wiring would require either reimplementing an analytical sandwich path for SyntheticDiD or designing a spatial-block bootstrap (new methodology, Politis-Romano 1994 territory). | `synthetic_did.py::SyntheticDiD` | follow-up (spillover-conley) | Low |
120+
| Validate user-supplied callable `conley_metric` for shape `(n, n)`, finiteness, non-negativity, and symmetry. Currently `np.asarray(metric(coords, coords))` is accepted unchecked; a malformed callable produces opaque matmul errors and a non-symmetric distance matrix produces a non-symmetric vcov. CI reviewer flagged as P2 M3 in PR #(spillover-conley). | `diff_diff/conley.py::_pairwise_distance_matrix`, `_compute_conley_vcov` | follow-up (spillover-conley) | Low |
121+
| Extract common Conley estimator-level validation helper. Today `cluster=`, `survey_design=`, `conley_coords=`, and `conley_cutoff_km=` checks are duplicated across `DifferenceInDifferences.fit` (estimators.py:~370-400), `MultiPeriodDiD.fit` (estimators.py:~1395-1455), and `TwoWayFixedEffects.fit` (twfe.py:~165-205). A future Conley-feature change risks updating one estimator but not the others. CI reviewer flagged as P2 MT1. | `diff_diff/estimators.py`, `diff_diff/twfe.py` | follow-up (spillover-conley) | Low |
122+
| Strengthen `tests/test_conley_vcov.py::TestConleyTWFE::test_twfe_conley_FWL_invariance` to actually verify FWL equivalence between TWFE-within Conley and a full-dummy-FE design (build the dummy regression explicitly and compare the ATT coefficient + Conley SE). The current test only asserts both fits produce finite SEs — the name overstates the assertion. CI reviewer flagged as P2 DT3. | `tests/test_conley_vcov.py` | follow-up (spillover-conley) | Low |
120123

121124
#### Performance
122125

diff_diff/estimators.py

Lines changed: 36 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,7 @@ class DifferenceInDifferences:
5757
``vcov_type``: with ``"hc1"`` dispatches to CR1 (Liang-Zeger); with
5858
``"hc2_bm"`` dispatches to CR2 Bell-McCaffrey (Pustejovsky-Tipton 2018
5959
symmetric-sqrt + Satterthwaite DOF).
60-
vcov_type : {"classical", "hc1", "hc2", "hc2_bm"}, optional
60+
vcov_type : {"classical", "hc1", "hc2", "hc2_bm", "conley"}, optional
6161
Variance-covariance family. Defaults to the ``robust`` alias.
6262
6363
- ``"classical"``: non-robust OLS SEs, ``sigma_hat^2 * (X'X)^{-1}``.
@@ -69,6 +69,11 @@ class DifferenceInDifferences:
6969
with ``cluster=``, Pustejovsky-Tipton (2018) CR2 cluster-robust.
7070
(Note: ``MultiPeriodDiD`` does NOT yet support ``cluster=`` with
7171
``"hc2_bm"`` — see ``MultiPeriodDiD`` docstring and REGISTRY.md.)
72+
- ``"conley"``: Conley (1999) spatial-HAC sandwich. Requires
73+
``conley_coords`` (lat/lon column tuple) and ``conley_cutoff_km``
74+
(positive bandwidth — REQUIRED, no default per the no-silent-failures
75+
rule). Combining with ``cluster=``, ``survey_design=``, or ``absorb=``
76+
raises ``NotImplementedError`` (deferred to Phase 2+).
7277
alpha : float, default=0.05
7378
Significance level for confidence intervals.
7479
inference : str, default="analytical"
@@ -88,6 +93,20 @@ class DifferenceInDifferences:
8893
- "warn": Issue warning and drop linearly dependent columns (default)
8994
- "error": Raise ValueError
9095
- "silent": Drop columns silently without warning
96+
conley_coords : tuple of (str, str), optional
97+
Column-name tuple ``(lat_col, lon_col)`` for Conley spatial HAC SE.
98+
Required when ``vcov_type="conley"``; raises ``ValueError`` otherwise.
99+
conley_cutoff_km : float, optional
100+
Positive finite bandwidth in km (haversine) or coord units (euclidean).
101+
Required when ``vcov_type="conley"``; no default per Conley 1999
102+
Section 5 sensitivity-grid recommendation.
103+
conley_metric : str, default "haversine"
104+
Distance metric: ``"haversine"`` (lat/lon, km), ``"euclidean"`` (any
105+
units), or a callable ``(coords1, coords2) -> n×n``.
106+
conley_kernel : str, default "bartlett"
107+
Kernel function: ``"bartlett"`` (PSD-guaranteed, default) or
108+
``"uniform"`` (emits ``UserWarning`` if the meat has a materially
109+
negative eigenvalue per Conley 1999 footnote 11).
91110
92111
Attributes
93112
----------
@@ -1055,7 +1074,7 @@ class MultiPeriodDiD(DifferenceInDifferences):
10551074
``TODO.md``; also documented as a Note in
10561075
``docs/methodology/REGISTRY.md`` under the HeterogeneousAdoptionDiD
10571076
requirements-checklist block.
1058-
vcov_type : {"classical", "hc1", "hc2", "hc2_bm"}, optional
1077+
vcov_type : {"classical", "hc1", "hc2", "hc2_bm", "conley"}, optional
10591078
Variance-covariance family. Defaults to the ``robust`` alias.
10601079
10611080
- ``"classical"``: non-robust OLS SEs, ``sigma_hat^2 * (X'X)^{-1}``.
@@ -1066,8 +1085,23 @@ class MultiPeriodDiD(DifferenceInDifferences):
10661085
- ``"hc2_bm"``: one-way HC2 + Imbens-Kolesar (2016) Satterthwaite DOF
10671086
per coefficient plus a contrast-aware DOF for the post-period-average
10681087
ATT. **Unsupported with** ``cluster=`` — see ``cluster`` above.
1088+
- ``"conley"``: Conley (1999) spatial-HAC sandwich. Requires
1089+
``conley_coords`` and ``conley_cutoff_km``. Combining with
1090+
``cluster=``, ``survey_design=``, or ``absorb=`` raises
1091+
``NotImplementedError`` (deferred to Phase 2+).
10691092
alpha : float, default=0.05
10701093
Significance level for confidence intervals.
1094+
conley_coords : tuple of (str, str), optional
1095+
Column-name tuple ``(lat_col, lon_col)`` for Conley spatial HAC SE.
1096+
Required when ``vcov_type="conley"``.
1097+
conley_cutoff_km : float, optional
1098+
Positive finite bandwidth for Conley spatial HAC. Required when
1099+
``vcov_type="conley"`` (no default per Conley 1999 sensitivity-grid).
1100+
conley_metric : str, default "haversine"
1101+
Distance metric for Conley: ``"haversine"`` (lat/lon, km),
1102+
``"euclidean"``, or a callable.
1103+
conley_kernel : str, default "bartlett"
1104+
Conley kernel: ``"bartlett"`` (PSD-guaranteed) or ``"uniform"``.
10711105
10721106
Attributes
10731107
----------

diff_diff/synthetic_did.py

Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2706,6 +2706,16 @@ def get_params(self) -> Dict[str, Any]:
27062706
"variance_method": self.variance_method,
27072707
"n_bootstrap": self.n_bootstrap,
27082708
"seed": self.seed,
2709+
# Conley kwargs are inherited from DifferenceInDifferences.__init__
2710+
# but rejected by SyntheticDiD's __init__ / set_params (Conley uses
2711+
# the analytical sandwich, SyntheticDiD uses bootstrap variance).
2712+
# Surface them here as None for sklearn-style API consistency; any
2713+
# non-None value is rejected by set_params/__init__.
2714+
"vcov_type": None,
2715+
"conley_coords": None,
2716+
"conley_cutoff_km": None,
2717+
"conley_metric": None,
2718+
"conley_kernel": None,
27092719
}
27102720

27112721
def set_params(self, **params) -> "SyntheticDiD":
@@ -2715,16 +2725,54 @@ def set_params(self, **params) -> "SyntheticDiD":
27152725
post-update state, the instance is rolled back to the pre-call values
27162726
so a raised ``ValueError`` leaves the object consistent with its
27172727
pre-call configuration.
2728+
2729+
Mirrors ``__init__``'s defensive rejection of ``vcov_type`` /
2730+
``conley_*`` non-None values: SyntheticDiD uses bootstrap/jackknife/
2731+
placebo variance, not the analytical sandwich, so any Conley kwarg
2732+
would be silently ignored otherwise (forbidden by
2733+
``feedback_no_silent_failures``). Tracked in TODO.md for a follow-up
2734+
that wires Conley to a non-bootstrap variance path.
27182735
"""
2736+
# Reject Conley kwargs / non-None vcov_type before any mutation —
2737+
# mirrors __init__'s contract. Empty/None values are permitted so
2738+
# round-tripping get_params() back through set_params() is a no-op.
2739+
_conley_keys = ("conley_coords", "conley_cutoff_km", "conley_metric", "conley_kernel")
2740+
if params.get("vcov_type") is not None and params["vcov_type"] != "conley":
2741+
raise TypeError(
2742+
f"SyntheticDiD does not accept vcov_type={params['vcov_type']!r}. "
2743+
"SyntheticDiD's variance is bootstrap/jackknife/placebo based; "
2744+
"configure via variance_method=..."
2745+
)
2746+
if params.get("vcov_type") == "conley" or any(
2747+
k in params and params[k] is not None for k in _conley_keys
2748+
):
2749+
raise TypeError(
2750+
"SyntheticDiD does not yet support vcov_type='conley' or any "
2751+
"conley_* kwargs. SyntheticDiD uses bootstrap/jackknife/placebo "
2752+
"variance (variance_method=...), not the analytical sandwich "
2753+
"routed through compute_robust_vcov. Tracked in TODO.md as "
2754+
"a follow-up."
2755+
)
27192756
# Deprecated parameter names — emit warning and ignore
27202757
_deprecated = {"lambda_reg", "zeta"}
2758+
# Conley kwargs are not stored as instance attributes; surfacing them
2759+
# in get_params() returns None unconditionally. set_params() with None
2760+
# values for these keys is a no-op (the rejection above only fires on
2761+
# non-None values).
2762+
_silent_conley_passthrough = {"vcov_type", *_conley_keys}
27212763
# Snapshot original values for transactional rollback on validation failure.
27222764
_rollback: Dict[str, Any] = {}
27232765
for key in params:
2766+
if key in _silent_conley_passthrough:
2767+
continue
27242768
if key not in _deprecated and hasattr(self, key):
27252769
_rollback[key] = getattr(self, key)
27262770
try:
27272771
for key, value in params.items():
2772+
if key in _silent_conley_passthrough:
2773+
# No-op: explicitly None passthrough for round-trip
2774+
# get_params() -> set_params() consistency.
2775+
continue
27282776
if key in _deprecated:
27292777
warnings.warn(
27302778
f"{key} is deprecated and ignored. Use zeta_omega/zeta_lambda "

diff_diff/twfe.py

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -162,6 +162,22 @@ def fit( # type: ignore[override]
162162
"Conley; the unit auto-cluster default is also disabled "
163163
"when vcov_type='conley'."
164164
)
165+
# Conley + wild_bootstrap: Conley is an analytical spatial-HAC
166+
# variance and wild cluster bootstrap is a different inference path
167+
# that resamples residuals within clusters. There is no clean
168+
# composition — the two methods target different things — and
169+
# combining them would either silently drop one or fail downstream
170+
# (TWFE auto-cluster is disabled under Conley, so the bootstrap
171+
# would receive cluster_ids=None and fail with a non-targeted
172+
# error in `wild_bootstrap_se`). Reject early.
173+
if self.vcov_type == "conley" and self.inference == "wild_bootstrap":
174+
raise NotImplementedError(
175+
"TwoWayFixedEffects(vcov_type='conley', inference='wild_bootstrap') "
176+
"is not supported: Conley is an analytical spatial-HAC variance and "
177+
"wild cluster bootstrap is a different inference path. Use "
178+
"inference='analytical' for Conley spatial HAC, or use "
179+
"vcov_type='hc1' with inference='wild_bootstrap'."
180+
)
165181
if self.vcov_type == "conley":
166182
if survey_design is not None:
167183
raise NotImplementedError(

tests/test_conley_vcov.py

Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -798,6 +798,22 @@ def test_twfe_conley_with_explicit_cluster_raises(self, panel):
798798
conley_cutoff_km=2000.0,
799799
).fit(panel, outcome="y", treatment="treated", time="time", unit="unit")
800800

801+
def test_twfe_conley_with_wild_bootstrap_raises(self, panel):
802+
"""Conley analytical spatial-HAC and wild cluster bootstrap are
803+
different inference paths; combining them would either silently drop
804+
one or fail downstream (TWFE auto-cluster is disabled under Conley,
805+
so the bootstrap would receive cluster_ids=None). Reject early.
806+
Closes CI reviewer P1 CQ2."""
807+
from diff_diff import TwoWayFixedEffects
808+
809+
with pytest.raises(NotImplementedError, match="wild_bootstrap"):
810+
TwoWayFixedEffects(
811+
vcov_type="conley",
812+
inference="wild_bootstrap",
813+
conley_coords=("lat", "lon"),
814+
conley_cutoff_km=2000.0,
815+
).fit(panel, outcome="y", treatment="treated", time="time", unit="unit")
816+
801817
def test_twfe_conley_FWL_invariance(self, panel):
802818
"""TWFE Conley SE matches DifferenceInDifferences with same kwargs
803819
(verifies FWL composability — Conley meat survives within-transformation
@@ -905,6 +921,52 @@ def test_synthetic_did_conley_kwarg_raises(self):
905921
with pytest.raises(TypeError, match="conley"):
906922
SyntheticDiD(conley_cutoff_km=100.0) # type: ignore[call-arg]
907923

924+
def test_synthetic_did_set_params_conley_raises(self):
925+
"""SyntheticDiD.set_params(vcov_type='conley') must raise (mirrors
926+
__init__'s contract — closes the silent-bypass gap CI reviewer flagged
927+
as P1 CQ1)."""
928+
from diff_diff import SyntheticDiD
929+
930+
est = SyntheticDiD()
931+
# Snapshot pre-call state
932+
before_variance = est.variance_method
933+
before_n_boot = est.n_bootstrap
934+
before_zeta = est.zeta_omega
935+
936+
with pytest.raises(TypeError, match="conley"):
937+
est.set_params(vcov_type="conley")
938+
# Verify nothing mutated
939+
assert est.variance_method == before_variance
940+
assert est.n_bootstrap == before_n_boot
941+
assert est.zeta_omega == before_zeta
942+
943+
def test_synthetic_did_set_params_conley_kwarg_raises(self):
944+
from diff_diff import SyntheticDiD
945+
946+
est = SyntheticDiD()
947+
with pytest.raises(TypeError, match="conley"):
948+
est.set_params(conley_cutoff_km=100.0)
949+
# Verify the conley attr stays None (rejected before mutation)
950+
assert getattr(est, "conley_cutoff_km", None) is None
951+
952+
def test_synthetic_did_get_params_includes_conley_keys(self):
953+
"""get_params() / set_params() round-trip must include the inherited
954+
conley_* keys with None values for sklearn-style API consistency
955+
(CI reviewer P2 CQ3)."""
956+
from diff_diff import SyntheticDiD
957+
958+
est = SyntheticDiD(variance_method="placebo", n_bootstrap=10)
959+
params = est.get_params()
960+
assert "vcov_type" in params and params["vcov_type"] is None
961+
assert "conley_coords" in params and params["conley_coords"] is None
962+
assert "conley_cutoff_km" in params and params["conley_cutoff_km"] is None
963+
assert "conley_metric" in params and params["conley_metric"] is None
964+
assert "conley_kernel" in params and params["conley_kernel"] is None
965+
# Round-trip: passing None values back into set_params is a no-op
966+
est.set_params(**params)
967+
assert est.variance_method == "placebo"
968+
assert est.n_bootstrap == 10
969+
908970

909971
class TestConleySetParamsAtomicity:
910972
"""set_params atomicity for Conley fields. Per

0 commit comments

Comments
 (0)