Skip to content

Commit a0ff9be

Browse files
igerberclaude
andcommitted
Codex re-review round 5: clear cluster_name on TWFE Conley path
P1 (Maintainability): TWFE drops its auto-unit-cluster on the Conley path (`_conley_cluster_override = None` in the LinearRegression call) but still recorded `_twfe_cluster_label = unit` in the result metadata. Downstream consumers reading `res.cluster_name` or `res.to_dict()["cluster_name"]` were told the SEs were CR1-clustered when they were actually Conley spatial HAC with no clustering. Fix: when `_fit_vcov_type == "conley"`, set `_twfe_cluster_label = None` so result-level provenance mirrors the actual cluster IDs passed to LinearRegression. Regression: `test_twfe_conley_cluster_name_is_none` asserts both `res.cluster_name is None` and that `to_dict()` doesn't advertise the `cluster_name` key on a TWFE Conley fit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2cf5ebe commit a0ff9be

2 files changed

Lines changed: 30 additions & 3 deletions

File tree

diff_diff/twfe.py

Lines changed: 10 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -561,9 +561,16 @@ def _refit_twfe(w_r):
561561
# `self.cluster is None` AND the vcov family is cluster-compatible.
562562
# One-way families (`classical`, `hc2`) disable the auto-cluster (see
563563
# the `cluster_var` block above); report None so summary() labels the
564-
# one-way family instead of "CR1 cluster-robust at unit".
565-
if self.cluster is not None:
566-
_twfe_cluster_label: Optional[str] = self.cluster
564+
# one-way family instead of "CR1 cluster-robust at unit". The Conley
565+
# path drops the auto-cluster too (`_conley_cluster_override = None`
566+
# above); mirror that here so the variance-provenance metadata
567+
# (`res.cluster_name`, serialized `to_dict()["cluster_name"]`)
568+
# accurately reflects the cluster IDs actually passed to
569+
# `LinearRegression` — i.e. None on the Conley path.
570+
if _fit_vcov_type == "conley":
571+
_twfe_cluster_label: Optional[str] = None
572+
elif self.cluster is not None:
573+
_twfe_cluster_label = self.cluster
567574
elif cluster_var is None:
568575
# One-way family path: auto-cluster was intentionally dropped.
569576
_twfe_cluster_label = None

tests/test_conley_vcov.py

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1484,6 +1484,26 @@ def test_twfe_conley_to_dict_carries_lag_cutoff(self, panel):
14841484
assert d["vcov_type"] == "conley"
14851485
assert d["conley_lag_cutoff"] == 1
14861486

1487+
def test_twfe_conley_cluster_name_is_none(self, panel):
1488+
"""Closes Codex re-review round 5 P1 (Maintainability): TWFE drops
1489+
its auto-unit-cluster on the Conley path (`_conley_cluster_override =
1490+
None`), so the variance-provenance metadata must reflect that. The
1491+
result's `cluster_name` is None and `to_dict()` does not advertise
1492+
`cluster_name` — otherwise downstream consumers would be told the
1493+
SEs were CR1-clustered when they're actually Conley spatial HAC.
1494+
"""
1495+
from diff_diff import TwoWayFixedEffects
1496+
1497+
res = TwoWayFixedEffects(
1498+
vcov_type="conley",
1499+
conley_coords=("lat", "lon"),
1500+
conley_cutoff_km=2000.0,
1501+
conley_lag_cutoff=1,
1502+
).fit(panel, outcome="y", treatment="treated", time="time", unit="unit")
1503+
assert res.cluster_name is None
1504+
d = res.to_dict()
1505+
assert "cluster_name" not in d
1506+
14871507
def test_twfe_conley_non_numeric_time_fails(self, panel):
14881508
"""TWFE's `_treatment_post = treated * time` design step requires
14891509
numeric `time`. Non-numeric labels (datetime64, pd.Period, strings)

0 commit comments

Comments
 (0)