Skip to content

Commit 8b6a6b1

Browse files
igerberclaude
andcommitted
Fix CI review R6: P3 doc + replicate-weight parity polish
- twowayfeweights() now raises NotImplementedError when the resolved survey design carries replicate weights, matching fit()'s contract. The fit/helper parity promise now holds for unsupported inputs too. - docs/llms-full.txt and ROADMAP.md no longer claim that Phase 3 parameters raise NotImplementedError; both now correctly note that only `aggregate` remains gated. - Added a helper-level regression that pins the replicate-weight rejection, complementing the existing fit()-level test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent aeca1d8 commit 8b6a6b1

4 files changed

Lines changed: 34 additions & 2 deletions

File tree

ROADMAP.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -191,7 +191,7 @@ These remain in **Future Estimators** below if/when we choose to extend.
191191
### Architectural notes (for plan and PR reviewers)
192192

193193
- **Single `ChaisemartinDHaultfoeuille` class** (alias `DCDH`). Not a family. New features land as `fit()` parameters or fields on the results dataclass. No `DCDHDynamic`, `DCDHCovariate`, etc. Matches the library's idiomatic pattern: `CallawaySantAnna`, `ImputationDiD`, and `EfficientDiD` are all single classes that evolved across many phases.
194-
- **Forward-compatible API from Phase 1.** `fit(aggregate=None, controls=None, trends_linear=None, L_max=None, ...)` accepts the Phase 2/3 parameters from day one and raises `NotImplementedError` with a clear pointer to the relevant phase until they are implemented. No signature changes between phases.
194+
- **Forward-compatible API from Phase 1.** `fit(aggregate=None, controls=None, trends_linear=None, L_max=None, ...)` accepts the Phase 2/3 parameters from day one and raised `NotImplementedError` with a clear pointer to the relevant phase until they were implemented. As of the dCDH work, Phase 2, Phase 3, and `survey_design` are all live; only `aggregate` remains gated with `NotImplementedError`. No signature changes between phases.
195195
- **Conservative CI** under Assumption 8 (independent groups), exact only under iid sampling. Documented in REGISTRY.md as a `**Note:**` deviation from "default nominal coverage." Theorem 1 of the dynamic paper.
196196
- **Cohort recentering for variance is essential.** Cohorts are defined by the triple `(D_{g,1}, F_g, S_g)`. The plug-in variance subtracts cohort-conditional means, **NOT a single grand mean**. Test fixtures must catch this — a wrong implementation silently produces a smaller, incorrect variance.
197197
- **No Rust acceleration is planned for any phase.** The estimator's hot path is groupby + BLAS-accelerated matrix-vector products, where NumPy already operates near-optimally. If profiling on large panels (`G > 100K`) reveals a bottleneck post-ship, the existing `_rust_bootstrap_weights` helper can be reused for the bootstrap loop without writing new Rust code.

diff_diff/chaisemartin_dhaultfoeuille.py

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5068,6 +5068,16 @@ def twowayfeweights(
50685068
f"The TWFE diagnostic under survey uses survey-weighted cell "
50695069
f"means; other weight types are not supported."
50705070
)
5071+
if (
5072+
resolved is not None
5073+
and resolved.replicate_weights is not None
5074+
and resolved.replicate_weights.shape[1] > 0
5075+
):
5076+
raise NotImplementedError(
5077+
"Replicate weight variance for twowayfeweights() is not "
5078+
"supported. Use strata/PSU/FPC for design-based inference "
5079+
"via Taylor Series Linearization (matches the fit() path)."
5080+
)
50715081

50725082
# Validation + cell aggregation via the same helper used by
50735083
# ChaisemartinDHaultfoeuille.fit() — enforces NaN/binary/within-cell

docs/llms-full.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -270,7 +270,7 @@ est.fit(
270270
) -> ChaisemartinDHaultfoeuilleResults
271271
```
272272

273-
`L_max` controls multi-horizon computation. Phase 3 parameters raise `NotImplementedError`.
273+
`L_max` controls multi-horizon computation. Phase 3 parameters (`controls`, `trends_linear`, `trends_nonparam`, `honest_did`, `heterogeneity`, `design2`) and `survey_design` are implemented; only `aggregate` remains gated with `NotImplementedError`.
274274

275275
**Usage:**
276276

tests/test_survey_dcdh.py

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -726,6 +726,28 @@ def test_twfe_helper_rejects_non_pweight(self, base_data):
726726
survey_design=sd,
727727
)
728728

729+
def test_twfe_helper_rejects_replicate_weights(self, base_data):
730+
"""Replicate-weight survey designs must be rejected by the helper,
731+
matching fit()'s NotImplementedError contract."""
732+
from diff_diff.chaisemartin_dhaultfoeuille import twowayfeweights
733+
734+
df_ = base_data.copy()
735+
df_["pw"] = 1.0
736+
df_["rep1"] = 1.0
737+
df_["rep2"] = 1.0
738+
sd = SurveyDesign(
739+
weights="pw",
740+
replicate_weights=["rep1", "rep2"],
741+
replicate_method="BRR",
742+
)
743+
with pytest.raises(NotImplementedError, match="Replicate"):
744+
twowayfeweights(
745+
df_,
746+
outcome="outcome", group="group",
747+
time="period", treatment="treatment",
748+
survey_design=sd,
749+
)
750+
729751

730752
# ── Test: TWFE diagnostic oracle under survey ───────────────────────
731753

0 commit comments

Comments
 (0)