Commit 98dcbb2

igerber and claude committed
Address PR #356 CI review round 12 (1 P2 guide)

Balanced-panel eligibility gate tightened. `PanelProfile.is_balanced` is computed from the unique `(unit, time)` support, so it stays `True` even when duplicate rows exist; `duplicate_unit_time_rows` is a separate alert for that case. But ContinuousDiD / EfficientDiD / HeterogeneousAdoptionDiD all require exactly one observation per cell: EfficientDiD and HAD raise ValueError on duplicates at fit() time, and ContinuousDiD's precompute path silently resolves duplicates via last-row-wins (which can change the estimand without warning).

Guide §3 balanced-panel-eligibility block now requires BOTH `is_balanced == True` AND absence of the `duplicate_unit_time_rows` alert before routing to these estimators, with the specific failure mode (raise vs silent overwrite) named per-estimator and a concrete two-step fix (`balance_panel()` + `drop_duplicates([unit, time])`).

Tests: extended the semantic guide test to assert the guide mentions `duplicate_unit_time_rows` and uses "BOTH/both" wording around the is_balanced gate, so future edits cannot silently drop the duplicate-row half of the eligibility requirement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 parent 3b7408a commit 98dcbb2
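
The distinction drawn in the commit message can be reproduced with plain pandas. The sketch below is illustrative only: it does not call `diff_diff`, the toy data is invented, and the balance check is an assumed stand-in that follows the description of `PanelProfile.is_balanced` (computed on the unique `(unit, time)` support), not the library's actual code.

```python
import pandas as pd

# Toy panel (hypothetical data): 2 units x 2 periods, where the
# (unit=2, time=1) cell appears twice with different outcomes.
df = pd.DataFrame({
    "unit": [1, 1, 2, 2, 2],
    "time": [0, 1, 0, 1, 1],
    "y":    [1.0, 2.0, 3.0, 4.0, 40.0],
})

# Balance on the unique (unit, time) support: every unit is seen in
# every period, so the check passes despite the duplicate row.
support = df[["unit", "time"]].drop_duplicates()
n_periods = support["time"].nunique()
is_balanced = bool((support.groupby("unit")["time"].nunique() == n_periods).all())

# Duplicates need their own signal.
has_duplicates = bool(df.duplicated(["unit", "time"]).any())

# Last-row-wins versus first-row-wins: the retained outcome differs,
# so any downstream estimate can differ as well.
last_wins = df.drop_duplicates(["unit", "time"], keep="last")
first_wins = df.drop_duplicates(["unit", "time"], keep="first")

print(is_balanced, has_duplicates)                   # True True
print(last_wins["y"].mean(), first_wins["y"].mean()) # 11.5 2.5
```

The same panel passes the balance check and fails the duplicate check, which is why the guide treats them as two separate gates.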

2 files changed

Lines changed: 33 additions & 8 deletions

diff_diff/guides/llms-autonomous.txt

Lines changed: 17 additions & 8 deletions
@@ -269,15 +269,24 @@ supported / out of scope; `warn` supported but with documented caveats;
 raises `ValueError`. For full staggered support that retains
 every cohort, use `ChaisemartinDHaultfoeuille` instead.
 
-**Balanced-panel eligibility.** The following estimators hard-reject
-unbalanced panels (each raises `ValueError` at `fit()` when a unit is
-missing any period): `ContinuousDiD`, `EfficientDiD`, `SyntheticDiD`,
-`HeterogeneousAdoptionDiD`. Gate these on
-`PanelProfile.is_balanced == True`; if `False`, pre-process with
-`diff_diff.prep.balance_panel()` or pick a balance-tolerant
+**Balanced-panel eligibility.** The following estimators require
+exactly one observation per `(unit, time)` cell with every unit
+observed in every period: `ContinuousDiD`, `EfficientDiD`,
+`SyntheticDiD`, `HeterogeneousAdoptionDiD`. Gate these on BOTH
+`PanelProfile.is_balanced == True` AND the absence of the
+`duplicate_unit_time_rows` alert (`is_balanced` is computed from the
+unique-key support and stays `True` when duplicates exist; the
+alert is the separate signal for duplicates). Treat both
+conditions as hard gates: `EfficientDiD` and
+`HeterogeneousAdoptionDiD` raise `ValueError` at `fit()` on
+duplicate cells, and `ContinuousDiD`'s precompute path resolves
+duplicates with last-row-wins (silent overwrite that can change
+the estimand). If either condition fails, pre-process with
+`diff_diff.prep.balance_panel()` and a
+`drop_duplicates([unit, time])` pass, or pick a balance-tolerant
 estimator from the remaining rows (CS/SA/dCDH/Imputation/TwoStage/
-Stacked/ETWFE all accept unbalanced input, with some caveats in their
-own docs).
+Stacked/ETWFE all accept unbalanced input, with some caveats in
+their own docs).
 
 
 ## §4. Estimator-choice reasoning by design feature
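
The two-condition gate and the two-step fix described in the guide text can be sketched as follows. This is a minimal pandas-only illustration under stated assumptions: `passes_both_gates` and `make_eligible` are hypothetical helpers, not part of `diff_diff`, and dropping incomplete units is only one possible balancing strategy, which may differ from what `diff_diff.prep.balance_panel()` actually does.

```python
import pandas as pd

def passes_both_gates(df, unit, time):
    """Hypothetical helper: True only if the panel is balanced on its
    unique (unit, time) support AND contains no duplicate cells."""
    support = df[[unit, time]].drop_duplicates()
    per_unit = support.groupby(unit)[time].nunique()
    balanced = bool((per_unit == support[time].nunique()).all())
    no_dups = not df.duplicated([unit, time]).any()
    return balanced and no_dups

def make_eligible(df, unit, time):
    """Hypothetical two-step fix: dedupe, then balance."""
    # Step 1: one row per cell. keep="first" is an arbitrary choice
    # here; which row to keep should be decided deliberately.
    out = df.drop_duplicates([unit, time], keep="first")
    # Step 2 (assumed strategy): keep only units seen in every period.
    n_periods = out[time].nunique()
    counts = out.groupby(unit)[time].transform("nunique")
    return out[counts == n_periods].reset_index(drop=True)

# Invented data: unit 2 has a duplicate cell, unit 3 misses period 1.
panel = pd.DataFrame({
    "unit": [1, 1, 2, 2, 2, 3],
    "time": [0, 1, 0, 1, 1, 0],
    "y":    [1.0, 2.0, 3.0, 4.0, 40.0, 5.0],
})

before = passes_both_gates(panel, "unit", "time")
clean = make_eligible(panel, "unit", "time")
after = passes_both_gates(clean, "unit", "time")
print(before, after, len(clean))  # False True 4
```

Deduplicating before balancing keeps the per-unit period counts meaningful; the explicit `keep=` argument makes the row-selection rule visible rather than implicit.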

tests/test_profile_panel.py

Lines changed: 16 additions & 0 deletions
@@ -675,6 +675,22 @@ def test_guide_api_strings_resolve_against_public_api():
         "for full staggered support"
     )
 
+    # Balanced-panel gate is incomplete with `is_balanced` alone because
+    # duplicate (unit, time) rows don't flip is_balanced. Guide must
+    # require BOTH is_balanced == True AND absence of the
+    # duplicate_unit_time_rows alert before routing to the duplicate-
+    # intolerant estimators (ContinuousDiD silently overwrites
+    # duplicates via last-row-wins; EfficientDiD/HAD raise).
+    assert "duplicate_unit_time_rows" in text, (
+        "Guide must name the duplicate_unit_time_rows alert as part of "
+        "the balanced-panel eligibility gate"
+    )
+    assert "BOTH" in text or "both" in text, (
+        "Guide must require BOTH is_balanced and absence of the "
+        "duplicate_unit_time_rows alert before routing to duplicate-"
+        "intolerant estimators"
+    )
+
 
 
 def test_min_pre_post_use_per_unit_observed_support():
     """On an unbalanced panel where one treated unit is missing its