Skip to content

Commit da2a7bd

Browse files
igerberclaude
andcommitted
Address R6 review (2 P3) on PreTrendsPower PR-B
R6 codex verdict ✅ "Looks good" but with two P3 polish items. **P3 — `_infer_cov_source` docstring drifted from new MPD special-case** R5 added an explicit MPD branch to ``_infer_cov_source`` that returns ``"diag_fallback"`` when ``interaction_indices`` is absent, but the docstring's ``"full_pre_period_vcov"`` bullet still claimed all non-event-study types (including MPD) "always" expose full pre-period covariance. Fix: update the docstring so the ``"full_pre_period_vcov"`` bullet excludes MPD (with a forward pointer to the explicit MPD branch below), and the ``"diag_fallback"`` bullet enumerates the MPD-without- ``interaction_indices`` case. **P3 — BR no-downgrade live regression was conditionally bypassed** The R5-fixed ``test_full_vcov_path_no_downgrade_on_real_cs_fit`` gated the well-powered phrasing assertions on ``if block["tier"] == "well_powered"``, which silently skipped the key prose assertion if a future regression reintroduced the conservative downgrade (the test then passes trivially). Fix: pin the expected tier deterministically on the ``cs_fit`` fixture, which produces ``mdv/|att| ≈ 0.053`` (well under the ``0.25`` well_powered threshold) on ``seed=7`` + ``treatment_effect=1.5``. New assertions: - ``block["covariance_source"] == "full_pre_period_vcov"`` (asserted, not guarded) - ``block["mdv_share_of_att"] < 0.25`` (asserts the raw ratio is in the well_powered range so the no-downgrade assertion below is meaningful) - ``block["tier"] == "well_powered"`` (locks the no-downgrade contract — a regression reintroducing the downgrade would fail here, not silently bypass) The well-powered / moderately-informative prose contracts on ``summary()`` and ``full_report()`` are now also unconditionally asserted. Tests: 125 pass on the impacted classes (BR centralized-downgrade + all methodology + all DR). No regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 5fb4aa7 commit da2a7bd

2 files changed

Lines changed: 64 additions & 50 deletions

File tree

diff_diff/diagnostic_report.py

Lines changed: 11 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1525,11 +1525,13 @@ def _infer_cov_source(source_fit: Any) -> str:
15251525
15261526
Classification rules:
15271527
1528-
- ``"full_pre_period_vcov"`` — non-event-study result types
1529-
(``MultiPeriodDiDResults``, basic ``DiDResults``, etc.) that
1530-
always exposed full pre-period covariance via
1531-
``interaction_indices`` or equivalent. No ambiguity for these
1532-
types regardless of pre-/post-PR-B serialization.
1528+
- ``"full_pre_period_vcov"`` — basic ``DiDResults`` and other
1529+
non-event-study, non-MPD result types that historically expose
1530+
the full pre-period covariance. ``MultiPeriodDiDResults`` is
1531+
handled by an explicit branch below because its
1532+
``pretrends.py`` MPD branch only takes the full sub-block path
1533+
when ``interaction_indices`` is populated, otherwise falling
1534+
through to ``diag(ses**2)``.
15331535
- ``"diag_fallback_available_full_vcov_unused"`` — event-study
15341536
result types with populated ``event_study_vcov``. Under PR-B,
15351537
new fits route through the full sub-block, but a legacy
@@ -1543,7 +1545,10 @@ def _infer_cov_source(source_fit: Any) -> str:
15431545
- ``"diag_fallback"`` — event-study result types with
15441546
``event_study_vcov is None`` (bootstrap or replicate-weight
15451547
CS / SA fits, plus ImputationDiD / Stacked / EfficientDiD /
1546-
TwoStageDiD / etc. which don't yet expose ``event_study_vcov``).
1548+
TwoStageDiD / etc. which don't yet expose ``event_study_vcov``);
1549+
OR ``MultiPeriodDiDResults`` without ``interaction_indices``
1550+
(genuine diag-only path inside ``pretrends.py:_extract_pre_period_params``,
1551+
no "available but unused" concern, so no downgrade applies).
15471552
"""
15481553
is_event_study_type = type(source_fit).__name__ in {
15491554
"CallawaySantAnnaResults",

tests/test_business_report.py

Lines changed: 53 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -2425,19 +2425,21 @@ def test_full_vcov_path_no_downgrade_on_real_cs_fit(self, cs_fit):
24252425
consumes the full ``event_study_vcov`` sub-block (PR-B Step 3),
24262426
the DR / BR layer must NOT downgrade ``well_powered``.
24272427
2428-
Exercises the live PR-B path on a real CS fit. The fit is
2429-
non-bootstrap (analytical CS), so ``event_study_vcov`` is
2430-
populated and ``pretrends.py`` records
2431-
``covariance_source='full_pre_period_vcov'`` on the result —
2432-
which the DR adapter consumes directly. If the headline is
2433-
well-powered, the BR ``summary()`` prose (the actual surface
2434-
the well-powered phrasing is rendered on) must reflect that
2435-
positively, not via the conservative moderately-informative
2436-
phrasing.
2437-
2438-
Skips if the fixture happens to land in a different tier; the
2439-
important contract is "when the full-VCV path fires, the
2440-
downgrade does NOT".
2428+
Exercises the live PR-B path on the deterministic ``cs_fit``
2429+
fixture (analytical non-bootstrap CS, ``seed=7``,
2430+
``treatment_effect=1.5``). On this fixture the raw
2431+
``mdv / |att|`` ratio is well under the ``0.25`` well_powered
2432+
threshold, so the expected tier is unconditionally
2433+
``well_powered`` — no skip-on-different-tier branch (R6 codex:
2434+
previous version would silently bypass the key assertion if a
2435+
regression reintroduced the downgrade).
2436+
2437+
``pretrends.py`` records
2438+
``covariance_source='full_pre_period_vcov'`` on the result, which
2439+
the DR adapter consumes directly. The BR ``summary()`` prose
2440+
(the actual surface the well-powered phrasing is rendered on)
2441+
must contain the well-powered text and lack the conservative
2442+
moderately-informative text.
24412443
"""
24422444
from diff_diff import BusinessReport, DiagnosticReport
24432445
from diff_diff.pretrends import compute_pretrends_power
@@ -2452,41 +2454,48 @@ def test_full_vcov_path_no_downgrade_on_real_cs_fit(self, cs_fit):
24522454
first_treat="first_treat",
24532455
)
24542456
block = dr.to_dict()["pretrends_power"]
2455-
if block.get("status") != "ran":
2456-
pytest.skip("pretrends_power did not run on this fixture")
2457-
2458-
# Provenance: PR-B records full_pre_period_vcov on non-bootstrap CS.
2459-
cov = block.get("covariance_source")
2460-
if cov != "full_pre_period_vcov":
2461-
pytest.skip(f"fixture did not exercise the full-VCV path (got {cov})")
2457+
assert block.get("status") == "ran", "pretrends_power should run on cs_fit"
2458+
2459+
# Deterministic fixture pins: cov_source = full_pre_period_vcov,
2460+
# mdv/|att| ratio ≈ 0.053 (well under 0.25), tier = well_powered.
2461+
# Codex R6 P3: pin the expected tier explicitly so a future
2462+
# regression that reintroduces the conservative downgrade fails
2463+
# this test loudly (was previously bypassed by the `if tier ==
2464+
# well_powered` guard).
2465+
assert block["covariance_source"] == "full_pre_period_vcov", (
2466+
"cs_fit is analytical CS with event_study_vcov populated — "
2467+
"PR-B routing must report full_pre_period_vcov"
2468+
)
2469+
ratio = block["mdv_share_of_att"]
2470+
assert ratio is not None and ratio < 0.25, (
2471+
f"cs_fit raw mdv/|att|={ratio} must be in the well_powered "
2472+
"range (<0.25) for this assertion to pin the no-downgrade contract"
2473+
)
2474+
assert block["tier"] == "well_powered", (
2475+
"well-powered raw ratio must NOT be downgraded under the PR-B " "full-VCV path"
2476+
)
24622477

2463-
# Sanity: the same label appears on the compute_pretrends_power
2464-
# output's persisted field — locks the architectural fix
2465-
# (provenance recorded at fit time, consumed at the report layer).
2478+
# Architectural fix: the same provenance label appears on the
2479+
# compute_pretrends_power output's persisted field, locking that
2480+
# provenance is recorded at fit time and consumed at the report
2481+
# layer (not re-inferred from the source-fit type).
24662482
pp = compute_pretrends_power(fit, alpha=0.05, target_power=0.80)
24672483
assert pp.covariance_source == "full_pre_period_vcov"
24682484

2469-
# Positive prose contract: when the tier is well_powered post-PR-B,
2470-
# BR.summary() must contain the well-powered phrasing and must NOT
2471-
# contain the moderately-informative phrasing (which would only
2472-
# appear under the conservative downgrade). BR.full_report() also
2473-
# must not surface the downgrade phrasing as a defensive secondary
2474-
# check; the primary assertion is on summary() per
2475-
# ``diff_diff/business_report.py`` rendering surface.
2476-
if block["tier"] == "well_powered":
2477-
br = BusinessReport(fit, data=sdf)
2478-
summary = br.summary()
2479-
full = br.full_report()
2480-
# Primary surface: summary() renders the tier prose.
2481-
assert "well-powered" in summary, (
2482-
"BR.summary() should surface well-powered phrasing under the "
2483-
"PR-B full-VCV no-downgrade path"
2484-
)
2485-
assert "moderately informative" not in summary
2486-
assert "moderately-informative" not in summary
2487-
# Secondary defensive check on full_report().
2488-
assert "moderately informative" not in full.lower()
2489-
assert "moderately-informative" not in full.lower()
2485+
# Positive prose contract on the rendered surfaces.
2486+
br = BusinessReport(fit, data=sdf)
2487+
summary = br.summary()
2488+
full = br.full_report()
2489+
# Primary surface: summary() renders the tier prose.
2490+
assert "well-powered" in summary, (
2491+
"BR.summary() should surface well-powered phrasing under the "
2492+
"PR-B full-VCV no-downgrade path"
2493+
)
2494+
assert "moderately informative" not in summary
2495+
assert "moderately-informative" not in summary
2496+
# Secondary defensive check on full_report().
2497+
assert "moderately informative" not in full.lower()
2498+
assert "moderately-informative" not in full.lower()
24902499

24912500

24922501
class TestCSNotYetTreatedControlGroupSemantics:

0 commit comments

Comments
 (0)