Address R6 review (2 P3) on PreTrendsPower PR-B

igerber · claude · igerber · commit da2a7bdad8ae · 2026-05-18T20:40:16.000-04:00
R6 codex verdict: ✅ "Looks good", with two P3 polish items to address.

**P3 — `_infer_cov_source` docstring drifted from new MPD special-case**

R5 added an explicit MPD branch to ``_infer_cov_source`` that returns
``"diag_fallback"`` when ``interaction_indices`` is absent, but the
docstring's ``"full_pre_period_vcov"`` bullet still claimed all
non-event-study types (including MPD) "always" expose full pre-period
covariance. Fix: update the docstring so the
``"full_pre_period_vcov"`` bullet excludes MPD (with a forward
pointer to the explicit MPD branch below), and the
``"diag_fallback"`` bullet enumerates the MPD-without-
``interaction_indices`` case.
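
As a rough illustration of the rules the updated docstring describes,
the sketch below mirrors the three labels with stand-in classes. It is
not the code in ``diff_diff/diagnostic_report.py``: the helper name
``infer_cov_source_sketch``, the one-element event-study set, and the
truthiness check on ``interaction_indices`` are all assumptions made
for the example.

```python
# Stand-in result classes (hypothetical; only the class names and the
# two attributes match what the docstring talks about).
class DiDResults:
    pass


class MultiPeriodDiDResults:
    def __init__(self, interaction_indices=None):
        self.interaction_indices = interaction_indices


class CallawaySantAnnaResults:
    def __init__(self, event_study_vcov=None):
        self.event_study_vcov = event_study_vcov


# Abbreviated on purpose; the real set lists more event-study types.
EVENT_STUDY_TYPES = {"CallawaySantAnnaResults"}


def infer_cov_source_sketch(source_fit) -> str:
    """Classify a fit by type name, following the docstring bullets."""
    name = type(source_fit).__name__
    if name in EVENT_STUDY_TYPES:
        # Populated event_study_vcov: full VCV exists but may be unused.
        if getattr(source_fit, "event_study_vcov", None) is not None:
            return "diag_fallback_available_full_vcov_unused"
        return "diag_fallback"
    if name == "MultiPeriodDiDResults":
        # Explicit MPD branch: full sub-block only with interaction_indices
        # (assumed here to be falsy when absent or empty).
        if getattr(source_fit, "interaction_indices", None):
            return "full_pre_period_vcov"
        return "diag_fallback"
    # Basic DiDResults and other non-event-study, non-MPD types.
    return "full_pre_period_vcov"
```

The MPD branch sits before the final fallthrough, which is the case the
old ``"full_pre_period_vcov"`` bullet wrongly covered with "always".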

**P3 — BR no-downgrade live regression was conditionally bypassed**

The R5-fixed ``test_full_vcov_path_no_downgrade_on_real_cs_fit``
gated the well-powered phrasing assertions on
``if block["tier"] == "well_powered"``. If a future regression
reintroduced the conservative downgrade, the guard would evaluate to
false, the key prose assertion would be skipped, and the test would
pass trivially. Fix: pin the expected tier deterministically on the
``cs_fit`` fixture (``seed=7``, ``treatment_effect=1.5``), which
produces ``mdv/|att| ≈ 0.053``, well under the ``0.25`` well_powered
threshold. New assertions:

- ``block["covariance_source"] == "full_pre_period_vcov"`` (asserted,
  not guarded)
- ``block["mdv_share_of_att"] < 0.25`` (asserts the raw ratio is in
  the well_powered range so the no-downgrade assertion below is
  meaningful)
- ``block["tier"] == "well_powered"`` (locks the no-downgrade
  contract — a regression reintroducing the downgrade would fail
  here, not silently bypass)

The well-powered / moderately-informative prose contracts on
``summary()`` and ``full_report()`` are now also unconditionally
asserted.
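
To make the bypass concrete, here is a minimal sketch that contrasts
the guarded and pinned shapes on a hand-built ``block`` dict. The
helper names, the ``"moderately_informative"`` tier label, and the
summary string are hypothetical stand-ins, not values taken from
``diff_diff``.

```python
def guarded_check(block, summary):
    # Old shape: the prose assertion runs only if the tier is already right.
    if block["tier"] == "well_powered":
        assert "well-powered" in summary


def pinned_check(block, summary):
    # New shape: every precondition is asserted, nothing is skipped.
    assert block["covariance_source"] == "full_pre_period_vcov"
    ratio = block["mdv_share_of_att"]
    assert ratio is not None and ratio < 0.25
    assert block["tier"] == "well_powered"
    assert "well-powered" in summary
    assert "moderately informative" not in summary


def fails(check, block, summary) -> bool:
    """Return True if the check raises AssertionError."""
    try:
        check(block, summary)
    except AssertionError:
        return True
    return False


# Simulated regression: the raw ratio is fine, but the tier was downgraded.
regressed_block = {
    "covariance_source": "full_pre_period_vcov",
    "mdv_share_of_att": 0.053,
    "tier": "moderately_informative",  # assumed label for the downgraded tier
}
regressed_summary = "The pre-trends test is moderately informative."

guarded_failed = fails(guarded_check, regressed_block, regressed_summary)
pinned_failed = fails(pinned_check, regressed_block, regressed_summary)
```

On the simulated regression the guarded shape passes and the pinned
shape fails at the tier assertion, which is the behavior this change
locks in.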

Tests: 125 pass on the impacted classes (BR centralized-downgrade +
all methodology + all DR). No regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/diff_diff/diagnostic_report.py b/diff_diff/diagnostic_report.py
@@ -1525,11 +1525,13 @@ def _infer_cov_source(source_fit: Any) -> str:
 
         Classification rules:
 
-        - ``"full_pre_period_vcov"`` — non-event-study result types
-          (``MultiPeriodDiDResults``, basic ``DiDResults``, etc.) that
-          always exposed full pre-period covariance via
-          ``interaction_indices`` or equivalent. No ambiguity for these
-          types regardless of pre-/post-PR-B serialization.
+        - ``"full_pre_period_vcov"`` — basic ``DiDResults`` and other
+          non-event-study, non-MPD result types that historically expose
+          the full pre-period covariance. ``MultiPeriodDiDResults`` is
+          handled by an explicit branch below because its
+          ``pretrends.py`` MPD branch only takes the full sub-block path
+          when ``interaction_indices`` is populated, otherwise falling
+          through to ``diag(ses**2)``.
         - ``"diag_fallback_available_full_vcov_unused"`` — event-study
           result types with populated ``event_study_vcov``. Under PR-B,
           new fits route through the full sub-block, but a legacy
@@ -1543,7 +1545,10 @@ def _infer_cov_source(source_fit: Any) -> str:
         - ``"diag_fallback"`` — event-study result types with
           ``event_study_vcov is None`` (bootstrap or replicate-weight
           CS / SA fits, plus ImputationDiD / Stacked / EfficientDiD /
-          TwoStageDiD / etc. which don't yet expose ``event_study_vcov``).
+          TwoStageDiD / etc. which don't yet expose ``event_study_vcov``);
+          OR ``MultiPeriodDiDResults`` without ``interaction_indices``
+          (genuine diag-only path inside ``pretrends.py:_extract_pre_period_params``,
+          no "available but unused" concern, so no downgrade applies).
         """
         is_event_study_type = type(source_fit).__name__ in {
             "CallawaySantAnnaResults",
diff --git a/tests/test_business_report.py b/tests/test_business_report.py
@@ -2425,19 +2425,21 @@ def test_full_vcov_path_no_downgrade_on_real_cs_fit(self, cs_fit):
         consumes the full ``event_study_vcov`` sub-block (PR-B Step 3),
         the DR / BR layer must NOT downgrade ``well_powered``.
 
-        Exercises the live PR-B path on a real CS fit. The fit is
-        non-bootstrap (analytical CS), so ``event_study_vcov`` is
-        populated and ``pretrends.py`` records
-        ``covariance_source='full_pre_period_vcov'`` on the result —
-        which the DR adapter consumes directly. If the headline is
-        well-powered, the BR ``summary()`` prose (the actual surface
-        the well-powered phrasing is rendered on) must reflect that
-        positively, not via the conservative moderately-informative
-        phrasing.
-
-        Skips if the fixture happens to land in a different tier; the
-        important contract is "when the full-VCV path fires, the
-        downgrade does NOT".
+        Exercises the live PR-B path on the deterministic ``cs_fit``
+        fixture (analytical non-bootstrap CS, ``seed=7``,
+        ``treatment_effect=1.5``). On this fixture the raw
+        ``mdv / |att|`` ratio is well under the ``0.25`` well_powered
+        threshold, so the expected tier is unconditionally
+        ``well_powered`` — no skip-on-different-tier branch (R6 codex:
+        previous version would silently bypass the key assertion if a
+        regression reintroduced the downgrade).
+
+        ``pretrends.py`` records
+        ``covariance_source='full_pre_period_vcov'`` on the result, which
+        the DR adapter consumes directly. The BR ``summary()`` prose
+        (the actual surface the well-powered phrasing is rendered on)
+        must contain the well-powered text and lack the conservative
+        moderately-informative text.
         """
         from diff_diff import BusinessReport, DiagnosticReport
         from diff_diff.pretrends import compute_pretrends_power
@@ -2452,41 +2454,48 @@ def test_full_vcov_path_no_downgrade_on_real_cs_fit(self, cs_fit):
             first_treat="first_treat",
         )
         block = dr.to_dict()["pretrends_power"]
-        if block.get("status") != "ran":
-            pytest.skip("pretrends_power did not run on this fixture")
-
-        # Provenance: PR-B records full_pre_period_vcov on non-bootstrap CS.
-        cov = block.get("covariance_source")
-        if cov != "full_pre_period_vcov":
-            pytest.skip(f"fixture did not exercise the full-VCV path (got {cov})")
+        assert block.get("status") == "ran", "pretrends_power should run on cs_fit"
+
+        # Deterministic fixture pins: cov_source = full_pre_period_vcov,
+        # mdv/|att| ratio ≈ 0.053 (well under 0.25), tier = well_powered.
+        # Codex R6 P3: pin the expected tier explicitly so a future
+        # regression that reintroduces the conservative downgrade fails
+        # this test loudly (was previously bypassed by the `if tier ==
+        # well_powered` guard).
+        assert block["covariance_source"] == "full_pre_period_vcov", (
+            "cs_fit is analytical CS with event_study_vcov populated — "
+            "PR-B routing must report full_pre_period_vcov"
+        )
+        ratio = block["mdv_share_of_att"]
+        assert ratio is not None and ratio < 0.25, (
+            f"cs_fit raw mdv/|att|={ratio} must be in the well_powered "
+            "range (<0.25) for this assertion to pin the no-downgrade contract"
+        )
+        assert block["tier"] == "well_powered", (
+            "well-powered raw ratio must NOT be downgraded under the PR-B full-VCV path"
+        )
 
-        # Sanity: the same label appears on the compute_pretrends_power
-        # output's persisted field — locks the architectural fix
-        # (provenance recorded at fit time, consumed at the report layer).
+        # Architectural fix: the same provenance label appears on the
+        # compute_pretrends_power output's persisted field, locking that
+        # provenance is recorded at fit time and consumed at the report
+        # layer (not re-inferred from the source-fit type).
         pp = compute_pretrends_power(fit, alpha=0.05, target_power=0.80)
         assert pp.covariance_source == "full_pre_period_vcov"
 
-        # Positive prose contract: when the tier is well_powered post-PR-B,
-        # BR.summary() must contain the well-powered phrasing and must NOT
-        # contain the moderately-informative phrasing (which would only
-        # appear under the conservative downgrade). BR.full_report() also
-        # must not surface the downgrade phrasing as a defensive secondary
-        # check; the primary assertion is on summary() per
-        # ``diff_diff/business_report.py`` rendering surface.
-        if block["tier"] == "well_powered":
-            br = BusinessReport(fit, data=sdf)
-            summary = br.summary()
-            full = br.full_report()
-            # Primary surface: summary() renders the tier prose.
-            assert "well-powered" in summary, (
-                "BR.summary() should surface well-powered phrasing under the "
-                "PR-B full-VCV no-downgrade path"
-            )
-            assert "moderately informative" not in summary
-            assert "moderately-informative" not in summary
-            # Secondary defensive check on full_report().
-            assert "moderately informative" not in full.lower()
-            assert "moderately-informative" not in full.lower()
+        # Positive prose contract on the rendered surfaces.
+        br = BusinessReport(fit, data=sdf)
+        summary = br.summary()
+        full = br.full_report()
+        # Primary surface: summary() renders the tier prose.
+        assert "well-powered" in summary, (
+            "BR.summary() should surface well-powered phrasing under the "
+            "PR-B full-VCV no-downgrade path"
+        )
+        assert "moderately informative" not in summary
+        assert "moderately-informative" not in summary
+        # Secondary defensive check on full_report().
+        assert "moderately informative" not in full.lower()
+        assert "moderately-informative" not in full.lower()
 
 
 class TestCSNotYetTreatedControlGroupSemantics: