dCDH heterogeneity wrap-up: terminology cleanup (codex R2 P3s)

igerber · claude · igerber · commit 8a9f32c8a51e · 2026-05-16T07:45:18.000-04:00
Two P3 informational findings from R2:

1. TODO.md Tier B backlog still listed the dCDH heterogeneity
   df-threading and by-path placebo predict_het items as open, but
   PR #449 closed both. Replaced the two stale bullets with a single
   bullet for the remaining survey + backward-horizon allocator
   derivation (the one Medium follow-up explicitly tracked in the
   wrap-up commit).

2. Three test-prose comments still said `df = n_obs - n_params` while
   the implementation and REGISTRY now use `df = n_obs - rank(design)`.
   Updated each comment to the post-drop rank wording; full-rank
   designs continue to have `rank == n_params` so the SE-derivation
   invariants under test are unchanged.

Comment-only drift; no behavior change. 317 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
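Illustration of the wording change in item 2, as a minimal sketch assuming NumPy is available. `residual_df` and the toy design matrices are hypothetical and are not part of the library or its tests; they only show where `n_obs - rank(design)` and `n_obs - n_params` agree and where they diverge.

```python
import numpy as np


def residual_df(design: np.ndarray) -> int:
    """Residual degrees of freedom as n_obs - rank(design) (hypothetical helper)."""
    n_obs = design.shape[0]
    return n_obs - int(np.linalg.matrix_rank(design))


x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# Full-rank design: intercept plus one covariate, so rank == n_params == 2.
full_rank = np.column_stack([np.ones_like(x), x])

# Rank-deficient design: the third column duplicates the second,
# so n_params == 3 but rank == 2.
deficient = np.column_stack([np.ones_like(x), x, x])

df_full = residual_df(full_rank)
df_deficient = residual_df(deficient)

# The old comment wording, n_obs - n_params, undercounts df here.
naive_df_deficient = deficient.shape[0] - deficient.shape[1]
```

For the full-rank matrix both formulas give the same df, which is why the invariants under test did not change; only a design with a dropped (collinear) column separates them.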
diff --git a/TODO.md b/TODO.md
@@ -184,8 +184,7 @@ Ordered paydown view across the tables above. Tier A → D is by effort × risk,
 - WooldridgeDiD: optional `weights="cohort_share"` on `aggregate()` (`wooldridge_results.py`)
 - HAD survey-design API consolidation: drop deprecated `survey=`/`weights=` kwargs (`had.py`, `had_pretests.py`; gated on next minor bump)
 - Survey-design resolution / collapse helper extraction across `continuous_did.py`, `efficient_did.py`, `stacked_did.py`
-- dCDH heterogeneity df threading: t-distribution at heterogeneity surface (or formalize the tolerance constant) (`chaisemartin_dhaultfoeuille.py`)
-- dCDH by_path placebo `predict_het` parity vs R `did_multiplegt_dyn(..., by_path, predict_het)` (`chaisemartin_dhaultfoeuille.py`, `chaisemartin_dhaultfoeuille_results.py`)
+- dCDH survey + backward-horizon `predict_het` allocator derivation: lift the warn-and-skip fallback at `_compute_heterogeneity_test` once the pre-period Binder TSL cell-period allocator is derived (currently the gate emits a `UserWarning` and falls back to forward-horizon-only heterogeneity under `survey_design + placebo + heterogeneity`) (`chaisemartin_dhaultfoeuille.py`, `docs/methodology/REGISTRY.md`)
 - Rust local-method solver path unification to `solve_wls_svd` + bootstrap-weight RNG parity audit (`rust/src/trop.rs`, `rust/src/bootstrap.rs`)
 - AI review CI workflow-contract pin test expansion (`tests/test_openai_review.py`)
 - In-site Sphinx render of `REPORTING.md` and `REGISTRY.md` (`docs/conf.py` + `:doc:` link migration)
diff --git a/tests/test_chaisemartin_dhaultfoeuille.py b/tests/test_chaisemartin_dhaultfoeuille.py
@@ -2870,11 +2870,12 @@ def test_heterogeneity_multi_horizon(self):
     def test_heterogeneity_inference_local_invariants(self):
         """Local SE-derivation invariants for non-survey heterogeneity
         inference. Post-2026-05-15 df threading: Python passes
-        ``df = n_obs - n_params`` to ``safe_inference`` (matching R's
-        t-distribution); R-parity is pinned in
+        ``df = n_obs - rank(design)`` to ``safe_inference`` (matching
+        R's t-distribution); for full-rank designs ``rank == n_params``.
+        R-parity is pinned in
         ``tests/test_chaisemartin_dhaultfoeuille_parity.py``. This local
         test verifies the SE-derived fields are wired correctly
-        without requiring back-derivation of ``n_params``:
+        without requiring back-derivation of ``rank``:
         ``t_stat = beta / se``; ``conf_int`` symmetric around ``beta``
         with positive half-width; ``p_value`` in ``[0, 1]``.
         Without these checks a regression isolated to the inference
@@ -10354,12 +10355,13 @@ def test_per_path_heterogeneity_finite_under_known_signal(self):
     def test_per_path_heterogeneity_inference_local_invariants(self):
         """Local SE-derivation invariants for non-survey per-path
         heterogeneity inference. Post-2026-05-15 df threading: Python
-        passes ``df = n_obs - n_params`` to ``safe_inference``; R-parity
-        is pinned in
+        passes ``df = n_obs - rank(design)`` to ``safe_inference``
+        (full-rank designs have ``rank == n_params``); R-parity is
+        pinned in
         ``tests/test_chaisemartin_dhaultfoeuille_parity.py::
         TestDCDHDynRParityByPathHeterogeneity``. Verifies SE-derivation
         wiring (``t_stat = beta/se``, symmetric ``conf_int`` around beta,
-        ``p_value`` in ``[0, 1]``) without back-deriving ``n_params``.
+        ``p_value`` in ``[0, 1]``) without back-deriving ``rank``.
         Mirrors
         ``TestHeterogeneityTesting::test_heterogeneity_inference_local_invariants``.
         """
diff --git a/tests/test_chaisemartin_dhaultfoeuille_parity.py b/tests/test_chaisemartin_dhaultfoeuille_parity.py
@@ -1410,11 +1410,13 @@ def test_parity_multi_path_reversible_predict_het(self, golden_values):
                 f"h={h} n_obs: py={py_h['n_obs']} vs r={r_h['n_obs']}"
             )
             # `p_value` and `conf_int` parity (post-2026-05-15 df threading).
-            # `_compute_heterogeneity_test` now passes `df = n_obs - n_params`
-            # to `safe_inference`, matching R's t-distribution with WLS df.
-            # Pinned at INFERENCE_RTOL = 1e-4 because Wald-test critical
-            # values come from `scipy.stats.t.ppf` and `t.sf` which are
-            # implementation-aligned with R's `qt`/`pt` to ~6 sig figs.
+            # `_compute_heterogeneity_test` now passes
+            # `df = n_obs - rank(design)` to `safe_inference`, matching
+            # R's t-distribution with WLS df (full-rank designs have
+            # `rank == n_params`). Pinned at INFERENCE_RTOL = 1e-4
+            # because Wald-test critical values come from
+            # `scipy.stats.t.ppf` and `t.sf` which are implementation-
+            # aligned with R's `qt`/`pt` to ~6 sig figs.
             assert py_h["p_value"] == pytest.approx(
                 r_h["p_value"], rel=self.INFERENCE_RTOL
             ), f"h={h} p_value: py={py_h['p_value']:.6e} vs r={r_h['p_value']:.6e}"
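
The local-invariant docstrings touched in `tests/test_chaisemartin_dhaultfoeuille.py` describe three checks: `t_stat = beta / se`, a `conf_int` symmetric around `beta` with positive half-width, and `p_value` in `[0, 1]`. A rough standard-library sketch of those checks follows. `check_inference_invariants` and the input numbers are hypothetical, hand-picked for illustration, and are neither estimator output nor the actual test body.

```python
import math


def check_inference_invariants(beta, se, t_stat, conf_int, p_value):
    """Raise AssertionError if an SE-derived field is mis-wired (hypothetical helper)."""
    lower, upper = conf_int
    # t_stat must be the plain ratio beta / se.
    assert math.isclose(t_stat, beta / se, rel_tol=1e-12)
    # conf_int must be symmetric around beta with positive half-width.
    half_lower = beta - lower
    half_upper = upper - beta
    assert half_lower > 0 and half_upper > 0
    assert math.isclose(half_lower, half_upper, rel_tol=1e-9)
    # p_value must be a valid probability.
    assert 0.0 <= p_value <= 1.0
    return True


# Illustrative values only; none of these depend on knowing df or rank.
ok = check_inference_invariants(
    beta=0.5, se=0.1, t_stat=5.0, conf_int=(0.3, 0.7), p_value=0.001
)
```

None of the checks needs the degrees of freedom, which matches the docstrings' point that the wiring can be verified without back-deriving `rank`.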