Skip to content

Commit 2cb8def

Browse files
authored
Merge pull request #449 from igerber/dcdh-heterogeneity-placebo-and-df
dCDH heterogeneity: per-path + global placebo predict_het R-parity + df threading
2 parents e798d0c + d20988d commit 2cb8def

11 files changed

Lines changed: 1316 additions & 109 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
77

88
## [Unreleased]
99

10+
### Added
11+
- **`ChaisemartinDHaultfoeuille.predict_het` × `placebo`: R-parity on both global and per-path surfaces.** R-verified — `did_multiplegt_dyn(predict_het, placebo)` emits heterogeneity OLS results on backward (placebo) horizons via R's `DIDmultiplegtDYN:::did_multiplegt_main` placebo block (`effect = matrix(-i, ...)` rbind site); the same block runs per-by_level under `did_multiplegt_dyn(by_path, predict_het, placebo)`, so both global `res$results$predict_het` and per-by_level `res$by_level_i$results$predict_het` slots emit backward rows. R's predict_het syntax with `placebo > 0` requires the `c(-1)` sentinel in the horizon vector to trigger "compute heterogeneity for ALL forward (1..effects) AND ALL placebo (1..placebo) positions" — passing positive-only horizons errors with "specified numbers in predict_het that exceed the number of placebos". Python mirrors via `_compute_heterogeneity_test(..., placebo=L_max)` (set automatically from `self.placebo` at both global and per-path call sites in `fit()`) — the function iterates forward (1..L_max) and backward (-1..-L_max) horizons in a single loop with an explicit `out_idx < 0` eligibility guard for backward horizons whose `F_g` is too small (would otherwise silently misread `N_mat` via numpy negative indexing). `results.heterogeneity_effects` uses negative-int keys for backward horizons; `path_heterogeneity_effects` does the same per path. Placebo rows in `to_dataframe(level="by_path")` have non-NaN `het_*` columns when `placebo=True` and `heterogeneity=` are both set. **Survey gate (warn + skip):** `survey_design + placebo + heterogeneity` emits a `UserWarning` at fit-time and falls back to forward-horizon-only heterogeneity on both surfaces — the Binder TSL cell-period allocator's REGISTRY justification is tied to **post-period** attribution; backward-horizon attribution puts ψ_g mass on a pre-period cell, a separate library-extension claim that needs its own derivation. Forward-horizon `predict_het + survey_design` continues to work unchanged on both global and per-path surfaces. The function-level `_compute_heterogeneity_test` keeps a per-iteration `NotImplementedError` backstop for direct callers that bypass fit(). Pre-period allocator derivation deferred to a follow-up methodology PR (tracked in TODO.md). R parity confirmed at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneityWithPlacebo` (scenario 23, `multi_path_reversible_predict_het_with_placebo_global`, `placebo=2, effects=3, no by_path`) and `::TestDCDHDynRParityByPathHeterogeneityWithPlacebo` (scenario 22, same DGP plus `by_path=3`); pinned at `BETA_RTOL=1e-6` / `SE_RTOL=1e-5` for `beta` / `se` / `t_stat` / `n_obs` and `INFERENCE_RTOL=1e-4` for `p_value` / `conf_int` across 3 paths × (3 forward + 2 placebo) = 15 horizons + 1 global × 5 horizons. Cross-surface invariants regression-tested at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPredictHetPlacebo` (placebo het column population, survey-gate warn+skip behavior, forward+survey anti-regression, `out_idx<0` eligibility guard, single-path telescope `path_heterogeneity_effects[(only_path,)] == heterogeneity_effects` bit-exactly, summary rendering, direct-call `NotImplementedError` backstop). Closes TODO #422.
12+
13+
### Changed
14+
- **`ChaisemartinDHaultfoeuille.predict_het` inference: t-distribution df threading (closes TODO pilot-412).** `_compute_heterogeneity_test` now passes `df = n_obs - n_params` to `safe_inference` on the non-survey OLS path, matching R `did_multiplegt_dyn(predict_het=...)`'s t-distribution inference (`DIDmultiplegtDYN:::did_multiplegt_main` `t_stat <- qt(0.975, df.residual(model))` site). Pre-PR Python used `df=None` (normal Z critical), producing 0.1-2% rtol gaps on `p_value` and `conf_int` vs R. Parity tolerance tightened on the existing forward-horizon scenarios (`multi_path_reversible_predict_het`, `multi_path_reversible_by_path_predict_het`) from "unpinned" to `INFERENCE_RTOL=1e-4` on `p_value` and `conf_int`; `beta` / `se` / `t_stat` continue at `BETA_RTOL=1e-6` / `SE_RTOL=1e-5`. **Rank-deficient caveat:** `n_params = design.shape[1]` is the pre-drop column count; under near-rank-deficient designs that `solve_ols` retains rather than NaN-out, the actual rank may be lower than `n_params` (R's `df.residual` uses post-drop rank). Fully rank-deficient designs are NaN-filled by the existing short-circuit; the gap only affects near-rank-deficient edge cases (tracked as a Low TODO follow-up). The Z-vs-t REGISTRY deviation note is replaced with an "R parity (post-2026-05-15 df threading)" positive-claim note.
15+
1016
## [3.3.3] - 2026-05-15
1117

1218
### Added

TODO.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -78,8 +78,8 @@ Deferred items from PR reviews that were not addressed before merge.
7878
| dCDH: Survey cell-period allocator's post-period attribution is a library convention, not derived from the observation-level survey linearization. MC coverage is empirically close to nominal on the test DGP; a formal derivation (or a covariance-aware two-cell alternative) is deferred. Documented in REGISTRY.md survey IF expansion Note. | `chaisemartin_dhaultfoeuille.py`, `docs/methodology/REGISTRY.md` | #408 | Medium |
7979
| dCDH: Parity test SE/CI assertions only cover pure-direction scenarios; mixed-direction SE comparison is structurally apples-to-oranges (cell-count vs obs-count weighting). | `test_chaisemartin_dhaultfoeuille_parity.py` | #294 | Low |
8080
| dCDH by_path: negative-baseline path regression (e.g. `(-1, 0, 0, 0)`) is not yet exercised. The existing negative-D test (`test_negative_integer_D_supported`) only covers paths with negative values in non-baseline positions like `(0, -1, -1, -1)`, which does not trigger the R `substr(path, 1, 1)` bug regime (the bug needs a multi-character baseline). Add a switcher fixture with `D_{g,1} = -1` and assert the resulting path tuple key. | `tests/test_chaisemartin_dhaultfoeuille.py` | #419 | Low |
81-
| dCDH by_path: per-path placebo heterogeneity (`predict_het` rows for negative horizons) is currently NaN-filled in `to_dataframe(level="by_path")` `het_*` columns and unpopulated in `path_heterogeneity_effects`. R `did_multiplegt_dyn(..., by_path, predict_het)` forwards `predict_het` into each per-path `did_multiplegt_main` call alongside `placebo`, so R likely emits placebo het rows we do not yet mirror. Validate R's actual placebo predict_het output, then either implement parity or document the deviation explicitly. | `diff_diff/chaisemartin_dhaultfoeuille.py`, `diff_diff/chaisemartin_dhaultfoeuille_results.py`, `tests/test_chaisemartin_dhaultfoeuille_parity.py` | #422 | Medium |
82-
| dCDH heterogeneity: `_compute_heterogeneity_test` passes `df=None` to `safe_inference`, so Python uses the normal Z critical value (~1.96) for `t_stat`-derived `p_value` and `conf_int`. R `did_multiplegt_dyn(..., predict_het)` uses the t-distribution with df = n - k from the OLS regression, producing ~0.1-2% rtol gaps on CIs and p-values vs Python. Documented as a deviation in the heterogeneity R-parity Note; parity tests pin only `beta`, `se`, `t_stat`, and `n_obs`. Either thread the OLS df into `safe_inference` to match R, or formalize a separate inference-tolerance constant for the heterogeneity surface. | `diff_diff/chaisemartin_dhaultfoeuille.py`, `tests/test_chaisemartin_dhaultfoeuille_parity.py` | pilot-412 | Low |
81+
| dCDH by_path: survey-aware backward-horizon (`placebo + predict_het + survey_design`) raises `NotImplementedError` because the Binder TSL cell-period allocator's REGISTRY justification is tied to post-period attribution. Backward horizons would put ψ_g mass on a pre-period cell. Deriving the pre-period cell allocator (or adding a covariance-aware two-cell alternative) is deferred to a follow-up methodology PR. | `diff_diff/chaisemartin_dhaultfoeuille.py`, `docs/methodology/REGISTRY.md` | follow-up | Medium |
82+
| dCDH heterogeneity: rank-deficient designs use `df = n_obs - n_params` (pre-drop column count) in the t-distribution inference. R's `lm(predict_het=...)` uses `df.residual = n - rank(design)` post-drop. Fully rank-deficient designs are NaN-filled by the rank-deficient short-circuit at `_compute_heterogeneity_test:5141-5150`, so the gap only affects near-rank-deficient designs where `solve_ols` retains the design. Thread actual rank from `solve_ols` to close the gap. | `diff_diff/chaisemartin_dhaultfoeuille.py` | follow-up | Low |
8383
| CallawaySantAnna: consider materializing NaN entries for non-estimable (g,t) cells in group_time_effects dict (currently omitted with consolidated warning); would require updating downstream consumers (event study, balance_e, aggregation) | `staggered.py` | #256 | Low |
8484
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
8585
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |

benchmarks/R/generate_dcdh_dynr_test_values.R

Lines changed: 127 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -622,12 +622,30 @@ extract_dcdh_by_path <- function(res, n_effects, n_placebos = 0) {
622622
# res$results$predict_het, a data.frame with columns
623623
# {effect, covariate, Estimate, SE, t, LB, UB, N, pF}. Estimate is the
624624
# WLS coefficient on the heterogeneity covariate.
625+
#
626+
# `n_effects` retained for backward-compat with scenarios 20/21 callers
627+
# but unused: we iterate ALL rows in ph$effect and partition by sign so
628+
# placebo (negative-effect) rows are captured separately. Scenario 22
629+
# probes whether R emits negative-effect rows when called with
630+
# `predict_het + placebo`; resolves TODO #422.
625631
extract_dcdh_predict_het <- function(res, n_effects) {
626632
ph <- res$results$predict_het
627-
horizons <- list()
628-
if (is.null(ph) || nrow(ph) == 0) return(list(predict_het = horizons))
629-
for (h in seq_len(min(n_effects, nrow(ph)))) {
630-
horizons[[as.character(ph$effect[h])]] <- list(
633+
# `structure(list(), names = character(0))` produces a named list with
634+
# zero entries; jsonlite serializes it as `{}` (object) rather than
635+
# `[]` (array). Plain `list()` would serialize as `[]`, which gives
636+
# the JSON contract a type-unstable shape (object when populated, array
637+
# when empty). Type stability matters for generic consumers — see
638+
# `tests/test_chaisemartin_dhaultfoeuille_parity.py::_as_dict` for the
639+
# defensive Python-side coercion that backstops this.
640+
forward_horizons <- structure(list(), names = character(0))
641+
placebo_horizons <- structure(list(), names = character(0))
642+
if (is.null(ph) || nrow(ph) == 0) {
643+
return(list(predict_het = forward_horizons,
644+
placebo_predict_het = placebo_horizons))
645+
}
646+
for (h in seq_len(nrow(ph))) {
647+
effect_val <- as.numeric(ph$effect[h])
648+
entry <- list(
631649
beta = as.numeric(ph$Estimate[h]),
632650
se = as.numeric(ph$SE[h]),
633651
t = as.numeric(ph$t[h]),
@@ -636,8 +654,15 @@ extract_dcdh_predict_het <- function(res, n_effects) {
636654
n_obs = as.numeric(ph$N[h]),
637655
p_value = as.numeric(ph$pF[h])
638656
)
657+
if (effect_val > 0) {
658+
forward_horizons[[as.character(effect_val)]] <- entry
659+
} else if (effect_val < 0) {
660+
placebo_horizons[[as.character(effect_val)]] <- entry
661+
}
662+
# effect_val == 0: skip (not a valid event-study horizon).
639663
}
640-
list(predict_het = horizons)
664+
list(predict_het = forward_horizons,
665+
placebo_predict_het = placebo_horizons)
641666
}
642667

643668
# Helper: extract per-path predict_het results. Under by_path=k +
@@ -650,10 +675,16 @@ extract_dcdh_by_path_predict_het <- function(res, n_effects) {
650675
for (i in seq_along(by_levels)) {
651676
slot <- res[[paste0("by_level_", i)]]
652677
ph <- slot$results$predict_het
653-
horizons <- list()
678+
# See extract_dcdh_predict_het comment for the named-list rationale.
679+
forward_horizons <- structure(list(), names = character(0))
680+
placebo_horizons <- structure(list(), names = character(0))
654681
if (!is.null(ph) && nrow(ph) > 0) {
655-
for (h in seq_len(min(n_effects, nrow(ph)))) {
656-
horizons[[as.character(ph$effect[h])]] <- list(
682+
# Iterate ALL rows; partition by sign so placebo (negative-effect)
683+
# rows are captured under `placebo_horizons`. Scenario 22 probes
684+
# whether R emits negative-effect rows on the per-path surface.
685+
for (h in seq_len(nrow(ph))) {
686+
effect_val <- as.numeric(ph$effect[h])
687+
entry <- list(
657688
beta = as.numeric(ph$Estimate[h]),
658689
se = as.numeric(ph$SE[h]),
659690
t = as.numeric(ph$t[h]),
@@ -662,12 +693,18 @@ extract_dcdh_by_path_predict_het <- function(res, n_effects) {
662693
n_obs = as.numeric(ph$N[h]),
663694
p_value = as.numeric(ph$pF[h])
664695
)
696+
if (effect_val > 0) {
697+
forward_horizons[[as.character(effect_val)]] <- entry
698+
} else if (effect_val < 0) {
699+
placebo_horizons[[as.character(effect_val)]] <- entry
700+
}
665701
}
666702
}
667703
out[[i]] <- list(
668704
path = by_levels[i],
669705
frequency_rank = i,
670-
horizons = horizons
706+
horizons = forward_horizons,
707+
placebo_horizons = placebo_horizons
671708
)
672709
}
673710
list(by_path_predict_het = out)
@@ -1201,6 +1238,87 @@ cat(" Scenarios 20/21: multi_path_reversible_predict_het + by_path version\n")
12011238
dont_drop_larger_lower = TRUE),
12021239
results = extract_dcdh_by_path_predict_het(res21, n_effects = 3)
12031240
)
1241+
1242+
# Scenario 23: GLOBAL predict_het + placebo (no by_path). Mirrors
1243+
# scenario 22's syntax minus by_path so we have a parity anchor for
1244+
# the GLOBAL `results.heterogeneity_effects` surface emitting both
1245+
# forward and backward (placebo) horizons. Resolves codex R1 P1 #2:
1246+
# the Phase 1A change extended the global heterogeneity loop to
1247+
# cover backward horizons, so a global-surface parity test was
1248+
# required to lock that contract independently of the per-path
1249+
# dispatcher. Same `c(-1)` sentinel as scenario 22 (computes ALL
1250+
# forward + ALL placebo positions); reuses `d20` for DGP parity.
1251+
res23 <- did_multiplegt_dyn(
1252+
df = d20, outcome = "outcome", group = "group", time = "period",
1253+
treatment = "treatment", effects = 3, placebo = 2,
1254+
dont_drop_larger_lower = TRUE,
1255+
predict_het = list("het_x", c(-1)),
1256+
ci_level = 95, graph_off = TRUE
1257+
)
1258+
scenarios$multi_path_reversible_predict_het_with_placebo_global <- list(
1259+
data = list(
1260+
group = as.numeric(d20$group),
1261+
period = as.numeric(d20$period),
1262+
treatment = as.numeric(d20$treatment),
1263+
outcome = as.numeric(d20$outcome),
1264+
het_x = as.numeric(d20$het_x)
1265+
),
1266+
params = list(pattern = "multi_path_reversible_predict_het_with_placebo_global",
1267+
n_switchers = n_switchers20, n_controls = n_controls20,
1268+
n_groups = n_groups20, n_periods = n_periods20,
1269+
seed = 120L, effects = 3, placebo = 2,
1270+
predict_het_var = "het_x",
1271+
predict_het_horizons = c(-1),
1272+
ci_level = 95,
1273+
dont_drop_larger_lower = TRUE),
1274+
results = extract_dcdh_predict_het(res23, n_effects = 3)
1275+
)
1276+
1277+
# Scenario 22: by_path + predict_het + placebo (probes TODO #422). Reuses
1278+
# d20 from scenarios 20/21 for DGP parity. Tests whether R's
1279+
# did_multiplegt_dyn(by_path=k, predict_het, placebo=N) per-by_level
1280+
# dispatcher emits predict_het rows on backward (placebo) horizons.
1281+
#
1282+
# R's predict_het syntax with `placebo > 0` (per did_multiplegt_main
1283+
# source `DIDmultiplegtDYN:::did_multiplegt_main` lines 1907 / 2030):
1284+
# the SAME horizon vector is used for BOTH forward effects AND placebo
1285+
# positions. Passing `c(1, 2, 3)` with `placebo=2` errors because
1286+
# `max(c(1, 2, 3)) > placebo=2`. The `c(-1)` sentinel triggers "compute
1287+
# heterogeneity for ALL forward (1..effects) AND ALL placebo
1288+
# (1..placebo) positions" by replacing `het_effects` with `1:l_XX` in
1289+
# the forward block and `1:l_placebo_XX` in the placebo block. Forward
1290+
# rows are emitted with positive `effect` values (1, 2, 3); placebo
1291+
# rows with NEGATIVE values (-1, -2) per `effect = matrix(-i, ...)` at
1292+
# the placebo block's rbind site.
1293+
#
1294+
# The extended extract_dcdh_by_path_predict_het partitions the per-path
1295+
# predict_het table by `effect` sign: forward into `horizons`, placebo
1296+
# into `placebo_horizons`.
1297+
res22 <- did_multiplegt_dyn(
1298+
df = d20, outcome = "outcome", group = "group", time = "period",
1299+
treatment = "treatment", effects = 3, placebo = 2, by_path = 3,
1300+
dont_drop_larger_lower = TRUE,
1301+
predict_het = list("het_x", c(-1)),
1302+
ci_level = 95, graph_off = TRUE
1303+
)
1304+
scenarios$multi_path_reversible_predict_het_with_placebo <- list(
1305+
data = list(
1306+
group = as.numeric(d20$group),
1307+
period = as.numeric(d20$period),
1308+
treatment = as.numeric(d20$treatment),
1309+
outcome = as.numeric(d20$outcome),
1310+
het_x = as.numeric(d20$het_x)
1311+
),
1312+
params = list(pattern = "multi_path_reversible_predict_het_with_placebo",
1313+
n_switchers = n_switchers20, n_controls = n_controls20,
1314+
n_groups = n_groups20, n_periods = n_periods20,
1315+
seed = 120L, effects = 3, placebo = 2, by_path = 3,
1316+
predict_het_var = "het_x",
1317+
predict_het_horizons = c(-1),
1318+
ci_level = 95,
1319+
dont_drop_larger_lower = TRUE),
1320+
results = extract_dcdh_by_path_predict_het(res22, n_effects = 3)
1321+
)
12041322
}
12051323

12061324
# ---------------------------------------------------------------------------

0 commit comments

Comments
 (0)