Skip to content

Commit 6f525d6

Browse files
igerberclaude
andcommitted
Compose by_path / paths_of_interest with heterogeneity (Wave 5 #11)
Lifts the gate at chaisemartin_dhaultfoeuille.py:1230-1234 so per-path event-study disaggregation composes with heterogeneity="<col>" (Web Appendix Section 1.5, Lemma 7), mirroring R did_multiplegt_dyn(..., by_path, predict_het) per-by_level dispatch. Per-path heterogeneity is computed by re-running the Lemma 7 regression on each path-restricted switcher subsample. New `path_groups` (Optional[Set[int]]) parameter on _compute_heterogeneity_test restricts eligibility to switchers ON path p; the variance machinery (standard WLS vcov for non-survey, cell-period IF allocator for Binder TSL, group-level allocator for Rao-Wu replicate) is unchanged from the global heterogeneity path. Cohort dummies absorb baseline by construction, so multi-baseline switcher panels do not produce R-divergence (no parallel UserWarning like controls / trends_linear). Surfaces on results.path_heterogeneity_effects keyed {path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}} and on to_dataframe(level="by_path") via new always-present het_* columns, populated for positive horizons and NaN otherwise (mirrors cband_* / cumulated_* convention). Per-(path, horizon) inference is refreshed in the final R2 P1b block so all surfaces use the same df_survey after replicate-weight n_valid appends. R parity: introduces the FIRST predict_het R-parity baseline in the repo. Two new scenarios (multi_path_reversible_predict_het global anchor + multi_path_reversible_by_path_predict_het per-path) use dont_drop_larger_lower=TRUE to match drop_larger_lower=False and provide cohort variation under reversal paths. Per-path beta and SE match R within rtol=1e-6. Multiplier bootstrap (n_bootstrap > 0) under by_path + heterogeneity + survey_design inherits the existing per-path multiplier-bootstrap gate from PR #408. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 6a1e5b2 commit 6f525d6

10 files changed

Lines changed: 1191 additions & 31 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
88
## [Unreleased]
99

1010
### Added
11+
- **`ChaisemartinDHaultfoeuille.by_path` and `paths_of_interest` now compose with `heterogeneity="<col>"`** (Web Appendix Section 1.5, Lemma 7). Per-path heterogeneity coefficient is computed by re-running the Lemma 7 regression on each path-restricted switcher subsample. The path filter (`path_groups: Optional[Set[int]]`) restricts eligibility to switchers ON path `p` inside the inner regression; the variance machinery (standard WLS vcov for non-survey, cell-period IF allocator for Binder TSL, group-level allocator for Rao-Wu replicate) is unchanged from the global heterogeneity path. Cohort dummies in the design matrix absorb baseline by construction, so multi-baseline switcher panels do not produce R-divergence (no parallel `UserWarning` like `controls` / `trends_linear`). Surfaces on `results.path_heterogeneity_effects` keyed `{path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}}` and on `results.to_dataframe(level="by_path")` via new always-present `het_*` columns (`het_beta`, `het_se`, `het_t_stat`, `het_p_value`, `het_conf_int_lower`, `het_conf_int_upper`), populated for positive-horizon rows when `heterogeneity` is set and NaN otherwise (mirrors the `cband_*` and `cumulated_*` always-present convention). Composes with `survey_design` (analytical Binder TSL + replicate-weight bootstrap) via the existing PR #408 IF allocator path; under replicate weights, every per-(path, horizon) fit appends `n_valid` to the shared `_replicate_n_valid_list` accumulator and the final `_effective_df_survey` recomputation reflects all per-path appends. R parity verified against `did_multiplegt_dyn(..., by_path=3, predict_het=list("het_x", c(1,2,3)))` on the new `multi_path_reversible_by_path_predict_het` golden-value scenario; a sibling global anchor `multi_path_reversible_predict_het` introduces the FIRST `predict_het` R-parity baseline in the repo (no prior `TestDCDHDynRParityHeterogeneity` existed). Both R calls use `dont_drop_larger_lower=TRUE` to match the Python `drop_larger_lower=False` requirement and to provide cohort variation at every horizon under reversal paths. Per-path SE matches global SE bit-exactly on a single-path panel (telescope invariant, `atol=rtol=1e-14`). Multiplier bootstrap (`n_bootstrap > 0`) under `by_path + heterogeneity + survey_design` inherits the existing per-path multiplier-bootstrap-survey gate from PR #408. The `NotImplementedError` gate at `chaisemartin_dhaultfoeuille.py:1230-1234` is removed; `heterogeneity` precondition mutex with `controls` / `trends_linear` / `trends_nonparam` stays in place. Cross-surface invariants regression-tested at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathHeterogeneity` (~13 tests across gate dispatch, behavior, single-path telescope, zero-signal anti-regression, multi-baseline UserWarning anti-regression, DataFrame integration, edge cases) + `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneity` (global anchor) + `::TestDCDHDynRParityByPathHeterogeneity` (per-path). See `docs/methodology/REGISTRY.md` §`ChaisemartinDHaultfoeuille` `Note (Phase 3 by_path ...)` → "Per-path heterogeneity testing" for the full contract.
1112
- **Tutorial 21: HAD Pre-test Workflow** (`docs/tutorials/21_had_pretest_workflow.ipynb`) — composite pre-test walkthrough for `HeterogeneousAdoptionDiD` building on Tutorial 20's brand-campaign framing. Uses a 60-DMA × 8-week panel close in shape to T20's but with the dose distribution drawn from `Uniform[$0.01K, $50K]` (vs T20's `[$5K, $50K]`); the true support is strictly positive but very near zero, chosen so the QUG step in `did_had_pretest_workflow` fails-to-reject `H0: d_lower = 0` in this finite sample and the verdict text fires the load-bearing "Assumption 7 deferred" pivot for the upgrade-arc narrative. (HAD's `design="auto"` selector — a separate min/median heuristic at `had.py::_detect_design`, NOT the QUG p-value — independently lands on the `continuous_at_zero` identification path with target `WAS` on this panel because `d.min() < 0.01 * median(|d|)`. The QUG test and the design selector are independent rules that point to the same identification path here.) Walks through three surfaces: (a) `did_had_pretest_workflow(aggregate="overall")` on a two-period collapse, where the verdict explicitly flags Step 2 (Assumption 7 pre-trends) as not run because a single pre-period structurally cannot support a pre-trends test, and the structural fields `pretrends_joint` / `homogeneity_joint` are both `None`; (b) `did_had_pretest_workflow(aggregate="event_study")` on the full multi-period panel, where the verdict reads "TWFE admissible under Section 4 assumptions" because all three testable diagnostics (QUG + joint pre-trends Stute over 3 horizons + joint homogeneity Stute over 4 horizons) fail-to-reject — non-rejection evidence under finite-sample power and test specification, not proof that the identifying assumptions hold; and (c) a side panel exercising both `yatchew_hr_test` null modes — `null="linearity"` (default, paper Theorem 7) vs `null="mean_independence"` (Phase 4 R-parity with R `YatchewTest::yatchew_test(order=0)`) — on the within-pre-period first-difference paired with post-period dose, illustrating the stricter null's larger residual variance (`sigma2_lin` 7.01 vs 6.53) and smaller p-value (0.29 vs 0.49). Companion drift-test file `tests/test_t21_had_pretest_workflow_drift.py` (16 tests pinning panel composition, both verdict pivots, structural anchors on both paths, deterministic QUG / Yatchew statistics, bootstrap p-value tolerance bands per `feedback_bootstrap_drift_tests_need_backend_tolerance`, and `HAD(design="auto")` resolution to `continuous_at_zero` on this panel). T20's "Composite pretest workflow" Extensions bullet updated with a forward-pointer to T21. T22 weighted/survey HAD tutorial remains queued as a separate notebook PR.
1213
- **`ChaisemartinDHaultfoeuille.by_path` and `paths_of_interest` now compose with `survey_design`** for analytical Binder TSL SE and replicate-weight bootstrap variance. The `NotImplementedError` gate at `chaisemartin_dhaultfoeuille.py:1233-1239` is replaced by a per-path multiplier-bootstrap-only gate (`survey_design + n_bootstrap > 0` under by_path / paths_of_interest still raises, since the survey-aware perturbation pivot for path-restricted IFs is methodologically underived). Per-path SE routes through the existing `_survey_se_from_group_if` cell-period allocator: the per-period IF (`U_pp_l_path`) is built with non-path switcher-side contributions skipped (control contributions are unchanged, matching the joiners/leavers IF convention; preserves the row-sum identity `U_pp.sum(axis=1) == U`), cohort-recentered via `_cohort_recenter_per_period`, then expanded to observations as `psi_i = U_pp[g_i, t_i] · (w_i / W_{g_i, t_i})`. Replicate-weight designs unconditionally use the cell allocator (Class A contract from PR #323). New `_refresh_path_inference` helper post-call refreshes `safe_inference` on every populated entry across `multi_horizon_inference`, `placebo_horizon_inference`, `path_effects`, and `path_placebos` so all four surfaces use the same final `df_survey` after per-path replicate fits append `n_valid` to the shared accumulator. Path-enumeration ranking under `survey_design` remains unweighted (group-cardinality, not population-weight mass). Lonely-PSU policy stays sample-wide, not per-path. Telescope invariant: on a single-path panel, per-path SE matches the global non-by_path survey SE bit-exactly. **No R parity** — R `did_multiplegt_dyn` does not support survey weighting; this is a Python-only methodology extension. The global non-by_path TSL multiplier-bootstrap path is unaffected (anti-regression test `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathSurveyDesignAnalytical::test_global_survey_plus_n_bootstrap_still_works` locks the per-path-only scope of the new gate). Cross-surface invariants regression-tested at `TestByPathSurveyDesignAnalytical` (~17 tests across gate / dispatch / analytical SE / replicate-weight SE / per-path placebos / `trends_linear` composition / unobserved-path warnings / final-df refresh regressions) and `TestByPathSurveyDesignTelescope`. See `docs/methodology/REGISTRY.md` §`ChaisemartinDHaultfoeuille` `Note (Phase 3 by_path ...)` → "Per-path survey-design SE" for the full contract.
1314
- **Inference-field aliases on staggered result classes** for adapter / external-consumer compatibility. Read-only `@property` aliases expose the flat `att` / `se` / `conf_int` / `p_value` / `t_stat` names (matching `DiDResults` / `TROPResults` / `SyntheticDiDResults` / `HeterogeneousAdoptionDiDResults`) on every result class that previously only carried prefixed canonical fields: `CallawaySantAnnaResults`, `StackedDiDResults`, `EfficientDiDResults`, `ChaisemartinDHaultfoeuilleResults`, `StaggeredTripleDiffResults`, `WooldridgeDiDResults`, `SunAbrahamResults`, `ImputationDiDResults`, `TwoStageDiDResults` (mapping to `overall_*`); `ContinuousDiDResults` (mapping to `overall_att_*`, ATT-side as the headline, ACRT-side accessible unchanged via `overall_acrt_*`); `MultiPeriodDiDResults` (mapping to `avg_*`). `ContinuousDiDResults` additionally exposes `overall_se` / `overall_conf_int` / `overall_p_value` / `overall_t_stat` aliases for naming consistency with the rest of the staggered family. Aliases are pure read-throughs over the canonical fields — no recomputation, no behavior change — so the `safe_inference()` joint-NaN contract (per CLAUDE.md "Inference computation") is inherited automatically (NaN canonical → NaN alias, locked at `tests/test_result_aliases.py::test_pattern_b_aliases_propagate_nan`). The native `overall_*` / `overall_att_*` / `avg_*` fields remain canonical for documentation and computation. Motivated by the `balance.interop.diff_diff.as_balance_diagnostic()` adapter (`facebookresearch/balance` PR #465) which calls `getattr(res, "se", None)` / `getattr(res, "conf_int", None)` without a fallback chain — pre-alias, every staggered result class returned `None` on those keys, silently dropping `se` and `conf_int` from the adapter's diagnostic dict. 23 alias-mechanic + balance-adapter regression tests at `tests/test_result_aliases.py`. Patch-level (additive on stable surfaces).

benchmarks/R/generate_dcdh_dynr_test_values.R

Lines changed: 185 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -618,6 +618,61 @@ extract_dcdh_by_path <- function(res, n_effects, n_placebos = 0) {
618618
list(by_path = out)
619619
}
620620

621+
# Helper: extract global predict_het results. R's predict_het slot is at
622+
# res$results$predict_het, a data.frame with columns
623+
# {effect, covariate, Estimate, SE, t, LB, UB, N, pF}. Estimate is the
624+
# WLS coefficient on the heterogeneity covariate.
625+
extract_dcdh_predict_het <- function(res, n_effects) {
626+
ph <- res$results$predict_het
627+
horizons <- list()
628+
if (is.null(ph) || nrow(ph) == 0) return(list(predict_het = horizons))
629+
for (h in seq_len(min(n_effects, nrow(ph)))) {
630+
horizons[[as.character(ph$effect[h])]] <- list(
631+
beta = as.numeric(ph$Estimate[h]),
632+
se = as.numeric(ph$SE[h]),
633+
t = as.numeric(ph$t[h]),
634+
ci_lo = as.numeric(ph$LB[h]),
635+
ci_hi = as.numeric(ph$UB[h]),
636+
n_obs = as.numeric(ph$N[h]),
637+
p_value = as.numeric(ph$pF[h])
638+
)
639+
}
640+
list(predict_het = horizons)
641+
}
642+
643+
# Helper: extract per-path predict_het results. Under by_path=k +
644+
# predict_het, R's per-by_level dispatcher writes a predict_het table to
645+
# each res$by_level_i$results$predict_het. Output mirrors
646+
# extract_dcdh_by_path's shape with a horizons dict keyed by horizon.
647+
extract_dcdh_by_path_predict_het <- function(res, n_effects) {
648+
by_levels <- res$by_levels
649+
out <- list()
650+
for (i in seq_along(by_levels)) {
651+
slot <- res[[paste0("by_level_", i)]]
652+
ph <- slot$results$predict_het
653+
horizons <- list()
654+
if (!is.null(ph) && nrow(ph) > 0) {
655+
for (h in seq_len(min(n_effects, nrow(ph)))) {
656+
horizons[[as.character(ph$effect[h])]] <- list(
657+
beta = as.numeric(ph$Estimate[h]),
658+
se = as.numeric(ph$SE[h]),
659+
t = as.numeric(ph$t[h]),
660+
ci_lo = as.numeric(ph$LB[h]),
661+
ci_hi = as.numeric(ph$UB[h]),
662+
n_obs = as.numeric(ph$N[h]),
663+
p_value = as.numeric(ph$pF[h])
664+
)
665+
}
666+
}
667+
out[[i]] <- list(
668+
path = by_levels[i],
669+
frequency_rank = i,
670+
horizons = horizons
671+
)
672+
}
673+
list(by_path_predict_het = out)
674+
}
675+
621676
# Scenario 13: mixed_single_switch + by_path=2 (basic 2-path case).
622677
# The mixed_single_switch DGP produces joiners (path 0,1,1,1) and
623678
# leavers (path 1,0,0,0) as its only two observed paths at L_max=3, so
@@ -1018,6 +1073,136 @@ cat(" Scenario 19: multi_path_reversible_by_path_non_binary\n")
10181073
)
10191074
}
10201075

1076+
# Scenarios 20 + 21: predict_het R-parity (Wave 5 #11). Scenario 20 is
1077+
# the global anchor (predict_het without by_path); scenario 21 is the
1078+
# per-path version (by_path + predict_het). Both use the same DGP so the
1079+
# Python-side parity tests can calibrate atol on the global scenario and
1080+
# inherit the same atol for the per-path test. R's predict_het syntax
1081+
# requires a 2-element list: list(covariate_name, horizons_vec).
1082+
#
1083+
# DGP shape: 90 switchers + 30 never-treated controls, 10 periods, 3
1084+
# paths (0,1,1,1) / (0,1,0,0) / (0,1,1,0), F_g varies in {3,4,5}
1085+
# INDEPENDENTLY of path so each path has multiple cohorts. het_x is
1086+
# binary {0,1}, balanced across both switchers and controls. Effect =
1087+
# 5.0 + 3.0 * het_x to produce a detectable heterogeneity signal.
1088+
# Never-treated controls are required for R's predict_het to compute
1089+
# horizons l>=2 under reversal paths (otherwise R returns
1090+
# "max effects = 2" or empty cohort dummies). `dont_drop_larger_lower
1091+
# = TRUE` matches the Python by_path requirement that
1092+
# `drop_larger_lower=False`.
1093+
cat(" Scenarios 20/21: multi_path_reversible_predict_het + by_path version\n")
1094+
{
1095+
set.seed(120L)
1096+
n_switchers20 <- 90L
1097+
n_controls20 <- 30L
1098+
n_groups20 <- n_switchers20 + n_controls20
1099+
n_periods20 <- 10L
1100+
paths20 <- list(c(0L, 1L, 1L, 1L), c(0L, 1L, 0L, 0L), c(0L, 1L, 1L, 0L))
1101+
D20 <- matrix(0L, nrow = n_groups20, ncol = n_periods20)
1102+
het_x20 <- integer(n_groups20)
1103+
group_fe20 <- rnorm(n_groups20, 0, 2.0)
1104+
# Switchers (groups 1..n_switchers20)
1105+
for (g in seq_len(n_switchers20)) {
1106+
F_g_choice <- ((g - 1L) %/% 3L) %% 3L
1107+
F_g <- 3L + F_g_choice
1108+
path_idx <- ((g - 1L) %% 3L) + 1L
1109+
pp <- paths20[[path_idx]]
1110+
het_x20[g] <- if (g <= n_switchers20 %/% 2L) 1L else 0L
1111+
for (j in seq_along(pp)) {
1112+
t <- F_g - 1L + j
1113+
if (t >= 1L && t <= n_periods20) D20[g, t] <- pp[j]
1114+
}
1115+
if (F_g - 1L + length(pp) <= n_periods20) {
1116+
tail_t <- (F_g - 1L + length(pp)):n_periods20
1117+
D20[g, tail_t] <- pp[length(pp)]
1118+
}
1119+
}
1120+
# Never-treated controls (groups n_switchers20+1..n_groups20).
1121+
# Balanced het_x assignment so the heterogeneity covariate has
1122+
# variation in the control pool too.
1123+
for (g in (n_switchers20 + 1L):n_groups20) {
1124+
k <- g - n_switchers20
1125+
het_x20[g] <- if (k <= n_controls20 %/% 2L) 1L else 0L
1126+
}
1127+
noise20 <- matrix(rnorm(n_groups20 * n_periods20, 0, 0.5),
1128+
nrow = n_groups20, ncol = n_periods20)
1129+
period_arr20 <- 0:(n_periods20 - 1L)
1130+
effect20 <- 5.0 + 3.0 * het_x20 # heterogeneity signal
1131+
Y20 <- matrix(group_fe20, nrow = n_groups20, ncol = n_periods20) +
1132+
matrix(0.5 * period_arr20, nrow = n_groups20, ncol = n_periods20, byrow = TRUE) +
1133+
matrix(effect20, nrow = n_groups20, ncol = n_periods20) * D20 +
1134+
noise20
1135+
d20 <- data.frame(
1136+
group = rep(seq_len(n_groups20) - 1L, each = n_periods20),
1137+
period = rep(period_arr20, n_groups20),
1138+
treatment = as.vector(t(D20)),
1139+
outcome = as.vector(t(Y20)),
1140+
het_x = rep(het_x20, each = n_periods20)
1141+
)
1142+
1143+
# Scenario 20: global predict_het anchor (no by_path).
1144+
# `dont_drop_larger_lower = TRUE` preserves multi-switch cohorts so the
1145+
# heterogeneity regression has cohort variation at every horizon.
1146+
# Without this, R drops off-switch paths at l>=2, leaving a single
1147+
# cohort and triggering the `prod_het_l_XX ~ het_x + ` empty-cohort
1148+
# error. dCDH `by_path` requires `drop_larger_lower=False` on the Python
1149+
# side anyway, so this flag is consistent with the per-path scope.
1150+
res20 <- did_multiplegt_dyn(
1151+
df = d20, outcome = "outcome", group = "group", time = "period",
1152+
treatment = "treatment", effects = 3,
1153+
dont_drop_larger_lower = TRUE,
1154+
predict_het = list("het_x", c(1, 2, 3)),
1155+
ci_level = 95, graph_off = TRUE
1156+
)
1157+
scenarios$multi_path_reversible_predict_het <- list(
1158+
data = list(
1159+
group = as.numeric(d20$group),
1160+
period = as.numeric(d20$period),
1161+
treatment = as.numeric(d20$treatment),
1162+
outcome = as.numeric(d20$outcome),
1163+
het_x = as.numeric(d20$het_x)
1164+
),
1165+
params = list(pattern = "multi_path_reversible_predict_het",
1166+
n_switchers = n_switchers20, n_controls = n_controls20,
1167+
n_groups = n_groups20, n_periods = n_periods20,
1168+
seed = 120L, effects = 3,
1169+
predict_het_var = "het_x",
1170+
predict_het_horizons = c(1, 2, 3),
1171+
ci_level = 95,
1172+
dont_drop_larger_lower = TRUE),
1173+
results = extract_dcdh_predict_het(res20, n_effects = 3)
1174+
)
1175+
1176+
# Scenario 21: by_path + predict_het (per-path version on same DGP).
1177+
# `dont_drop_larger_lower = TRUE` matches scenario 20 + Python
1178+
# by_path's `drop_larger_lower=False` requirement.
1179+
res21 <- did_multiplegt_dyn(
1180+
df = d20, outcome = "outcome", group = "group", time = "period",
1181+
treatment = "treatment", effects = 3, by_path = 3,
1182+
dont_drop_larger_lower = TRUE,
1183+
predict_het = list("het_x", c(1, 2, 3)),
1184+
ci_level = 95, graph_off = TRUE
1185+
)
1186+
scenarios$multi_path_reversible_by_path_predict_het <- list(
1187+
data = list(
1188+
group = as.numeric(d20$group),
1189+
period = as.numeric(d20$period),
1190+
treatment = as.numeric(d20$treatment),
1191+
outcome = as.numeric(d20$outcome),
1192+
het_x = as.numeric(d20$het_x)
1193+
),
1194+
params = list(pattern = "multi_path_reversible_by_path_predict_het",
1195+
n_switchers = n_switchers20, n_controls = n_controls20,
1196+
n_groups = n_groups20, n_periods = n_periods20,
1197+
seed = 120L, effects = 3, by_path = 3,
1198+
predict_het_var = "het_x",
1199+
predict_het_horizons = c(1, 2, 3),
1200+
ci_level = 95,
1201+
dont_drop_larger_lower = TRUE),
1202+
results = extract_dcdh_by_path_predict_het(res21, n_effects = 3)
1203+
)
1204+
}
1205+
10211206
# ---------------------------------------------------------------------------
10221207
# Write output
10231208
# ---------------------------------------------------------------------------

0 commit comments

Comments
 (0)