Compose by_path / paths_of_interest with heterogeneity (Wave 5 #11)

Lifts the gate at chaisemartin_dhaultfoeuille.py:1230-1234 so per-path
event-study disaggregation composes with heterogeneity="<col>" (Web
Appendix Section 1.5, Lemma 7), mirroring R did_multiplegt_dyn(...,
by_path, predict_het) per-by_level dispatch.

Per-path heterogeneity is computed by re-running the Lemma 7 regression
on each path-restricted switcher subsample. New `path_groups`
(Optional[Set[int]]) parameter on _compute_heterogeneity_test restricts
eligibility to switchers ON path p; the variance machinery (standard
WLS vcov for non-survey, cell-period IF allocator for Binder TSL,
group-level allocator for Rao-Wu replicate) is unchanged from the
global heterogeneity path. Cohort dummies absorb baseline by
construction, so multi-baseline switcher panels do not produce
R-divergence (no parallel UserWarning like controls / trends_linear).
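
To make the path restriction concrete, here is a minimal sketch of the idea, not the library's code. It is a toy stand-in for the Lemma 7 regression: a single-covariate slope with cohort dummies absorbed by within-cohort demeaning. The `het_slope` function, the `Row` layout, and the data are hypothetical; only the `path_groups: Optional[Set[int]]` filter is taken from the description above.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy record: (group_id, cohort_id, het_covariate, outcome_change)
Row = Tuple[int, int, float, float]


def het_slope(rows: List[Row], path_groups: Optional[Set[int]] = None) -> float:
    """Slope of outcome change on the covariate with cohort dummies absorbed.

    path_groups=None keeps every switcher (global fit); a set keeps only
    switchers whose group id is in it (path-restricted fit).
    """
    if path_groups is not None:
        rows = [r for r in rows if r[0] in path_groups]

    # Demeaning within cohort is equivalent to including a full set of
    # cohort dummies (Frisch-Waugh-Lovell), so cohort baselines drop out.
    sums: Dict[int, List[float]] = defaultdict(lambda: [0.0, 0.0, 0.0])
    for _, cohort, x, y in rows:
        acc = sums[cohort]
        acc[0] += x
        acc[1] += y
        acc[2] += 1.0

    sxy = sxx = 0.0
    for _, cohort, x, y in rows:
        sum_x, sum_y, n = sums[cohort]
        xd, yd = x - sum_x / n, y - sum_y / n
        sxy += xd * yd
        sxx += xd * xd
    if sxx == 0.0:
        return float("nan")  # no within-cohort covariate variation
    return sxy / sxx


# Made-up data: groups 1-4 sit on one path, groups 5-6 on another.
rows: List[Row] = [
    (1, 0, 0.0, 1.0),
    (2, 0, 1.0, 3.0),
    (3, 1, 0.0, 5.0),
    (4, 1, 2.0, 9.0),
    (5, 0, 0.0, 2.0),
    (6, 0, 1.0, 1.0),
]

print(het_slope(rows))                # global fit over all switchers
print(het_slope(rows, {1, 2, 3, 4}))  # restricted to the first path
print(het_slope(rows, {5, 6}))        # restricted to the second path
```

Passing every group id as `path_groups` reproduces the global fit exactly, which is the toy analogue of checking a path-restricted fit against the global one on a single-path panel.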
Surfaces on results.path_heterogeneity_effects keyed
{path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}} and on
to_dataframe(level="by_path") via new always-present het_* columns,
populated for positive horizons and NaN otherwise (mirrors cband_* /
cumulated_* convention). Per-(path, horizon) inference is refreshed
in the final R2 P1b block so all surfaces use the same df_survey
after replicate-weight n_valid appends.
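
The always-present column convention can be illustrated with a small sketch. It flattens a nested `{path: {l: {...}}}` mapping into rows whose `het_*` columns exist on every row and are filled only for positive horizons. The `by_path_rows` helper, the `path` / `horizon` row keys, the assumption that `conf_int` is a `(lower, upper)` pair, and all numbers are hypothetical placeholders, not output from the estimator.

```python
import math
from typing import Any, Dict, List, Tuple

PathKey = Tuple[int, ...]
NAN = float("nan")

# Column names as listed in the CHANGELOG entry for this change.
HET_COLUMNS = (
    "het_beta", "het_se", "het_t_stat", "het_p_value",
    "het_conf_int_lower", "het_conf_int_upper",
)


def by_path_rows(
    path_het: Dict[PathKey, Dict[int, Dict[str, Any]]],
    horizons_by_path: Dict[PathKey, List[int]],
) -> List[Dict[str, Any]]:
    """Build one row per (path, horizon) with always-present het_* columns."""
    rows: List[Dict[str, Any]] = []
    for path, horizons in horizons_by_path.items():
        for l in horizons:
            row: Dict[str, Any] = {"path": path, "horizon": l}
            # Start every row with NaN so the columns are always present.
            row.update({col: NAN for col in HET_COLUMNS})
            # Only positive horizons can carry a heterogeneity estimate.
            entry = path_het.get(path, {}).get(l) if l > 0 else None
            if entry is not None:
                lower, upper = entry["conf_int"]
                row.update(
                    het_beta=entry["beta"],
                    het_se=entry["se"],
                    het_t_stat=entry["t_stat"],
                    het_p_value=entry["p_value"],
                    het_conf_int_lower=lower,
                    het_conf_int_upper=upper,
                )
            rows.append(row)
    return rows


# Placeholder values only; they are not real estimates.
path_het = {
    (0, 1, 1): {
        1: {"beta": 0.5, "se": 0.25, "t_stat": 2.0, "p_value": 0.05,
            "conf_int": (0.0, 1.0), "n_obs": 40},
    }
}
horizons_by_path = {(0, 1, 1): [-1, 1, 2]}

table = by_path_rows(path_het, horizons_by_path)
for row in table:
    print(row["horizon"], row["het_beta"], math.isnan(row["het_se"]))
```

Rows for the placebo horizon and for a positive horizon without an estimate both keep the columns and carry NaN, so downstream code can select `het_*` unconditionally.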
R parity: introduces the FIRST predict_het R-parity baseline in the
repo. Two new scenarios (multi_path_reversible_predict_het global
anchor + multi_path_reversible_by_path_predict_het per-path) use
dont_drop_larger_lower=TRUE to match drop_larger_lower=False and
provide cohort variation under reversal paths. Per-path beta and SE
match R within rtol=1e-6.

Multiplier bootstrap (n_bootstrap > 0) under by_path + heterogeneity
+ survey_design inherits the existing per-path multiplier-bootstrap
gate from PR #408.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CHANGELOG.md: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Added
+
- **`ChaisemartinDHaultfoeuille.by_path` and `paths_of_interest` now compose with `heterogeneity="<col>"`** (Web Appendix Section 1.5, Lemma 7). Per-path heterogeneity coefficient is computed by re-running the Lemma 7 regression on each path-restricted switcher subsample. The path filter (`path_groups: Optional[Set[int]]`) restricts eligibility to switchers ON path `p` inside the inner regression; the variance machinery (standard WLS vcov for non-survey, cell-period IF allocator for Binder TSL, group-level allocator for Rao-Wu replicate) is unchanged from the global heterogeneity path. Cohort dummies in the design matrix absorb baseline by construction, so multi-baseline switcher panels do not produce R-divergence (no parallel `UserWarning` like `controls` / `trends_linear`). Surfaces on `results.path_heterogeneity_effects` keyed `{path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}}` and on `results.to_dataframe(level="by_path")` via new always-present `het_*` columns (`het_beta`, `het_se`, `het_t_stat`, `het_p_value`, `het_conf_int_lower`, `het_conf_int_upper`), populated for positive-horizon rows when `heterogeneity` is set and NaN otherwise (mirrors the `cband_*` and `cumulated_*` always-present convention). Composes with `survey_design` (analytical Binder TSL + replicate-weight bootstrap) via the existing PR #408 IF allocator path; under replicate weights, every per-(path, horizon) fit appends `n_valid` to the shared `_replicate_n_valid_list` accumulator and the final `_effective_df_survey` recomputation reflects all per-path appends. R parity verified against `did_multiplegt_dyn(..., by_path=3, predict_het=list("het_x", c(1,2,3)))` on the new `multi_path_reversible_by_path_predict_het` golden-value scenario; a sibling global anchor `multi_path_reversible_predict_het` introduces the FIRST `predict_het` R-parity baseline in the repo (no prior `TestDCDHDynRParityHeterogeneity` existed). 
Both R calls use `dont_drop_larger_lower=TRUE` to match the Python `drop_larger_lower=False` requirement and to provide cohort variation at every horizon under reversal paths. Per-path SE matches global SE bit-exactly on a single-path panel (telescope invariant, `atol=rtol=1e-14`). Multiplier bootstrap (`n_bootstrap > 0`) under `by_path + heterogeneity + survey_design` inherits the existing per-path multiplier-bootstrap-survey gate from PR #408. The `NotImplementedError` gate at `chaisemartin_dhaultfoeuille.py:1230-1234` is removed; `heterogeneity` precondition mutex with `controls` / `trends_linear` / `trends_nonparam` stays in place. Cross-surface invariants regression-tested at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathHeterogeneity` (~13 tests across gate dispatch, behavior, single-path telescope, zero-signal anti-regression, multi-baseline UserWarning anti-regression, DataFrame integration, edge cases) + `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneity` (global anchor) + `::TestDCDHDynRParityByPathHeterogeneity` (per-path). See `docs/methodology/REGISTRY.md` §`ChaisemartinDHaultfoeuille` `Note (Phase 3 by_path ...)` → "Per-path heterogeneity testing" for the full contract.
- **Tutorial 21: HAD Pre-test Workflow** (`docs/tutorials/21_had_pretest_workflow.ipynb`) — composite pre-test walkthrough for `HeterogeneousAdoptionDiD` building on Tutorial 20's brand-campaign framing. Uses a 60-DMA × 8-week panel close in shape to T20's but with the dose distribution drawn from `Uniform[$0.01K, $50K]` (vs T20's `[$5K, $50K]`); the true support is strictly positive but very near zero, chosen so the QUG step in `did_had_pretest_workflow` fails-to-reject `H0: d_lower = 0` in this finite sample and the verdict text fires the load-bearing "Assumption 7 deferred" pivot for the upgrade-arc narrative. (HAD's `design="auto"` selector — a separate min/median heuristic at `had.py::_detect_design`, NOT the QUG p-value — independently lands on the `continuous_at_zero` identification path with target `WAS` on this panel because `d.min() < 0.01 * median(|d|)`. The QUG test and the design selector are independent rules that point to the same identification path here.) Walks through three surfaces: (a) `did_had_pretest_workflow(aggregate="overall")` on a two-period collapse, where the verdict explicitly flags Step 2 (Assumption 7 pre-trends) as not run because a single pre-period structurally cannot support a pre-trends test, and the structural fields `pretrends_joint` / `homogeneity_joint` are both `None`; (b) `did_had_pretest_workflow(aggregate="event_study")` on the full multi-period panel, where the verdict reads "TWFE admissible under Section 4 assumptions" because all three testable diagnostics (QUG + joint pre-trends Stute over 3 horizons + joint homogeneity Stute over 4 horizons) fail-to-reject — non-rejection evidence under finite-sample power and test specification, not proof that the identifying assumptions hold; and (c) a side panel exercising both `yatchew_hr_test` null modes — `null="linearity"` (default, paper Theorem 7) vs `null="mean_independence"` (Phase 4 R-parity with R `YatchewTest::yatchew_test(order=0)`) — on the within-pre-period 
first-difference paired with post-period dose, illustrating the stricter null's larger residual variance (`sigma2_lin` 7.01 vs 6.53) and smaller p-value (0.29 vs 0.49). Companion drift-test file `tests/test_t21_had_pretest_workflow_drift.py` (16 tests pinning panel composition, both verdict pivots, structural anchors on both paths, deterministic QUG / Yatchew statistics, bootstrap p-value tolerance bands per `feedback_bootstrap_drift_tests_need_backend_tolerance`, and `HAD(design="auto")` resolution to `continuous_at_zero` on this panel). T20's "Composite pretest workflow" Extensions bullet updated with a forward-pointer to T21. T22 weighted/survey HAD tutorial remains queued as a separate notebook PR.
- **`ChaisemartinDHaultfoeuille.by_path` and `paths_of_interest` now compose with `survey_design`** for analytical Binder TSL SE and replicate-weight bootstrap variance. The `NotImplementedError` gate at `chaisemartin_dhaultfoeuille.py:1233-1239` is replaced by a per-path multiplier-bootstrap-only gate (`survey_design + n_bootstrap > 0` under by_path / paths_of_interest still raises, since the survey-aware perturbation pivot for path-restricted IFs is methodologically underived). Per-path SE routes through the existing `_survey_se_from_group_if` cell-period allocator: the per-period IF (`U_pp_l_path`) is built with non-path switcher-side contributions skipped (control contributions are unchanged, matching the joiners/leavers IF convention; preserves the row-sum identity `U_pp.sum(axis=1) == U`), cohort-recentered via `_cohort_recenter_per_period`, then expanded to observations as `psi_i = U_pp[g_i, t_i] · (w_i / W_{g_i, t_i})`. Replicate-weight designs unconditionally use the cell allocator (Class A contract from PR #323). New `_refresh_path_inference` helper post-call refreshes `safe_inference` on every populated entry across `multi_horizon_inference`, `placebo_horizon_inference`, `path_effects`, and `path_placebos` so all four surfaces use the same final `df_survey` after per-path replicate fits append `n_valid` to the shared accumulator. Path-enumeration ranking under `survey_design` remains unweighted (group-cardinality, not population-weight mass). Lonely-PSU policy stays sample-wide, not per-path. Telescope invariant: on a single-path panel, per-path SE matches the global non-by_path survey SE bit-exactly. **No R parity** — R `did_multiplegt_dyn` does not support survey weighting; this is a Python-only methodology extension. 
The global non-by_path TSL multiplier-bootstrap path is unaffected (anti-regression test `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathSurveyDesignAnalytical::test_global_survey_plus_n_bootstrap_still_works` locks the per-path-only scope of the new gate). Cross-surface invariants regression-tested at `TestByPathSurveyDesignAnalytical` (~17 tests across gate / dispatch / analytical SE / replicate-weight SE / per-path placebos / `trends_linear` composition / unobserved-path warnings / final-df refresh regressions) and `TestByPathSurveyDesignTelescope`. See `docs/methodology/REGISTRY.md` §`ChaisemartinDHaultfoeuille` `Note (Phase 3 by_path ...)` → "Per-path survey-design SE" for the full contract.
- **Inference-field aliases on staggered result classes** for adapter / external-consumer compatibility. Read-only `@property` aliases expose the flat `att` / `se` / `conf_int` / `p_value` / `t_stat` names (matching `DiDResults` / `TROPResults` / `SyntheticDiDResults` / `HeterogeneousAdoptionDiDResults`) on every result class that previously only carried prefixed canonical fields: `CallawaySantAnnaResults`, `StackedDiDResults`, `EfficientDiDResults`, `ChaisemartinDHaultfoeuilleResults`, `StaggeredTripleDiffResults`, `WooldridgeDiDResults`, `SunAbrahamResults`, `ImputationDiDResults`, `TwoStageDiDResults` (mapping to `overall_*`); `ContinuousDiDResults` (mapping to `overall_att_*`, ATT-side as the headline, ACRT-side accessible unchanged via `overall_acrt_*`); `MultiPeriodDiDResults` (mapping to `avg_*`). `ContinuousDiDResults` additionally exposes `overall_se` / `overall_conf_int` / `overall_p_value` / `overall_t_stat` aliases for naming consistency with the rest of the staggered family. Aliases are pure read-throughs over the canonical fields — no recomputation, no behavior change — so the `safe_inference()` joint-NaN contract (per CLAUDE.md "Inference computation") is inherited automatically (NaN canonical → NaN alias, locked at `tests/test_result_aliases.py::test_pattern_b_aliases_propagate_nan`). The native `overall_*` / `overall_att_*` / `avg_*` fields remain canonical for documentation and computation. Motivated by the `balance.interop.diff_diff.as_balance_diagnostic()` adapter (`facebookresearch/balance` PR #465) which calls `getattr(res, "se", None)` / `getattr(res, "conf_int", None)` without a fallback chain — pre-alias, every staggered result class returned `None` on those keys, silently dropping `se` and `conf_int` from the adapter's diagnostic dict. 23 alias-mechanic + balance-adapter regression tests at `tests/test_result_aliases.py`. Patch-level (additive on stable surfaces).
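
The read-through alias pattern described in the last entry can be sketched as follows. This is an illustration under assumptions, not the library's code: `ToyStaggeredResults` is a hypothetical stand-in for a result class that carries only `overall_*` canonical fields, and the values are placeholders.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ToyStaggeredResults:
    """Hypothetical result class with prefixed canonical fields."""

    overall_att: float
    overall_se: float
    overall_conf_int: Tuple[float, float]

    # Read-only aliases: pure read-throughs, no recomputation.
    @property
    def att(self) -> float:
        return self.overall_att

    @property
    def se(self) -> float:
        return self.overall_se

    @property
    def conf_int(self) -> Tuple[float, float]:
        return self.overall_conf_int


res = ToyStaggeredResults(overall_att=1.0, overall_se=0.1, overall_conf_int=(0.8, 1.2))

# An adapter that probes flat names without a fallback chain now finds values.
print(getattr(res, "se", None))
print(getattr(res, "conf_int", None))

# NaN in the canonical field shows up as NaN in the alias.
nan_res = ToyStaggeredResults(float("nan"), float("nan"), (float("nan"), float("nan")))
print(math.isnan(nan_res.se))
```

Because the properties define no setter, assigning to `res.se` raises `AttributeError`, which keeps the prefixed fields as the single source of truth.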