TODO.md
Lines changed: 0 additions & 1 deletion
@@ -74,7 +74,6 @@ Deferred items from PR reviews that were not addressed before merge.
| Issue | Location | PR | Priority |
|-------|----------|----|----------|
-
| BaconDecomposition R parity goldens: `bacondecomp` R package not installed in the local R 4.5.2 library at PR-B authoring time (2026-05-16). R generator script committed at `benchmarks/R/generate_bacon_golden.R`; running it requires `install.packages("bacondecomp")` + `install.packages("jsonlite")` then `cd benchmarks/R && Rscript generate_bacon_golden.R`, writing `benchmarks/data/r_bacondecomp_golden.json`. `tests/test_methodology_bacon.py::TestBaconParityR` (3 tests) skips with a pointer until the JSON lands. The PR-B audit substantiates Theorem 1 (Eqs. 7-9 + 10e-g) via hand-calculable + machine-precision identity tests; R parity is desirable as a cross-language anchor but not the only substantiation. Mirrors StaggeredTripleDifference precedent (PR #245). |`benchmarks/R/generate_bacon_golden.R`, `benchmarks/data/r_bacondecomp_golden.json` (TBD), `tests/test_methodology_bacon.py::TestBaconParityR`| follow-up | Medium |
| dCDH: Phase 1 per-period placebo DID_M^pl has NaN SE (no IF derivation for the per-period aggregation path). Multi-horizon placebos (L_max >= 1) have valid SE. |`chaisemartin_dhaultfoeuille.py`|#294| Low |
| dCDH: Survey cell-period allocator's post-period attribution is a library convention, not derived from the observation-level survey linearization. MC coverage is empirically close to nominal on the test DGP; a formal derivation (or a covariance-aware two-cell alternative) is deferred. Documented in REGISTRY.md survey IF expansion Note. |`chaisemartin_dhaultfoeuille.py`, `docs/methodology/REGISTRY.md`|#408| Medium |
| dCDH: Parity test SE/CI assertions only cover pure-direction scenarios; mixed-direction SE comparison is structurally apples-to-oranges (cell-count vs obs-count weighting). |`test_chaisemartin_dhaultfoeuille_parity.py`|#294| Low |
docs/methodology/REGISTRY.md
Lines changed: 7 additions & 5 deletions
@@ -2616,7 +2616,7 @@ Shipped in `diff_diff/had_pretests.py` as `stute_joint_pretest()` (residuals-in
*Assumption checks / warnings:*
- Requires variation in treatment timing (staggered adoption)
-
- Always-treated units (`first_treat <= min(time)`, excluding the never-treated sentinels `0` and `np.inf`; paper footnote 11) are automatically remapped to the `U` (untreated) bucket with a `UserWarning`; see the `**Note (always-treated remap)**` below for the full ordered-time / sentinel contract
+
- Always-treated units (`first_treat <= min(time)`, excluding the never-treated sentinels `0` and `np.inf`; per paper footnote 11 with a library-convention extension on the first-period boundary case, see `**Deviation (first-period boundary extension)**` below) are automatically remapped to the `U` (untreated) bucket with a `UserWarning`; see the `**Note (always-treated remap)**` below for the full ordered-time / sentinel contract
- Unbalanced panels are accepted with a `UserWarning`; the paper's Appendix A proof assumes balanced panels
- Falls back to timing-only comparisons when no never-treated units are present (no untreated group → `s_{kU}` terms drop, weights rescale to sum to 1; **VWCT and ΔATT can still bias the result** — see paper Eqs. 14-15)
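The always-treated remap rule in the bullets above can be illustrated with a minimal sketch. This is not the library's code: `remap_always_treated` is a hypothetical helper, it assumes one `first_treat` value per unit, and it uses `np.inf` as the stand-in marker for the `U` bucket. The `inclusive` flag switches between the `first_treat <= min(time)` rule described here and a strict `<` comparison.

```python
import numpy as np
import pandas as pd

# Never-treated markers named in the registry text.
NEVER_TREATED_SENTINELS = [0.0, np.inf]


def remap_always_treated(first_treat, time_values, inclusive=True):
    """Hypothetical helper: return (remapped first_treat, number of remapped units).

    first_treat : pd.Series with one entry per unit
    time_values : array-like of observed period labels
    inclusive   : True  -> first_treat <= min(time)
                  False -> first_treat <  min(time)
    """
    ft = first_treat.astype(float)
    t_min = float(np.min(time_values))

    # Sentinels are never-treated markers, so they are never "always treated".
    is_sentinel = ft.isin(NEVER_TREATED_SENTINELS)
    always = (ft <= t_min) if inclusive else (ft < t_min)
    always = always & ~is_sentinel

    # Assumption of this sketch: np.inf marks the U (untreated) bucket.
    remapped = ft.where(~always, np.inf)
    return remapped, int(always.sum())


# Event-time panel with labels -2..3, as in the registry example.
event_time = np.arange(-2, 4)
cohorts = pd.Series([-3.0, -1.0, 2.0, 0.0, np.inf], index=list("abcde"))
remapped, n_remapped = remap_always_treated(cohorts, event_time)
```

In this toy input, unit `a` (`first_treat=-3`) falls before the first observed period and is moved to `U`, while unit `b` (`first_treat=-1`) stays a valid timing group. The original `cohorts` series is left unchanged because `where` returns a new object.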
@@ -2668,7 +2668,7 @@ Where `n_k` is the sample share of timing group `k`, `n_{kℓ} = n_k / (n_k + n_
- Always-treated units: see `**Note (always-treated remap)**` below
**Reference implementation(s):**
-
- R: `bacondecomp::bacon()` (CRAN). Parity script at `benchmarks/R/generate_bacon_golden.R`; goldens pending follow-up R install (see TODO.md).
+
- R: `bacondecomp::bacon()` (CRAN). Parity script at `benchmarks/R/generate_bacon_golden.R`; goldens committed at `benchmarks/data/r_bacondecomp_golden.json` (generated against `bacondecomp` 0.1.1 + R 4.5.2). Parity validated at `atol=1e-6` via `tests/test_methodology_bacon.py::TestBaconParityR` (4 tests: TWFE coefficient + weights-sum match across 3 fixtures; per-component estimate + weight parity locked on the 2 non-remap fixtures and on the 6 timing-vs-timing rows of `always_treated_remapped`; the U-bucket convention divergence on `always_treated_remapped` is pinned by a dedicated fold-back test).
-
- [ ] R `bacondecomp::bacon()` parity at atol=1e-6 (R generator script committed; JSON goldens pending follow-up R install; `tests/test_methodology_bacon.py::TestBaconParityR` skips when missing)
+
- [x] R `bacondecomp::bacon()` parity at atol=1e-6 (3 fixtures; TWFE coefficient + weights-sum match across all 3; per-component parity locked on the 2 non-remap fixtures and on the 6 timing-vs-timing rows of `always_treated_remapped`; the U-bucket fold-back is asserted by a dedicated `test_always_treated_remapped_fold_back_matches_r`; see `**Note (R parity convention divergence)**` below)
- [x] Survey design support (Phase 3): weighted cell means, weighted within-transform, weighted group shares
-
- **Note (weight modes):** `weights="exact"` (default, paper-faithful Eqs. 7-9 + 10e-g) vs `weights="approximate"` (simplified variance, opt-in for speed-sensitive diagnostic loops). The PR-A paper review (#451) and PR-B audit established `"exact"` as the default with the **intent** to match R `bacondecomp::bacon()` and the paper's Theorem 1 contract; R parity is validated by hand-calculation (atol=1e-10) and TWFE-vs-weighted-sum identity (atol=1e-10) but the direct R bit-by-bit parity at atol=1e-6 is still pending the R `bacondecomp` install; see Test Coverage checklist above. The approximate path is retained for backward compatibility; numerical output may differ from R.
-
- **Note (always-treated remap):** Units whose `first_treat` is at or before the first observable period (`first_treat <= min(time)`, excluding the never-treated sentinels `0` and `np.inf`) are automatically remapped to the `U` bucket via an internal column (`__bacon_first_treat_internal__`) with a `UserWarning` — per paper footnote 11. Detection uses ordered-time logic on the **time axis**, so panels whose `time` column has negative or zero-crossing labels (e.g. event-time `time ∈ [-2,..,3]`) are handled correctly: a cohort at `first_treat=-1` on such a panel is a valid timing group; a cohort at `first_treat=-3` is remapped to U. The user's original `first_treat` column on the input `data` frame is preserved unchanged. The count of remapped units is surfaced via `BaconDecompositionResults.n_always_treated_remapped`. **Sentinel restriction:** `first_treat ∈ {0, np.inf}` is reserved as the never-treated marker and is not configurable today; a real treatment cohort with `first_treat == 0` would be folded into `U` and should be re-labeled to a non-sentinel value before fitting. The `0` reservation applies to `first_treat` only, not to `time`.
+
- **Note (weight modes):** `weights="exact"` (default, paper-faithful Eqs. 7-9 + 10e-g) vs `weights="approximate"` (simplified variance, opt-in for speed-sensitive diagnostic loops). The PR-A paper review (#451) and PR-B audit established `"exact"` as the default to match R `bacondecomp::bacon()` and the paper's Theorem 1 contract; R parity is validated at `atol=1e-6` (see `**Note (R parity convention divergence)**` below for the one structural convention difference). Hand-calculation + TWFE-vs-weighted-sum identity hold at `atol=1e-10`. The approximate path is retained for backward compatibility; numerical output may differ from R.
+
- **Note (always-treated remap):** Units whose `first_treat` is at or before the first observable period (`first_treat <= min(time)`, excluding the never-treated sentinels `0` and `np.inf`) are automatically remapped to the `U` bucket via an internal column (`__bacon_first_treat_internal__`) with a `UserWarning` — per paper footnote 11 (with a library boundary extension on `first_treat == min(time)`; see `**Deviation (first-period boundary extension)**` below). Detection uses ordered-time logic on the **time axis**, so panels whose `time` column has negative or zero-crossing labels (e.g. event-time `time ∈ [-2,..,3]`) are handled correctly: a cohort at `first_treat=-1` on such a panel is a valid timing group; a cohort at `first_treat=-3` is remapped to U. The user's original `first_treat` column on the input `data` frame is preserved unchanged. The count of remapped units is surfaced via `BaconDecompositionResults.n_always_treated_remapped`. **Sentinel restriction:** `first_treat ∈ {0, np.inf}` is reserved as the never-treated marker and is not configurable today; a real treatment cohort with `first_treat == 0` would be folded into `U` and should be re-labeled to a non-sentinel value before fitting. The `0` reservation applies to `first_treat` only, not to `time`.
- **Note (Bacon survey diagnostic):** Bacon decomposition with survey weights is diagnostic; exact-sum guarantee holds at machine precision under `weights="exact"` **on balanced panels**. `weights="exact"` requires within-unit-constant survey columns (approximate path accepts time-varying weights).
+
- **Note (R parity convention divergence on always-treated):** R `bacondecomp::bacon()` keeps `first_treat=1` (the always-treated cohort) as a separate timing cohort and emits an additional comparison type `Later vs Always Treated` (cohort k vs the always-treated cell) alongside the standard `Treated vs Untreated` row. Python's footnote-11 convention remaps these units to the `U` bucket and folds those R-side rows into a single `treated_vs_never` cell per treated cohort. The aggregate (TWFE coefficient + sum of weights) is invariant to this re-bucketing — Theorem 1's identity holds identically because the U bucket's total weight gets re-allocated across nested 2x2 cells but the total weight on `{cohort_k vs U}` is the same. The per-component breakdown, however, differs structurally between the two conventions. The R parity test (`tests/test_methodology_bacon.py::TestBaconParityR::test_component_estimates_match_r`) asserts per-component parity at `atol=1e-6` on the 2 fixtures without always-treated (`uniform_3groups_with_never_treated`, `two_groups_no_never_treated`) AND on the 6 timing-vs-timing rows of `always_treated_remapped` — the carve-out is narrowed to U-bucket rows only (R's `Later vs Always Treated` rows canonicalize to `treated_vs_never` and are dropped alongside the matching Python rows). The R→Python U-bucket fold-back is pinned separately by `test_always_treated_remapped_fold_back_matches_r`, which aggregates R's split `Later vs Always Treated` + `Treated vs Untreated` rows per treated cohort and asserts the combined weight + weight-averaged estimate match Python's single `treated_vs_never` cell at `atol=1e-6`. Aggregate parity (`test_twfe_coef_matches_r`, `test_weights_sum_matches_r`) is locked across all 3 fixtures.
+
- **Deviation (first-period boundary extension on always-treated remap):** Paper footnote 11 (Goodman-Bacon 2021) uses the strict inequality `t_i < 1` (units treated *before* the first observable period) for the always-treated bucket. The library applies the **inclusive** `first_treat <= min(time)` rule, which additionally folds units treated *at* the first observable period (`first_treat == min(time)`) into `U`. This is a library boundary convention, not a paper-faithful rule: such units have no untreated cell in the observed panel and so cannot contribute to any 2x2 DD as a treated cohort, so folding them into the U bucket mirrors the always-treated handling rather than dropping them silently. R `bacondecomp::bacon()` does not apply this boundary fold-back: it keeps `first_treat == min(time)` cohorts in their own bucket and emits `Later vs Always Treated` comparisons (see the **Note (R parity convention divergence on always-treated)** above for how the parity tests handle the resulting structural breakdown difference; aggregate Theorem 1 identity remains invariant). When no cohort has `first_treat == min(time)`, the inclusive library rule and the paper's strict rule select the same units and the two conventions coincide.
- **Deviation (unbalanced-panel library extension):** Unbalanced panels are accepted with a `UserWarning` ("Unbalanced panel detected. Bacon decomposition assumes balanced panels. Results may be inaccurate."). Goodman-Bacon (2021) Appendix A's proof assumes a balanced panel; under unbalance, the Theorem 1 identity holds only approximately. The decomposition still returns finite, well-defined outputs but `weights="exact"` does NOT achieve the machine-precision algebraic identity that the balanced-panel claims above describe.
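
The R-to-Python U-bucket fold-back described in the R parity note can be sketched as a small aggregation: per treated cohort, sum the weights of R's `Later vs Always Treated` and `Treated vs Untreated` rows and take the weight-averaged estimate. This is an illustration under stated assumptions, not the test's actual code: `fold_back_u_rows` is a hypothetical helper, the column names (`treated`, `type`, `estimate`, `weight`) are assumed, and every number below is made up for the example rather than taken from the committed goldens.

```python
import numpy as np
import pandas as pd

# R comparison types that correspond to Python's single U-bucket cell.
U_BUCKET_TYPES = ["Later vs Always Treated", "Treated vs Untreated"]


def fold_back_u_rows(r_rows):
    """Hypothetical helper: collapse R's split U-bucket rows per treated cohort.

    Returns one row per cohort with the combined weight and the
    weight-averaged estimate.
    """
    u = r_rows[r_rows["type"].isin(U_BUCKET_TYPES)].copy()
    u["weighted_estimate"] = u["estimate"] * u["weight"]
    folded = u.groupby("treated", as_index=False)[["weight", "weighted_estimate"]].sum()
    folded["estimate"] = folded["weighted_estimate"] / folded["weight"]
    return folded[["treated", "weight", "estimate"]]


# Made-up rows in the assumed layout; the timing-vs-timing row is ignored.
r_rows = pd.DataFrame(
    {
        "treated": [3, 3, 5, 5],
        "type": [
            "Later vs Always Treated",
            "Treated vs Untreated",
            "Treated vs Untreated",
            "Later vs Earlier Treated",
        ],
        "estimate": [1.0, 2.0, 4.0, 9.0],
        "weight": [0.1, 0.3, 0.2, 0.4],
    }
)
folded = fold_back_u_rows(r_rows)

# Made-up Python-side cells to compare against, at the tolerance used in the text.
py_cells = pd.DataFrame({"treated": [3, 5], "weight": [0.4, 0.2], "estimate": [1.75, 4.0]})
weights_match = np.allclose(folded["weight"], py_cells["weight"], atol=1e-6)
estimates_match = np.allclose(folded["estimate"], py_cells["estimate"], atol=1e-6)
```

Weight-averaging keeps each cohort's contribution to the weighted sum unchanged, which is why the aggregate is unaffected by the re-bucketing even though the per-component rows differ.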