Fix universal base period skip and comparison-cohort REGISTRY docs

igerber · claude · igerber · commit 1a309c44cb93 · 2026-03-30T09:24:34.000-04:00
- Skip the reference period (t == g - 1 - anticipation) during (g, t)
  estimation when base_period="universal", matching CallawaySantAnna
  behavior. The event-study mixin injects a synthetic reference row
  with effect=0.
- Update the REGISTRY comparison-group rule to match the code: the
  notyettreated threshold is g_c > max(t, base_period) + anticipation.
- Add TODO.md entries for deferred P2 items (CSV fixtures, covariate
  parity, group-effect WIF).
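
For reviewers, the two rules above can be sketched in isolation. This is a minimal illustration, not the library code: the helper names `valid_periods` and `comparison_cohorts`, the `NEVER_ENABLED` sentinel (`math.inf`), and the `"varying"` default are assumptions made for the example. The cohort filter applies only the threshold rule and expects the caller to pass candidate cohorts that already exclude the treated cohort.

```python
import math

# Hypothetical sentinel for the never-enabled cohort (S=0); the real code
# may encode this differently.
NEVER_ENABLED = math.inf


def valid_periods(g, time_periods, base_period="varying", anticipation=0):
    """Return the periods t at which cohort g gets a (g, t) estimate.

    In universal mode the reference period g - 1 - anticipation is skipped,
    because its effect is fixed at 0 by construction.
    """
    if base_period == "universal":
        reference = g - 1 - anticipation
        return [t for t in time_periods if t != reference]
    return list(time_periods)


def comparison_cohorts(candidates, t, base_period_val,
                       control_group="notyettreated", anticipation=0):
    """Filter candidate comparison cohorts for one (g, t) cell."""
    if control_group == "nevertreated":
        # Only the never-enabled cohort qualifies.
        return [c for c in candidates if c == NEVER_ENABLED]
    # notyettreated: cohort must start strictly after the later of t and
    # the base period, shifted by the anticipation window.
    threshold = max(t, base_period_val) + anticipation
    return [c for c in candidates if c > threshold]  # inf always passes


periods = [1, 2, 3, 4, 5]
print(valid_periods(4, periods, base_period="universal"))
print(comparison_cohorts([3, 4, 5, 6, NEVER_ENABLED], t=4,
                         base_period_val=3, anticipation=1))
```

With `g=4` and no anticipation, period 3 is dropped in universal mode. With `t=4`, base period 3, and `anticipation=1`, the threshold is 5, so cohort 5 is excluded and only cohort 6 and the never-enabled cohort remain.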

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/TODO.md b/TODO.md
@@ -59,6 +59,9 @@ Deferred items from PR reviews that were not addressed before merge.
 | TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. | `prep_dgp.py`, `power.py` | #208 | Low |
 | Survey design resolution/collapse patterns inconsistent across panel estimators — extract shared helpers for panel-to-unit collapse, post-filter re-resolution, metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
 | TROP: `fit()` and `_fit_global()` share ~150 lines of near-identical data setup. Extract shared helpers to eliminate cross-file sync risk. | `trop.py`, `trop_global.py`, `trop_local.py` | — | Low |
+| StaggeredTripleDifference R cross-validation: CSV fixtures not committed (gitignored); tests skip without local R + triplediff. Commit fixtures or generate deterministically. | `tests/test_methodology_staggered_triple_diff.py` | #245 | Medium |
+| StaggeredTripleDifference R parity: benchmark only tests no-covariate path (xformla=~1). Add covariate-adjusted scenarios and aggregation SE parity assertions. | `benchmarks/R/benchmark_staggered_triplediff.R` | #245 | Medium |
+| StaggeredTripleDifference: per-cohort group-effect SEs include WIF (conservative vs R's wif=NULL). Documented in REGISTRY. Could override mixin for exact R match. | `staggered_triple_diff.py` | #245 | Low |
 
 #### Performance
 
diff --git a/diff_diff/staggered_triple_diff.py b/diff_diff/staggered_triple_diff.py
@@ -272,7 +272,16 @@ def fit(
         gmm_weights_store: Dict[Tuple, Dict] = {}
 
         for g in treatment_groups:
-            for t in time_periods:
+            # In universal mode, skip the reference period (t == g-1-anticipation)
+            # so it's omitted from GT estimation. The event-study mixin injects
+            # a synthetic reference row with effect=0, matching CS behavior.
+            if self.base_period == "universal":
+                universal_base = g - 1 - self.anticipation
+                valid_periods = [t for t in time_periods if t != universal_base]
+            else:
+                valid_periods = time_periods
+
+            for t in valid_periods:
                 base_period_val = self._get_base_period(g, t)
                 if base_period_val is None:
                     continue
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -1294,7 +1294,10 @@ This sign convention matches both the paper's Equation 4.1 and the existing
 `TripleDifference` decomposition (DDD = DiD_3 + DiD_2 - DiD_1 with subgroups
 4=G1P1, 3=G1P0, 2=G0P1, 1=G0P0).
 
-Valid comparison groups: `G_c = {g_c : g_c > max(g, t)}`, including never-enabled (S=0).
+Valid comparison groups: for `control_group="nevertreated"`, only the never-enabled
+cohort (S=0). For `control_group="notyettreated"`, `G_c = {g_c : g_c > max(t, base_period)
++ anticipation}`, plus never-enabled. The anticipation-adjusted threshold ensures cohorts
+within the anticipation window are excluded from controls.
 
 *With covariates / doubly robust (DR, recommended):*
 
@@ -1396,7 +1399,8 @@ confidence bands (sup-t) for event study.
 - [x] Panel data with (unit, time, enabling-group S, eligibility Q, outcome Y)
 - [x] Three comparison sub-groups per (g, g_c): (S=g, Q=0), (S=g_c, Q=1), (S=g_c, Q=0)
 - [x] Individual comparison cohorts, never pooled — combined via GMM weights
-- [x] Comparison groups satisfy g_c > max{g, t}
+- [x] Comparison groups satisfy g_c > max(t, base_period) + anticipation (notyettreated)
+  or g_c = never-enabled only (nevertreated)
 - [x] Doubly robust: consistent if either propensity or outcome model correct (per component)
 - [x] GMM-optimal weighting via closed-form inverse-variance formula
 - [x] Event-study aggregation with cohort-share weights (via CS mixin)