Fix CI review R2: zero-weight row drops, weighted covariates, df_survey propagation

igerber · claude · igerber · commit cf17cb2fdeb2 · 2026-04-16T07:04:02.000-04:00
- P1-A: Drop zero-weight rows entirely from cell DataFrame (instead
  of just zeroing n_gt) so ragged-panel validator doesn't see them
- P1-B: Survey-weighted covariate aggregation in DID^X path
  (sum(w*x)/sum(w) instead of unweighted mean)
- P1-C: Thread _df_survey to all remaining safe_inference() calls:
  bootstrap t-stats, normalized effects, cost-benefit delta, placebo
  bootstrap t-stats
- P3: Fix REGISTRY overview paragraph (was still saying survey deferred)
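
For reviewers, here is a minimal pure-Python sketch of the cell-level weighted mean named in P1-B, sum(w*x)/sum(w), combined with the P1-A idea of leaving out cells whose total weight is not positive. It is not the library code: the function name `weighted_cell_means` and the tuple layout are assumptions made for illustration. The patch itself works on pandas DataFrames, and in the covariate path it guards the division by replacing a zero weight sum with 1 rather than dropping the cell.

```python
from collections import defaultdict


def weighted_cell_means(rows):
    """Illustrative helper (hypothetical, not part of diff_diff).

    rows: iterable of (group, time, weight, x) tuples.
    Returns {(group, time): sum(w*x) / sum(w)} and omits any cell
    whose total weight is not positive.
    """
    w_sum = defaultdict(float)
    wx_sum = defaultdict(float)
    for g, t, w, x in rows:
        w_sum[(g, t)] += w
        wx_sum[(g, t)] += w * x
    # Keep only cells with positive total weight, so no division by zero.
    return {cell: wx_sum[cell] / w_sum[cell] for cell in w_sum if w_sum[cell] > 0}


rows = [
    ("a", 1, 1.0, 10.0),
    ("a", 1, 3.0, 20.0),
    ("b", 1, 0.0, 5.0),  # zero-weight cell: omitted from the result
]
means = weighted_cell_means(rows)
```

With these rows, cell ("a", 1) averages to (1*10 + 3*20) / 4, and cell ("b", 1) does not appear in the output at all.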

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/diff_diff/chaisemartin_dhaultfoeuille.py b/diff_diff/chaisemartin_dhaultfoeuille.py
@@ -227,12 +227,11 @@ def _validate_and_aggregate_to_cells(
         )
         cell["y_gt"] = cell["_wy_sum"] / cell["w_gt"]
         cell = cell.drop(columns=["_wy_sum"])
-        # Zero-weight cells: treat as absent so downstream presence
-        # logic (N_mat > 0) correctly excludes them.
+        # Zero-weight cells: drop entirely so downstream validators
+        # (ragged-panel, baseline requirement) don't see them.
         zero_w_mask = cell["w_gt"] <= 0
         if zero_w_mask.any():
-            cell.loc[zero_w_mask, "n_gt"] = 0
-            cell.loc[zero_w_mask, "y_gt"] = 0.0
+            cell = cell[~zero_w_mask].reset_index(drop=True)
         df.drop(columns=["_w_", "_wy_"], inplace=True)
     else:
         cell = df.groupby([group, time], as_index=False).agg(
@@ -744,7 +743,23 @@ def fit(
             # Use the coerced copy joined with group/time from original data.
             x_agg_input = data[[group, time]].copy()
             x_agg_input[controls] = data_controls[controls].values
-            x_cell_agg = x_agg_input.groupby([group, time], as_index=False)[controls].mean()
+            if survey_weights is not None:
+                # Survey-weighted covariate cell means: sum(w*x)/sum(w)
+                x_agg_input["_w_"] = survey_weights
+                for c in controls:
+                    x_agg_input[f"_wx_{c}"] = survey_weights * x_agg_input[c].values
+                wx_cols = [f"_wx_{c}" for c in controls]
+                g_agg = x_agg_input.groupby([group, time], as_index=False).agg(
+                    {**{wc: "sum" for wc in wx_cols}, "_w_": "sum"}
+                )
+                for c in controls:
+                    w_safe = g_agg["_w_"].replace(0, 1)
+                    g_agg[c] = g_agg[f"_wx_{c}"] / w_safe
+                x_cell_agg = g_agg[[group, time] + controls]
+            else:
+                x_cell_agg = x_agg_input.groupby(
+                    [group, time], as_index=False
+                )[controls].mean()
             cell = cell.merge(x_cell_agg, on=[group, time], how="left")
 
         # ------------------------------------------------------------------
@@ -1959,17 +1974,17 @@ def fit(
                 overall_se = br.overall_se
                 overall_p = br.overall_p_value if br.overall_p_value is not None else np.nan
                 overall_ci = br.overall_ci if br.overall_ci is not None else (np.nan, np.nan)
-                overall_t = safe_inference(overall_att, overall_se, alpha=self.alpha, df=None)[0]
+                overall_t = safe_inference(overall_att, overall_se, alpha=self.alpha, df=_df_survey)[0]
             if joiners_available and br.joiners_se is not None and np.isfinite(br.joiners_se):
                 joiners_se = br.joiners_se
                 joiners_p = br.joiners_p_value if br.joiners_p_value is not None else np.nan
                 joiners_ci = br.joiners_ci if br.joiners_ci is not None else (np.nan, np.nan)
-                joiners_t = safe_inference(joiners_att, joiners_se, alpha=self.alpha, df=None)[0]
+                joiners_t = safe_inference(joiners_att, joiners_se, alpha=self.alpha, df=_df_survey)[0]
             if leavers_available and br.leavers_se is not None and np.isfinite(br.leavers_se):
                 leavers_se = br.leavers_se
                 leavers_p = br.leavers_p_value if br.leavers_p_value is not None else np.nan
                 leavers_ci = br.leavers_ci if br.leavers_ci is not None else (np.nan, np.nan)
-                leavers_t = safe_inference(leavers_att, leavers_se, alpha=self.alpha, df=None)[0]
+                leavers_t = safe_inference(leavers_att, leavers_se, alpha=self.alpha, df=_df_survey)[0]
 
         # ------------------------------------------------------------------
         # Step 20: Build the results dataclass
@@ -2216,7 +2231,7 @@ def fit(
                             bs_ci if bs_ci is not None else (np.nan, np.nan)
                         )
                         placebo_event_study_dict[neg_key]["t_stat"] = safe_inference(
-                            eff, bs_se, alpha=self.alpha, df=None
+                            eff, bs_se, alpha=self.alpha, df=_df_survey
                         )[0]
 
         # Phase 2: build normalized_effects with SE
@@ -2229,7 +2244,7 @@ def fit(
                 # SE via delta method: SE(DID^n_l) = SE(DID_l) / delta^D_l
                 se_did_l = multi_horizon_se.get(l_h, float("nan"))
                 se_norm = se_did_l / denom if np.isfinite(denom) and denom > 0 else float("nan")
-                t_n, p_n, ci_n = safe_inference(eff, se_norm, alpha=self.alpha, df=None)
+                t_n, p_n, ci_n = safe_inference(eff, se_norm, alpha=self.alpha, df=_df_survey)
                 normalized_effects_out[l_h] = {
                     "effect": eff,
                     "se": se_norm,
@@ -2288,7 +2303,7 @@ def fit(
                 else:
                     running_se_ub = float("nan")
                 cum_t, cum_p, cum_ci = safe_inference(
-                    cum_effect, running_se_ub, alpha=self.alpha, df=None
+                    cum_effect, running_se_ub, alpha=self.alpha, df=_df_survey
                 )
                 cumulated[l_h] = {
                     "effect": cum_effect,
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -463,7 +463,7 @@ The multiplier bootstrap uses random weights w_i with E[w]=0 and Var(w)=1:
 - [de Chaisemartin, C. & D'Haultfœuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects. *American Economic Review*, 110(9), 2964-2996.](https://doi.org/10.1257/aer.20181169)
 - [de Chaisemartin, C. & D'Haultfœuille, X. (2022, revised 2024). Difference-in-Differences Estimators of Intertemporal Treatment Effects. NBER Working Paper 29873.](https://www.nber.org/papers/w29873) — Web Appendix Section 3.7.3 contains the cohort-recentered plug-in variance formula implemented here.
 
-**Phase 1-2 scope:** Ships the contemporaneous-switch estimator `DID_M` (= `DID_1` at horizon `l = 1`) from the AER 2020 paper **plus** the full multi-horizon event study `DID_l` for `l = 1..L_max` from the dynamic companion paper. Phase 2 adds: per-group `DID_{g,l}` building block (Equation 3), dynamic placebos `DID^{pl}_l`, normalized estimator `DID^n_l`, cost-benefit aggregate `delta`, sup-t simultaneous confidence bands, and `plot_event_study()` integration. Phase 3 adds covariate adjustment (`DID^X`), group-specific linear trends (`DID^{fd}`), state-set-specific trends, and HonestDiD integration. Survey design support is deferred to a separate effort after all phases ship. **This is the only modern staggered estimator in the library that handles non-absorbing (reversible) treatments** - treatment can switch on AND off over time, making it the natural fit for marketing campaigns, seasonal promotions, on/off policy cycles.
+**Phase 1-2 scope:** Ships the contemporaneous-switch estimator `DID_M` (= `DID_1` at horizon `l = 1`) from the AER 2020 paper **plus** the full multi-horizon event study `DID_l` for `l = 1..L_max` from the dynamic companion paper. Phase 2 adds: per-group `DID_{g,l}` building block (Equation 3), dynamic placebos `DID^{pl}_l`, normalized estimator `DID^n_l`, cost-benefit aggregate `delta`, sup-t simultaneous confidence bands, and `plot_event_study()` integration. Phase 3 adds covariate adjustment (`DID^X`), group-specific linear trends (`DID^{fd}`), state-set-specific trends, and HonestDiD integration. Survey design supports pweight with strata/PSU/FPC via Taylor Series Linearization; replicate weights and PSU-level bootstrap are deferred. **This is the only modern staggered estimator in the library that handles non-absorbing (reversible) treatments** - treatment can switch on AND off over time, making it the natural fit for marketing campaigns, seasonal promotions, on/off policy cycles.
 
 **Key implementation requirements:**