Update docs and docstrings for PR #165 review feedback

igerber · claude · igerber · commit e97ca9d8902b · 2026-02-17T12:55:57.000-05:00
Address 5 AI review items: update Methodology Registry for bootstrap_weights
and SunAbraham param removal, fix bootstrap result docstrings, add memory
guidance for dense .toarray(), and document n_bootstrap minimum.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/diff_diff/imputation_results.py b/diff_diff/imputation_results.py
@@ -33,7 +33,7 @@ class ImputationBootstrapResults:
     n_bootstrap : int
         Number of bootstrap iterations.
     weight_type : str
-        Type of bootstrap weights (currently "rademacher" only).
+        Type of bootstrap weights: "rademacher", "mammen", or "webb".
     alpha : float
         Significance level used for confidence intervals.
     overall_att_se : float
diff --git a/diff_diff/trop.py b/diff_diff/trop.py
@@ -93,7 +93,7 @@ class TROP:
     alpha : float, default=0.05
         Significance level for confidence intervals.
     n_bootstrap : int, default=200
-        Number of bootstrap replications for variance estimation.
+        Number of bootstrap replications for variance estimation. Must be >= 2.
     seed : int, optional
         Random seed for reproducibility.
 
diff --git a/diff_diff/two_stage.py b/diff_diff/two_stage.py
@@ -1225,6 +1225,8 @@ def _compute_gmm_variance(
         # Convert sparse to dense once for efficient cluster aggregation.
         # Total memory touched is identical to per-column .getcol().toarray();
         # only peak allocation differs (full matrix vs one column at a time).
+        # For panels with >100K FE columns, consider reverting to per-column
+        # .getcol() to limit peak memory.
         weighted_X10_dense = weighted_X10.toarray()
         c_by_cluster = np.zeros((G, p))
         for j_col in range(p):
diff --git a/diff_diff/two_stage_bootstrap.py b/diff_diff/two_stage_bootstrap.py
@@ -106,7 +106,8 @@ def _compute_cluster_S_scores(
         unique_clusters, cluster_indices = np.unique(cluster_ids, return_inverse=True)
         G = len(unique_clusters)
 
-        # Convert sparse to dense once (see _compute_gmm_variance for memory note)
+        # Convert sparse to dense once (see _compute_gmm_variance for memory note).
+        # For panels with >100K FE columns, consider per-column .getcol() instead.
         weighted_X10_dense = weighted_X10.toarray()
         c_by_cluster = np.zeros((G, p))
         for j_col in range(p):
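
Review note on the memory guidance in the two hunks above: a minimal sketch of the trade-off between a single `.toarray()` call and per-column `.getcol()` extraction. The function names (`cluster_sums_dense`, `cluster_sums_per_column`) and the use of `np.bincount` for the aggregation are assumptions made for illustration; the body of the `for j_col in range(p)` loop is not shown in this diff, so this is not the library's actual code.

```python
import numpy as np
from scipy import sparse


def cluster_sums_dense(weighted_X, cluster_indices, G):
    """Densify once, then sum rows within each cluster, column by column.

    Peak extra memory: one dense (n, p) array.
    """
    dense = sparse.csc_matrix(weighted_X).toarray()
    p = dense.shape[1]
    out = np.zeros((G, p))
    for j in range(p):
        out[:, j] = np.bincount(cluster_indices, weights=dense[:, j], minlength=G)
    return out


def cluster_sums_per_column(weighted_X, cluster_indices, G):
    """Same result, but densify one column at a time.

    Peak extra memory: one dense length-n vector.
    """
    X = sparse.csc_matrix(weighted_X)  # CSC makes column extraction cheap
    p = X.shape[1]
    out = np.zeros((G, p))
    for j in range(p):
        col = X.getcol(j).toarray().ravel()
        out[:, j] = np.bincount(cluster_indices, weights=col, minlength=G)
    return out
```

Both variants return the same `(G, p)` matrix; only the peak allocation differs, which is the point of the added comment. The 100K-column threshold quoted in the diff is the author's guidance and is not derived here.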
diff --git a/diff_diff/two_stage_results.py b/diff_diff/two_stage_results.py
@@ -34,7 +34,7 @@ class TwoStageBootstrapResults:
     n_bootstrap : int
         Number of bootstrap iterations.
     weight_type : str
-        Type of bootstrap weights (currently "rademacher" only).
+        Type of bootstrap weights: "rademacher", "mammen", or "webb".
     alpha : float
         Significance level used for confidence intervals.
     overall_att_se : float
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -427,7 +427,7 @@ where weights ŵ_{g,e} = n_{g,e} / Σ_g n_{g,e} (sample share of cohort g at eve
 - Extrapolation beyond observed event-times: not estimated
 - Event-time range: no artificial cap (estimates all available relative times, matching R's `fixest::sunab()`)
 - No post-treatment effects: returns `(NaN, NaN)` for overall ATT/SE; all inference fields (t_stat, p_value, conf_int) propagate NaN via `np.isfinite()` guards
-- `min_pre_periods`/`min_post_periods` parameters: deprecated (accepted but ignored, emit `FutureWarning`)
+- `min_pre_periods`/`min_post_periods` parameters: removed (previously deprecated with `FutureWarning`; callers passing these will now get `TypeError`)
 - Variance fallback: when full weight vector cannot be constructed for overall ATT SE, uses simplified variance (ignores covariances between periods) with `UserWarning`
 - Rank-deficient design matrix (covariate collinearity):
   - Detection: Pivoted QR decomposition with tolerance `1e-07` (R's `qr()` default)
@@ -545,7 +545,7 @@ Y_it = alpha_i + beta_t [+ X'_it * delta] + W'_it * gamma + epsilon_it
 - **treatment_effects DataFrame weights:** `weight` column uses `1/n_valid` for finite tau_hat and 0 for NaN tau_hat, consistent with the ATT estimand.
 - **Rank-deficient covariates in variance:** Covariates with NaN coefficients (dropped for rank deficiency in Step 1) are excluded from the variance design matrices `A_0`/`A_1`. Only covariates with finite coefficients participate in the `v_it` projection.
 - **Sparse variance solver:** `_compute_v_untreated_with_covariates` uses `scipy.sparse.linalg.spsolve` to solve `(A_0'A_0) z = A_1'w` without densifying the normal equations matrix. Falls back to dense `lstsq` if the sparse solver fails.
-- **Bootstrap inference:** Uses multiplier bootstrap on the Theorem 3 influence function: `psi_i = sum_t v_it * epsilon_tilde_it`. Cluster-level psi sums are pre-computed for each aggregation target (overall, per-horizon, per-group), then perturbed with Rademacher weights. This is a library extension (not in the paper) consistent with CallawaySantAnna/SunAbraham bootstrap patterns.
+- **Bootstrap inference:** Uses multiplier bootstrap on the Theorem 3 influence function: `psi_i = sum_t v_it * epsilon_tilde_it`. Cluster-level psi sums are pre-computed for each aggregation target (overall, per-horizon, per-group), then perturbed with multiplier weights (Rademacher by default; configurable via `bootstrap_weights` parameter to use Mammen or Webb weights, matching CallawaySantAnna). This is a library extension (not in the paper) consistent with CallawaySantAnna/SunAbraham bootstrap patterns.
 - **Auxiliary residuals (Equation 8):** Uses v_it-weighted tau_tilde_g formula: `tau_tilde_g = sum(v_it * tau_hat_it) / sum(v_it)` within each partition group. Zero-weight groups (common in event-study SE computation) fall back to unweighted mean.
 
 **Reference implementation(s):**
@@ -610,7 +610,7 @@ where `psi_i` is the stacked influence function for unit i across all its observ
 
 *Bootstrap:*
 
-Our implementation uses multiplier bootstrap on the GMM influence function: cluster-level `psi` sums are pre-computed, then perturbed with Rademacher weights. The R `did2s` package defaults to block bootstrap (resampling clusters with replacement). Both approaches are asymptotically valid; the multiplier bootstrap is computationally cheaper and consistent with the CallawaySantAnna/ImputationDiD bootstrap patterns in this library.
+Our implementation uses multiplier bootstrap on the GMM influence function: cluster-level `psi` sums are pre-computed, then perturbed with multiplier weights (Rademacher by default; configurable via `bootstrap_weights` parameter to use Mammen or Webb weights, matching CallawaySantAnna). The R `did2s` package defaults to block bootstrap (resampling clusters with replacement). Both approaches are asymptotically valid; the multiplier bootstrap is computationally cheaper and consistent with the CallawaySantAnna/ImputationDiD bootstrap patterns in this library.
 
 *Edge cases:*
 - **Always-treated units:** Units treated in all observed periods have no untreated observations for Stage 1 FE estimation. These are excluded with a warning listing the affected unit IDs. Their treated observations do NOT contribute to Stage 2.
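
Review note on the `bootstrap_weights` wording in the REGISTRY.md hunks above: a minimal sketch of the three multiplier-weight distributions and of perturbing pre-computed cluster-level `psi` sums. The distributions are the standard textbook definitions (Rademacher two-point, Mammen two-point, Webb six-point), each with mean 0 and variance 1. The function names, the signature, and the scaling of the standard error are assumptions for illustration, not the library's API.

```python
import numpy as np


def draw_multiplier_weights(n_bootstrap, n_clusters, weight_type="rademacher", seed=None):
    """Draw an (n_bootstrap, n_clusters) matrix of multiplier weights (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    size = (n_bootstrap, n_clusters)
    if weight_type == "rademacher":
        # +1 or -1 with probability 1/2 each
        return rng.choice(np.array([-1.0, 1.0]), size=size)
    if weight_type == "mammen":
        # Two-point distribution with mean 0, variance 1, third moment 1
        s5 = np.sqrt(5.0)
        values = np.array([-(s5 - 1.0) / 2.0, (s5 + 1.0) / 2.0])
        probs = np.array([(s5 + 1.0) / (2.0 * s5), (s5 - 1.0) / (2.0 * s5)])
        return rng.choice(values, size=size, p=probs)
    if weight_type == "webb":
        # Six equally likely points: +/-sqrt(3/2), +/-1, +/-sqrt(1/2)
        values = np.array(
            [-np.sqrt(1.5), -1.0, -np.sqrt(0.5), np.sqrt(0.5), 1.0, np.sqrt(1.5)]
        )
        return rng.choice(values, size=size)
    raise ValueError(f"unknown weight_type: {weight_type!r}")


def multiplier_bootstrap_se(psi_cluster, n_bootstrap=200, weight_type="rademacher", seed=None):
    """Perturb cluster-level psi sums and return the std. dev. of the draws.

    Assumes psi_cluster is already scaled so that its plain sum is the
    centered estimator; the real normalization may differ.
    """
    if n_bootstrap < 2:
        # The sample standard deviation with ddof=1 needs at least two draws
        raise ValueError("n_bootstrap must be >= 2")
    psi_cluster = np.asarray(psi_cluster, dtype=float)
    W = draw_multiplier_weights(n_bootstrap, psi_cluster.shape[0], weight_type, seed)
    draws = W @ psi_cluster  # one perturbed statistic per bootstrap replication
    return float(np.std(draws, ddof=1))
```

Because the `psi` sums are computed once and only the weights are redrawn, each replication costs a single matrix-vector product, which is why the registry text describes the multiplier bootstrap as cheaper than resampling clusters.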