Surface silent np.linalg.solve fallbacks across axis-A minor solver paths
Addresses findings #17, #18, #19 from the Phase 2 silent-failures audit (axis A,
all Minor). Each site previously ran np.linalg.solve against a matrix that
could be rank-deficient or near-singular with no user-facing signal.
- StaggeredTripleDifference: `_compute_did_panel` now appends a condition-number
sample to an instance tracker on LinAlgError; `fit()` emits ONE aggregate
UserWarning listing affected (g, g_c, t) cells and the max condition number
instead of silently falling back to np.linalg.lstsq per pair. Tracker resets
on repeat fit.
- EfficientDiD covariate sieve (estimate_propensity_ratio_sieve,
estimate_inverse_propensity_sieve): precondition-check the normal-equations
matrix via np.linalg.cond before solve and reject any K whose condition
number exceeds 1/sqrt(eps); partial-K skips now surface via a UserWarning
listing the skipped K values, instead of being swallowed by `continue`.
- compute_survey_vcov: check cond(X'WX) before the sandwich solve; emit
UserWarning above the 1/sqrt(eps) threshold so ill-conditioned bread
matrices don't silently produce unstable variance estimates.
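The condition-number precheck described in the bullets above can be sketched as follows. This is a minimal illustration, not the code from this commit: `COND_THRESHOLD`, `solve_if_well_conditioned`, and `fit_over_k` are hypothetical names, and the basis matrices are assumed to be plain NumPy arrays keyed by K.

```python
import warnings

import numpy as np

# Hypothetical constant mirroring the 1/sqrt(eps) rule (about 6.7e7 for float64).
COND_THRESHOLD = 1.0 / np.sqrt(np.finfo(np.float64).eps)


def solve_if_well_conditioned(gram, rhs):
    """Solve gram @ x = rhs, or return None if gram is ill-conditioned."""
    cond = np.linalg.cond(gram)
    if not np.isfinite(cond) or cond > COND_THRESHOLD:
        return None
    return np.linalg.solve(gram, rhs)


def fit_over_k(basis_by_k, y):
    """Fit every candidate K and warn once, listing the K values skipped."""
    fits, skipped = {}, []
    for k, X in basis_by_k.items():
        beta = solve_if_well_conditioned(X.T @ X, X.T @ y)
        if beta is None:
            skipped.append(k)  # recorded instead of a silent `continue`
            continue
        fits[k] = beta
    if skipped and fits:
        warnings.warn(
            f"Skipped ill-conditioned K values: {skipped}",
            UserWarning,
            stacklevel=2,
        )
    return fits, skipped
```

The all-K-skipped case is left to the caller here, since the commit says the existing fallback warning already covers it.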
Sibling sites picked up via repo-wide lstsq-fallback pattern grep (per
the pattern-check feedback memory):
- two_stage.py:1768 (TSL variance bread)
- two_stage_bootstrap.py:197 (multiplier bootstrap bread)
Both now warn before the silent lstsq fallback.
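A warn-before-fallback wrapper of the kind described for these two sites might look like the sketch below. The function name `solve_with_warned_fallback` and the warning text are assumptions for illustration; the actual code in `two_stage.py` and `two_stage_bootstrap.py` is not shown in this commit view.

```python
import warnings

import numpy as np


def solve_with_warned_fallback(bread, rhs, label="bread"):
    """Solve bread @ x = rhs, emitting a UserWarning before any lstsq fallback."""
    try:
        return np.linalg.solve(bread, rhs)
    except np.linalg.LinAlgError:
        warnings.warn(
            f"{label} matrix is singular; falling back to np.linalg.lstsq. "
            "Variance estimates may be unstable.",
            UserWarning,
            stacklevel=2,
        )
        # lstsq returns the minimum-norm least-squares solution.
        return np.linalg.lstsq(bread, rhs, rcond=None)[0]
```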
Adds 8 targeted tests across test_staggered_triple_diff.py,
test_efficient_did.py, and test_survey.py, covering collinear/ill-conditioned
triggers and happy-path negatives. REGISTRY.md notes added for each affected
estimator section. No behavioral change on well-conditioned inputs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/methodology/REGISTRY.md (9 additions, 0 deletions)
@@ -753,6 +753,7 @@ See `docs/methodology/continuous-did.md` Section 4 for full details.
 - **Balanced panel**: Short balanced panel required ("large-n, fixed-T" regime). Does not handle unbalanced panels or repeated cross-sections
 - Warn if treatment varies within units (non-absorbing treatment)
 - Warn if propensity score estimates are near boundary values
+- **Note:** Polynomial-sieve propensity fits now reject any K whose normal-equations matrix has condition number above `1/sqrt(eps)` (≈ 6.7e7). Previously, a near-singular `np.linalg.solve` could return numerically meaningless coefficients without raising. If at least one K succeeds but others were skipped via this precondition, a `UserWarning` lists the skipped K values. If every K is skipped, the existing "estimation failed for all K values" fallback warning still fires. Axis-A finding #18 in the Phase 2 silent-failures audit.

 *Estimator equation -- single treatment date (Equations 3.2, 3.5):*
@@ -1175,6 +1176,7 @@ Our implementation uses multiplier bootstrap on the GMM influence function: clus
 - **Zero-observation cohorts in group effects:** If all treated observations for a cohort have NaN `y_tilde` (excluded from estimation), that cohort's group effect is NaN with n_obs=0.
 - **Note:** Survey weights in TwoStageDiD GMM sandwich via weighted cross-products: bread uses (X'_2 W X_2)^{-1}, gamma_hat uses (X'_{10} W X_{10})^{-1}(X'_1 W X_2), per-cluster scores multiply by survey weights. PSU clustering, stratification, and FPC are fully supported in the meat matrix via `_compute_stratified_meat_from_psu_scores()`. When strata or FPC are present, the meat computation replaces `S' S` with the stratified formula `sum_h (1 - f_h) * (n_h/(n_h-1)) * centered_h' centered_h`. Strata also enters survey df (n_PSU - n_strata) for t-distribution inference. Bootstrap + survey supported (Phase 6) via PSU-level multiplier weights.
 - **Note:** Both the iterative FE solver (`_iterative_fe`, Stage 1) and the iterative alternating-projection demeaning helper (`_iterative_demean`, used in covariate residualization) emit `UserWarning` when `max_iter` exhausts without reaching `tol`, via `diff_diff.utils.warn_if_not_converged`. Silent return of the current iterate was classified as a silent failure under the Phase 2 audit and replaced with an explicit signal to match the logistic/Poisson IRLS pattern in `linalg.py`.
+- **Note:** When the Stage-2 bread `X'_2 W X_2` is singular, both the analytical TSL variance (`two_stage.py`) and the multiplier-bootstrap bread (`two_stage_bootstrap.py`) now emit a `UserWarning` before falling back to `np.linalg.lstsq`. Previously this fallback was silent. Sibling of axis-A finding #17 in the Phase 2 silent-failures audit; surfaced by the repo-wide lstsq-fallback pattern grep that accompanied the StaggeredTripleDifference fix.
 - **Note:** The GMM sandwich and bootstrap paths both use `scipy.sparse.linalg.factorized` for the Stage 1 normal-equations solve `(X'_{10} W X_{10}) gamma = X'_1 W X_2` and fall back to dense `lstsq` when the sparse factorization raises `RuntimeError` on a near-singular matrix. Both fallback sites emit a `UserWarning` (silent-failure audit axis C) so callers know SE estimates came from the degraded path rather than the fast sparse path.

 **Reference implementation(s):**
@@ -1695,6 +1697,7 @@ has no additional effect.
 - **Note:** `pscore_fallback` default changed from unconditional to error.
   Set `pscore_fallback="unconditional"` for legacy behavior.
 - Warns on singular GMM covariance matrix (falls back to pseudoinverse)
+- **Note:** Rank-deficient X'WX in the per-pair outcome-regression influence-function step now emits ONE aggregate `UserWarning` at `fit()` time (counting affected (g, g_c, t) cells and reporting the max condition number), instead of silently falling back to `np.linalg.lstsq`. Axis-A finding #17 in the Phase 2 silent-failures audit.
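
The aggregate-warning pattern in the note above (collect per-cell fallbacks, warn once at `fit()` time, reset on repeat fit) can be sketched roughly as below. `PairwiseEstimator`, `_fallbacks`, and `_solve_cell` are hypothetical stand-ins, not the actual `StaggeredTripleDifference` internals, and each cell is assumed to arrive as a precomputed `(X'WX, X'Wy)` pair.

```python
import warnings

import numpy as np


class PairwiseEstimator:
    """Hypothetical estimator that solves one linear system per (g, g_c, t) cell."""

    def fit(self, cells):
        # Reset the tracker so a repeat fit does not report stale cells.
        self._fallbacks = []
        results = {}
        for key, (xtwx, xtwy) in cells.items():
            results[key] = self._solve_cell(key, xtwx, xtwy)
        if self._fallbacks:
            keys = [k for k, _ in self._fallbacks]
            max_cond = max(c for _, c in self._fallbacks)
            # One aggregate warning instead of one per cell.
            warnings.warn(
                f"Rank-deficient X'WX in {len(keys)} cell(s) {keys}; used "
                f"lstsq fallback (max condition number {max_cond:.3g}).",
                UserWarning,
                stacklevel=2,
            )
        return results

    def _solve_cell(self, key, xtwx, xtwy):
        try:
            return np.linalg.solve(xtwx, xtwy)
        except np.linalg.LinAlgError:
            # Record a condition-number sample, then take the degraded path.
            self._fallbacks.append((key, float(np.linalg.cond(xtwx))))
            return np.linalg.lstsq(xtwx, xtwy, rcond=None)[0]
```

Aggregating at `fit()` time keeps the warning count independent of the number of affected cells, which matters when many (g, g_c, t) pairs share the same collinearity.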