Skip to content

Commit 04bd263

Browse files
igerberclaude
andcommitted
Survey Phase 7: CS IPW/DR covariates, repeated cross-sections, HonestDiD survey variance
Phase 7a: Remove NotImplementedError gate for IPW/DR + covariates + survey. Add DRDID panel nuisance IF corrections (PS + OR) for both survey and non-survey DR paths. Extract _safe_inv helper for matrix inversions. Phase 7d: Thread survey df through HonestDiD for t-distribution critical values. Compute full event-study VCV from influence function vectors. Add event_study_vcov to CallawaySantAnnaResults. Phase 7b: Add panel=False for repeated cross-section support in CallawaySantAnna. New _precompute_structures_rc, _compute_att_gt_rc, and three RC estimation methods (reg, ipw, dr) with covariates and survey weights. Canonical index abstraction in aggregation/bootstrap. RCS data generator in generate_staggered_data(panel=False). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 92b4d97 commit 04bd263

14 files changed

Lines changed: 2327 additions & 182 deletions

ROADMAP.md

Lines changed: 25 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -8,19 +8,42 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).
88

99
## Current Status
1010

11-
diff-diff v2.6.0 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis:
11+
diff-diff v2.7.5 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis, plus **unique survey support** — design-based variance estimation (Taylor linearization, replicate weights) integrated across all estimators. No R or Python package offers this combination:
1212

1313
- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing et al. 2024), Continuous DiD (Callaway, Goodman-Bacon & Sant'Anna 2024)
1414
- **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap, placebo-based variance
1515
- **Assumption diagnostics**: Parallel trends tests, placebo tests, Goodman-Bacon decomposition
1616
- **Sensitivity analysis**: Honest DiD (Rambachan-Roth), Pre-trends power analysis (Roth 2022)
1717
- **Study design**: Power analysis tools
1818
- **Data utilities**: Real-world datasets (Card-Krueger, Castle Doctrine, Divorce Laws, MPDTA), DGP functions for all supported designs
19+
- **Survey support**: Full `SurveyDesign` with strata, PSU, FPC, weight types, replicate weights (BRR/Fay/JK1/JKn), Taylor linearization, DEFF diagnostics, subpopulation analysis — integrated across all estimators (see [survey-roadmap.md](docs/survey-roadmap.md))
1920
- **Performance**: Optional Rust backend for accelerated computation; faster than R at scale (see [CHANGELOG.md](CHANGELOG.md) for benchmarks)
2021

2122
---
2223

23-
## Near-Term Enhancements (v2.7)
24+
## Near-Term Enhancements (v2.8)
25+
26+
### Survey Phase 7: Completing the Survey Story
27+
28+
Close the remaining gaps for practitioners using major population surveys
29+
(ACS, CPS, BRFSS, MEPS). See [survey-roadmap.md](docs/survey-roadmap.md) for
30+
full details.
31+
32+
- **CS Covariates + IPW/DR + Survey** *(High priority)*: Implement DRDID
33+
nuisance IF corrections under survey weights. Currently the recommended DR
34+
method raises `NotImplementedError` with covariates + survey. This is the
35+
most commonly needed path in applied work (Medicaid expansion, minimum wage).
36+
- **Repeated Cross-Sections** *(High priority)*: `panel=False` support for
37+
CallawaySantAnna, enabling analysis of surveys that don't track units over
38+
time (BRFSS, ACS annual, CPS monthly). Uses cross-sectional DRDID
39+
(Sant'Anna & Zhao 2020, Section 4).
40+
- **Survey-Aware DiD Tutorial** *(High priority)*: Jupyter notebook
41+
demonstrating the full workflow with realistic survey data. diff-diff is
42+
the only package (R or Python) with design-based variance for modern DiD
43+
— this makes that capability discoverable.
44+
- **HonestDiD + Survey Variance** *(Medium priority)*: Pass survey vcov
45+
(TSL or replicate) into sensitivity analysis instead of cluster-robust vcov,
46+
so sensitivity bounds respect the same variance structure as main estimates.
2447

2548
### Staggered Triple Difference (DDD)
2649

@@ -32,12 +55,6 @@ Extend the existing `TripleDifference` estimator to handle staggered adoption se
3255

3356
**Reference**: [Ortiz-Villavicencio & Sant'Anna (2025)](https://arxiv.org/abs/2505.09942). *Working Paper*. R package: `triplediff`.
3457

35-
### Enhanced Visualization
36-
37-
- Synthetic control weight visualization (bar chart of unit weights)
38-
- Treatment adoption "staircase" plot for staggered designs
39-
- Interactive plots with plotly backend option
40-
4158
---
4259

4360
## Medium-Term Enhancements

TODO.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@ Deferred items from PR reviews that were not addressed before merge.
5454
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
5555
| Replicate-weight survey df — **Resolved**. `df_survey = rank(replicate_weights) - 1` matching R's `survey::degf()`. For IF paths, `n_valid - 1` when dropped replicates reduce effective count. | `survey.py` | #238 | Resolved |
5656
| CallawaySantAnna survey: strata/PSU/FPC — **Resolved**. Aggregated SEs (overall, event study, group) use `compute_survey_if_variance()`. Bootstrap uses PSU-level multiplier weights. | `staggered.py` | #237 | Resolved |
57-
| CallawaySantAnna survey + covariates + IPW/DR: DRDID panel nuisance-estimation IF corrections not implemented. Currently gated with NotImplementedError. Regression method with covariates works (has WLS nuisance IF correction). | `staggered.py` | #233 | Medium |
57+
| CallawaySantAnna survey + covariates + IPW/DR**Resolved**. DRDID panel nuisance IF corrections (PS + OR) implemented for both survey and non-survey DR paths (Phase 7a). IPW path unblocked. | `staggered.py` | #233 | Resolved |
5858
| SyntheticDiD/TROP survey: strata/PSU/FPC — **Resolved**. Rao-Wu rescaled bootstrap implemented for both. TROP uses cross-classified pseudo-strata. Rust TROP remains pweight-only (Python fallback for full design). | `synthetic_did.py`, `trop.py` || Resolved |
5959
| EfficientDiD hausman_pretest() clustered covariance stale `n_cl`**Resolved**. Recompute `n_cl` and remap indices after `row_finite` filtering via `np.unique(return_inverse=True)`. | `efficient_did.py` | #230 | Resolved |
6060
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. With `anticipation>0`, code is arguably more conservative (excludes anticipation-contaminated periods). Either align REGISTRY with code or change code to `t < last_g` — needs design decision. | `efficient_did.py` | #230 | Low |
@@ -163,11 +163,11 @@ Spurious RuntimeWarnings ("divide by zero", "overflow", "invalid value") are emi
163163

164164
Features in R's `did` package that block porting additional tests:
165165

166-
| Feature | R tests blocked | Priority |
167-
|---------|----------------|----------|
168-
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | Medium |
169-
| Sampling/population weights | 7 tests incl. all JEL replication | Medium |
170-
| Calendar time aggregation | 1 test in test-att_gt.R | Low |
166+
| Feature | R tests blocked | Priority | Status |
167+
|---------|----------------|----------|--------|
168+
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | High | **Resolved** — Phase 7b: `panel=False` on CallawaySantAnna |
169+
| Sampling/population weights | 7 tests incl. all JEL replication | Medium | **Resolved** (Phases 1-6 + 7a: CS IPW/DR + covariates + survey) |
170+
| Calendar time aggregation | 1 test in test-att_gt.R | Low | |
171171

172172
---
173173

diff_diff/honest_did.py

Lines changed: 99 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -22,11 +22,12 @@
2222

2323
import numpy as np
2424
import pandas as pd
25-
from scipy import optimize, stats
25+
from scipy import optimize
2626

2727
from diff_diff.results import (
2828
MultiPeriodDiDResults,
2929
)
30+
from diff_diff.utils import _get_critical_value
3031

3132
# =============================================================================
3233
# Delta Restriction Classes
@@ -193,6 +194,9 @@ class HonestDiDResults:
193194
original_results: Optional[Any] = field(default=None, repr=False)
194195
# Event study bounds (optional)
195196
event_study_bounds: Optional[Dict[Any, Dict[str, float]]] = field(default=None, repr=False)
197+
# Survey design metadata (Phase 7d)
198+
survey_metadata: Optional[Any] = field(default=None, repr=False)
199+
df_survey: Optional[int] = field(default=None, repr=False)
196200

197201
def __repr__(self) -> str:
198202
sig = "" if self.ci_lb <= 0 <= self.ci_ub else "*"
@@ -534,7 +538,7 @@ def plot(
534538

535539
def _extract_event_study_params(
536540
results: Union[MultiPeriodDiDResults, Any],
537-
) -> Tuple[np.ndarray, np.ndarray, int, int, List[Any], List[Any]]:
541+
) -> Tuple[np.ndarray, np.ndarray, int, int, List[Any], List[Any], Optional[int]]:
538542
"""
539543
Extract event study parameters from results objects.
540544
@@ -557,6 +561,8 @@ def _extract_event_study_params(
557561
Pre-period identifiers.
558562
post_periods : list
559563
Post-period identifiers.
564+
df_survey : int or None
565+
Survey degrees of freedom for t-distribution inference.
560566
"""
561567
if isinstance(results, MultiPeriodDiDResults):
562568
# Extract from MultiPeriodDiD
@@ -606,7 +612,20 @@ def _extract_event_study_params(
606612
# Fallback: diagonal from SEs
607613
sigma = np.diag(np.array(ses) ** 2)
608614

609-
return beta_hat, sigma, num_pre_periods, num_post_periods, pre_periods, post_periods
615+
# Extract survey df if available
616+
df_survey = None
617+
if hasattr(results, "survey_metadata") and results.survey_metadata is not None:
618+
df_survey = getattr(results.survey_metadata, "df_survey", None)
619+
620+
return (
621+
beta_hat,
622+
sigma,
623+
num_pre_periods,
624+
num_post_periods,
625+
pre_periods,
626+
post_periods,
627+
df_survey,
628+
)
610629

611630
else:
612631
# Try CallawaySantAnnaResults
@@ -641,9 +660,29 @@ def _extract_event_study_params(
641660
ses.append(event_effects[t]["se"])
642661

643662
beta_hat = np.array(effects)
644-
sigma = np.diag(np.array(ses) ** 2)
645663

646-
return (beta_hat, sigma, len(pre_times), len(post_times), pre_times, post_times)
664+
# Use full event-study VCV if available (Phase 7d),
665+
# otherwise fall back to diagonal from SEs
666+
if hasattr(results, "event_study_vcov") and results.event_study_vcov is not None:
667+
# event_study_vcov is indexed by sorted rel_times
668+
sigma = results.event_study_vcov
669+
else:
670+
sigma = np.diag(np.array(ses) ** 2)
671+
672+
# Extract survey df
673+
df_survey = None
674+
if hasattr(results, "survey_metadata") and results.survey_metadata is not None:
675+
df_survey = getattr(results.survey_metadata, "df_survey", None)
676+
677+
return (
678+
beta_hat,
679+
sigma,
680+
len(pre_times),
681+
len(post_times),
682+
pre_times,
683+
post_times,
684+
df_survey,
685+
)
647686
except ImportError:
648687
pass
649688

@@ -860,7 +899,13 @@ def _solve_bounds_lp(
860899
return lb, ub
861900

862901

863-
def _compute_flci(lb: float, ub: float, se: float, alpha: float = 0.05) -> Tuple[float, float]:
902+
def _compute_flci(
903+
lb: float,
904+
ub: float,
905+
se: float,
906+
alpha: float = 0.05,
907+
df: Optional[int] = None,
908+
) -> Tuple[float, float]:
864909
"""
865910
Compute Fixed Length Confidence Interval (FLCI).
866911
@@ -877,6 +922,9 @@ def _compute_flci(lb: float, ub: float, se: float, alpha: float = 0.05) -> Tuple
877922
Standard error of the estimator.
878923
alpha : float
879924
Significance level.
925+
df : int, optional
926+
Degrees of freedom. If provided, uses t-distribution critical value
927+
instead of normal (for survey designs with df = n_PSU - n_strata).
880928
881929
Returns
882930
-------
@@ -895,7 +943,7 @@ def _compute_flci(lb: float, ub: float, se: float, alpha: float = 0.05) -> Tuple
895943
if not (0 < alpha < 1):
896944
raise ValueError(f"alpha must be between 0 and 1, got alpha={alpha}")
897945

898-
z = stats.norm.ppf(1 - alpha / 2)
946+
z = _get_critical_value(alpha, df)
899947
ci_lb = lb - z * se
900948
ci_ub = ub + z * se
901949
return ci_lb, ci_ub
@@ -909,6 +957,7 @@ def _compute_clf_ci(
909957
max_pre_violation: float,
910958
alpha: float = 0.05,
911959
n_draws: int = 1000,
960+
df: Optional[int] = None,
912961
) -> Tuple[float, float, float, float]:
913962
"""
914963
Compute Conditional Least Favorable (C-LF) confidence interval.
@@ -931,6 +980,8 @@ def _compute_clf_ci(
931980
Significance level.
932981
n_draws : int
933982
Number of Monte Carlo draws for conditional CI.
983+
df : int, optional
984+
Degrees of freedom for t-distribution critical value.
934985
935986
Returns
936987
-------
@@ -956,7 +1007,7 @@ def _compute_clf_ci(
9561007
ub = theta + bound
9571008

9581009
# CI with estimation uncertainty
959-
z = stats.norm.ppf(1 - alpha / 2)
1010+
z = _get_critical_value(alpha, df)
9601011
ci_lb = lb - z * se
9611012
ci_ub = ub + z * se
9621013

@@ -1086,7 +1137,7 @@ def fit(
10861137
M = M if M is not None else self.M
10871138

10881139
# Extract event study parameters
1089-
(beta_hat, sigma, num_pre, num_post, pre_periods, post_periods) = (
1140+
(beta_hat, sigma, num_pre, num_post, pre_periods, post_periods, df_survey) = (
10901141
_extract_event_study_params(results)
10911142
)
10921143

@@ -1137,22 +1188,41 @@ def fit(
11371188
# Compute bounds based on method
11381189
if self.method == "smoothness":
11391190
lb, ub, ci_lb, ci_ub = self._compute_smoothness_bounds(
1140-
beta_post, sigma_post, l_vec, num_pre, num_post, M
1191+
beta_post, sigma_post, l_vec, num_pre, num_post, M, df=df_survey
11411192
)
11421193
ci_method = "FLCI"
11431194

11441195
elif self.method == "relative_magnitude":
11451196
lb, ub, ci_lb, ci_ub = self._compute_rm_bounds(
1146-
beta_post, sigma_post, l_vec, num_pre, num_post, M, pre_periods, results
1197+
beta_post,
1198+
sigma_post,
1199+
l_vec,
1200+
num_pre,
1201+
num_post,
1202+
M,
1203+
pre_periods,
1204+
results,
1205+
df=df_survey,
11471206
)
11481207
ci_method = "C-LF"
11491208

11501209
else: # combined
11511210
lb, ub, ci_lb, ci_ub = self._compute_combined_bounds(
1152-
beta_post, sigma_post, l_vec, num_pre, num_post, M, pre_periods, results
1211+
beta_post,
1212+
sigma_post,
1213+
l_vec,
1214+
num_pre,
1215+
num_post,
1216+
M,
1217+
pre_periods,
1218+
results,
1219+
df=df_survey,
11531220
)
11541221
ci_method = "FLCI"
11551222

1223+
# Extract survey_metadata for storage on results
1224+
survey_metadata = getattr(results, "survey_metadata", None)
1225+
11561226
return HonestDiDResults(
11571227
lb=lb,
11581228
ub=ub,
@@ -1165,6 +1235,8 @@ def fit(
11651235
alpha=self.alpha,
11661236
ci_method=ci_method,
11671237
original_results=results,
1238+
survey_metadata=survey_metadata,
1239+
df_survey=df_survey,
11681240
)
11691241

11701242
def _compute_smoothness_bounds(
@@ -1175,6 +1247,7 @@ def _compute_smoothness_bounds(
11751247
num_pre: int,
11761248
num_post: int,
11771249
M: float,
1250+
df: Optional[int] = None,
11781251
) -> Tuple[float, float, float, float]:
11791252
"""Compute bounds under smoothness restriction."""
11801253
# Construct constraints
@@ -1185,7 +1258,7 @@ def _compute_smoothness_bounds(
11851258

11861259
# Compute FLCI
11871260
se = np.sqrt(l_vec @ sigma_post @ l_vec)
1188-
ci_lb, ci_ub = _compute_flci(lb, ub, se, self.alpha)
1261+
ci_lb, ci_ub = _compute_flci(lb, ub, se, self.alpha, df=df)
11891262

11901263
return lb, ub, ci_lb, ci_ub
11911264

@@ -1199,6 +1272,7 @@ def _compute_rm_bounds(
11991272
Mbar: float,
12001273
pre_periods: List,
12011274
results: Any,
1275+
df: Optional[int] = None,
12021276
) -> Tuple[float, float, float, float]:
12031277
"""Compute bounds under relative magnitudes restriction."""
12041278
# Estimate max pre-period violation from pre-trends
@@ -1209,12 +1283,18 @@ def _compute_rm_bounds(
12091283
# No pre-period violations detected - use point estimate
12101284
theta = np.dot(l_vec, beta_post)
12111285
se = np.sqrt(l_vec @ sigma_post @ l_vec)
1212-
z = stats.norm.ppf(1 - self.alpha / 2)
1286+
z = _get_critical_value(self.alpha, df)
12131287
return theta, theta, theta - z * se, theta + z * se
12141288

12151289
# Compute bounds
12161290
lb, ub, ci_lb, ci_ub = _compute_clf_ci(
1217-
beta_post, sigma_post, l_vec, Mbar, max_pre_violation, self.alpha
1291+
beta_post,
1292+
sigma_post,
1293+
l_vec,
1294+
Mbar,
1295+
max_pre_violation,
1296+
self.alpha,
1297+
df=df,
12181298
)
12191299

12201300
return lb, ub, ci_lb, ci_ub
@@ -1229,16 +1309,17 @@ def _compute_combined_bounds(
12291309
M: float,
12301310
pre_periods: List,
12311311
results: Any,
1312+
df: Optional[int] = None,
12321313
) -> Tuple[float, float, float, float]:
12331314
"""Compute bounds under combined smoothness + RM restriction."""
12341315
# Get smoothness bounds
12351316
lb_sd, ub_sd, _, _ = self._compute_smoothness_bounds(
1236-
beta_post, sigma_post, l_vec, num_pre, num_post, M
1317+
beta_post, sigma_post, l_vec, num_pre, num_post, M, df=df
12371318
)
12381319

12391320
# Get RM bounds (use M as Mbar for combined)
12401321
lb_rm, ub_rm, _, _ = self._compute_rm_bounds(
1241-
beta_post, sigma_post, l_vec, num_pre, num_post, M, pre_periods, results
1322+
beta_post, sigma_post, l_vec, num_pre, num_post, M, pre_periods, results, df=df
12421323
)
12431324

12441325
# Combined bounds are intersection
@@ -1252,7 +1333,7 @@ def _compute_combined_bounds(
12521333

12531334
# Compute FLCI on combined bounds
12541335
se = np.sqrt(l_vec @ sigma_post @ l_vec)
1255-
ci_lb, ci_ub = _compute_flci(lb, ub, se, self.alpha)
1336+
ci_lb, ci_ub = _compute_flci(lb, ub, se, self.alpha, df=df)
12561337

12571338
return lb, ub, ci_lb, ci_ub
12581339

0 commit comments

Comments
 (0)