Merge pull request #263 from igerber/survey-docs-update

igerber · web-flow · commit c184e147cf94 · 2026-04-04T12:14:45.000-04:00
Update survey documentation: compatibility matrix, roadmap, deferred work
diff --git a/ROADMAP.md b/ROADMAP.md
@@ -8,15 +8,15 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).
 
 ## Current Status
 
-diff-diff v2.7.5 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis, plus **unique survey support** — design-based variance estimation (Taylor linearization, replicate weights) integrated across all estimators. No R or Python package offers this combination:
+diff-diff v2.8.4 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis, plus **unique survey support** — all estimators accept survey weights, with design-based variance estimation varying by estimator. No R or Python package offers this combination:
 
-- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing et al. 2024), Continuous DiD (Callaway, Goodman-Bacon & Sant'Anna 2024)
+- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), Staggered Triple Difference (Ortiz-Villavicencio & Sant'Anna 2025), TROP, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing et al. 2024), Continuous DiD (Callaway, Goodman-Bacon & Sant'Anna 2024)
 - **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap, placebo-based variance
 - **Assumption diagnostics**: Parallel trends tests, placebo tests, Goodman-Bacon decomposition
 - **Sensitivity analysis**: Honest DiD (Rambachan-Roth), Pre-trends power analysis (Roth 2022)
 - **Study design**: Power analysis tools
 - **Data utilities**: Real-world datasets (Card-Krueger, Castle Doctrine, Divorce Laws, MPDTA), DGP functions for all supported designs
-- **Survey support**: Full `SurveyDesign` with strata, PSU, FPC, weight types, replicate weights (BRR/Fay/JK1/JKn), Taylor linearization, DEFF diagnostics, subpopulation analysis — integrated across all estimators (see [survey-roadmap.md](docs/survey-roadmap.md))
+- **Survey support**: `SurveyDesign` with strata, PSU, FPC, weight types, DEFF diagnostics, subpopulation analysis. All 15 estimators accept survey weights; design-based variance estimation (TSL, replicate weights, survey-aware bootstrap) varies by estimator. Replicate weights (BRR/Fay/JK1/JKn/SDR) supported for 12 of 15; `BaconDecomposition` is diagnostic-only. See [choosing_estimator.rst](docs/choosing_estimator.rst#survey-design-support) for the full compatibility matrix.
 - **Performance**: Optional Rust backend for accelerated computation; faster than R at scale (see [CHANGELOG.md](CHANGELOG.md) for benchmarks)
 
 ---
@@ -34,19 +34,20 @@ full details.
 - **Repeated Cross-Sections** *(Implemented)*: `panel=False` support for
   CallawaySantAnna using cross-sectional DRDID (Sant'Anna & Zhao 2020,
   Section 4). Supports BRFSS, ACS annual, CPS monthly.
-- **Survey-Aware DiD Tutorial** *(Open)*: Jupyter notebook demonstrating
+- **Survey-Aware DiD Tutorial** *(Implemented)*: Jupyter notebook demonstrating
   the full workflow with realistic survey data.
 - **HonestDiD + Survey Variance** *(Implemented)*: Survey df and full
   event-study VCV propagated to sensitivity analysis, with bootstrap/replicate
   diagonal fallback.
 
-### Staggered Triple Difference (DDD)
+### Staggered Triple Difference (DDD) *(Implemented)*
 
-Extend the existing `TripleDifference` estimator to handle staggered adoption settings.
+`StaggeredTripleDifference` estimator for staggered adoption DDD settings.
 
 - Group-time ATT(g,t) for DDD designs with variation in treatment timing
 - Event study aggregation and pre-treatment placebo effects
 - Multiplier bootstrap for valid inference in staggered settings
+- Full survey support (pweight, strata/PSU/FPC, replicate weights)
 
 **Reference**: [Ortiz-Villavicencio & Sant'Anna (2025)](https://arxiv.org/abs/2505.09942). "Better Understanding Triple Differences Estimators." *Working Paper*. R package: `triplediff`.
 
diff --git a/TODO.md b/TODO.md
@@ -15,6 +15,10 @@ Current limitations that may affect users:
 | MultiPeriodDiD wild bootstrap not supported | `estimators.py:778-784` | Low | Edge case |
 | `predict()` raises NotImplementedError | `estimators.py:567-588` | Low | Rarely needed |
 
+For survey-specific limitations (NotImplementedError paths), see the
+[consolidated deferred list](docs/survey-roadmap.md#deferred-work-consolidated)
+in survey-roadmap.md.
+
 ## Code Quality
 
 ### Large Module Files
diff --git a/diff_diff/survey.py b/diff_diff/survey.py
@@ -1087,8 +1087,7 @@ def _resolve_survey_for_fit(survey_design, data, inference_mode="analytical"):
     if inference_mode == "wild_bootstrap":
         raise NotImplementedError(
             "Wild bootstrap with survey weights is not yet supported. "
-            "Use inference='analytical' with survey_design, or see "
-            "docs/survey-roadmap.md for planned Phase 5 support."
+            "Use analytical survey inference (the default) instead."
         )
 
     resolved = survey_design.resolve(data)
diff --git a/docs/choosing_estimator.rst b/docs/choosing_estimator.rst
@@ -571,5 +571,124 @@ If you're unsure which estimator to use:
    investigate why (often reveals violations of assumptions)
 
 5. **Using survey data?** - Pass a ``SurveyDesign`` to ``fit()`` for design-based
-   variance estimation. See the `survey tutorial <https://github.com/igerber/diff-diff/blob/main/docs/tutorials/16_survey_did.ipynb>`_
-   for a full walkthrough with strata, PSU, FPC, replicate weights, and subpopulation analysis.
+   variance estimation. See the :ref:`survey-design-support` section below for
+   the compatibility matrix, and the `survey tutorial <https://github.com/igerber/diff-diff/blob/main/docs/tutorials/16_survey_did.ipynb>`_
+   for a full walkthrough.
+
+.. _survey-design-support:
+
+Survey Design Support
+---------------------
+
+All estimators accept an optional ``survey_design`` parameter in ``fit()``.
+Pass a :class:`~diff_diff.SurveyDesign` object to get design-based variance
+estimation. The depth of support varies by estimator:
+
+.. list-table::
+   :header-rows: 1
+   :widths: 25 12 18 18 18
+
+   * - Estimator
+     - Weights
+     - Strata/PSU/FPC
+     - Replicate Weights
+     - Survey Bootstrap
+   * - ``DifferenceInDifferences``
+     - Full
+     - Full
+     - Full
+     - --
+   * - ``TwoWayFixedEffects``
+     - Full
+     - Full
+     - Full
+     - --
+   * - ``MultiPeriodDiD``
+     - Full
+     - Full
+     - Full
+     - --
+   * - ``CallawaySantAnna``
+     - pweight only
+     - Full
+     - Full
+     - Multiplier at PSU
+   * - ``TripleDifference``
+     - pweight only
+     - Full
+     - Full (analytical)
+     - --
+   * - ``StaggeredTripleDifference``
+     - pweight only
+     - Full
+     - Full
+     - Multiplier at PSU
+   * - ``SunAbraham``
+     - Full
+     - Full
+     - Full
+     - Rao-Wu rescaled
+   * - ``StackedDiD``
+     - pweight only
+     - Full (pweight only)
+     - Full
+     - --
+   * - ``ImputationDiD``
+     - pweight only
+     - Full
+     - Full (analytical)
+     - Multiplier at PSU
+   * - ``TwoStageDiD``
+     - pweight only
+     - Full
+     - Full (analytical)
+     - Multiplier at PSU
+   * - ``ContinuousDiD``
+     - Full
+     - Full
+     - Full (analytical)
+     - Multiplier at PSU
+   * - ``EfficientDiD``
+     - Full
+     - Full
+     - Full (analytical)
+     - Multiplier at PSU
+   * - ``SyntheticDiD``
+     - pweight only
+     - Via bootstrap
+     - --
+     - Rao-Wu rescaled
+   * - ``TROP``
+     - pweight only
+     - Via bootstrap
+     - --
+     - Rao-Wu rescaled
+   * - ``BaconDecomposition``
+     - Diagnostic
+     - Diagnostic
+     - --
+     - --
+
+**Legend:**
+
+- **Full**: All weight types (pweight/fweight/aweight) + strata/PSU/FPC + Taylor Series Linearization variance
+- **Full (pweight only)**: Full TSL with strata/PSU/FPC, but only ``pweight`` accepted (``fweight``/``aweight`` rejected because composition changes weight semantics)
+- **Via bootstrap**: Strata/PSU/FPC supported only with bootstrap variance. ``SyntheticDiD`` requires ``variance_method='bootstrap'``; ``TROP`` uses bootstrap by default. ``SyntheticDiD`` placebo does not support strata/PSU/FPC.
+- **pweight only** (Weights column): Only ``pweight`` accepted; ``fweight``/``aweight`` raise an error
+- **Diagnostic**: Weighted descriptive statistics only (no inference)
+- **--**: Not supported
+
+.. note::
+
+   ``EfficientDiD`` does not support ``covariates`` and ``survey_design``
+   simultaneously (the DR nuisance path does not yet thread survey weights).
+
+.. note::
+
+   ``SyntheticDiD`` with ``variance_method='placebo'`` does not support
+   strata/PSU/FPC. Use ``variance_method='bootstrap'`` for full survey
+   design support.
+
+For the full walkthrough with code examples, see the
+`survey tutorial <https://github.com/igerber/diff-diff/blob/main/docs/tutorials/16_survey_did.ipynb>`_.
+For deferred work and remaining limitations, see ``docs/survey-roadmap.md``.
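
Review note: the "12 of 15" replicate-weight count quoted in ROADMAP.md and REGISTRY.md can be checked against the Replicate Weights column of the matrix added above. The sketch below only transcribes that column into a plain dictionary. `REPLICATE_WEIGHTS` and `supports_replicate_weights` are hypothetical names for illustration and are not part of the diff-diff API.

```python
# Replicate Weights column of the compatibility matrix, transcribed by hand.
# Hypothetical lookup for illustration only; not part of the diff-diff API.
REPLICATE_WEIGHTS = {
    "DifferenceInDifferences": "Full",
    "TwoWayFixedEffects": "Full",
    "MultiPeriodDiD": "Full",
    "CallawaySantAnna": "Full",
    "TripleDifference": "Full (analytical)",
    "StaggeredTripleDifference": "Full",
    "SunAbraham": "Full",
    "StackedDiD": "Full",
    "ImputationDiD": "Full (analytical)",
    "TwoStageDiD": "Full (analytical)",
    "ContinuousDiD": "Full (analytical)",
    "EfficientDiD": "Full (analytical)",
    "SyntheticDiD": "--",
    "TROP": "--",
    "BaconDecomposition": "--",
}


def supports_replicate_weights(estimator: str) -> bool:
    """Return True when the matrix lists any replicate-weight support."""
    return REPLICATE_WEIGHTS[estimator] != "--"


supported = sorted(n for n in REPLICATE_WEIGHTS if supports_replicate_weights(n))
unsupported = sorted(n for n in REPLICATE_WEIGHTS if not supports_replicate_weights(n))

print(f"{len(supported)} of {len(REPLICATE_WEIGHTS)} estimators support replicate weights")
print("Not supported:", ", ".join(unsupported))
```

The three unsupported rows are the same estimators the matrix marks as bootstrap-only (`SyntheticDiD`, `TROP`) or diagnostic-only (`BaconDecomposition`).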
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -2307,10 +2307,11 @@ variance from the distribution of replicate estimates.
   design structure is fixed and dropped replicates contribute zero to the
   sum without changing the scale. Survey df uses `n_valid - 1` for
   t-based inference.
-- **Note:** Replicate-weight support matrix:
-  - **Supported**: CallawaySantAnna (reg/ipw/dr without covariates, no
-    bootstrap), ContinuousDiD (no bootstrap), EfficientDiD (no bootstrap),
-    TripleDifference (all methods), LinearRegression (OLS path),
+- **Note:** Replicate-weight support matrix (12 of 15 public estimators):
+  - **Supported**: CallawaySantAnna (reg/ipw/dr with or without covariates,
+    no bootstrap; IF-based replicate variance is covariate-agnostic),
+    ContinuousDiD (no bootstrap), EfficientDiD (no bootstrap),
+    TripleDifference (all methods), StaggeredTripleDifference (IF-based),
     DifferenceInDifferences (no-absorb via LinearRegression dispatch,
     absorb via estimator-level refit), MultiPeriodDiD (no-absorb via
     `compute_replicate_vcov`, absorb via estimator-level refit),
diff --git a/docs/survey-roadmap.md b/docs/survey-roadmap.md
diff --git a/docs/tutorials/16_survey_did.ipynb b/docs/tutorials/16_survey_did.ipynb

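
Review note: a minimal sketch of how the reworded guard in `diff_diff/survey.py` surfaces to a caller. The function below is a stand-in that reproduces only the lines visible in the hunk. `StubDesign`, the public-looking function name, and the early return for `survey_design is None` are assumptions made for illustration, not the library's actual code.

```python
class StubDesign:
    """Hypothetical placeholder for a survey design object with a resolve() method."""

    def resolve(self, data):
        # Assumption: the real resolve() returns a resolved design; here we
        # return a small dict so the sketch stays self-contained.
        return {"n_obs": len(data)}


def resolve_survey_for_fit(survey_design, data, inference_mode="analytical"):
    """Stand-in mirroring the guard shown in the hunk (not the real helper)."""
    if survey_design is None:
        # Assumption: no survey design means nothing to resolve.
        return None
    if inference_mode == "wild_bootstrap":
        raise NotImplementedError(
            "Wild bootstrap with survey weights is not yet supported. "
            "Use analytical survey inference (the default) instead."
        )
    return survey_design.resolve(data)


# Default (analytical) inference resolves the design as before.
resolved = resolve_survey_for_fit(StubDesign(), [1.0, 2.0, 3.0])

# Requesting wild bootstrap together with a survey design hits the guard.
try:
    resolve_survey_for_fit(StubDesign(), [1.0, 2.0, 3.0], inference_mode="wild_bootstrap")
except NotImplementedError as err:
    message = str(err)
    print(message)
```

The new message no longer points at a roadmap phase, so it stays accurate even if the deferred-work list in `docs/survey-roadmap.md` is reorganized.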