igerber
diff --git a/ROADMAP.md b/ROADMAP.md
Lines changed: 3 additions & 3 deletions
diff --git a/diff_diff/chaisemartin_dhaultfoeuille.py b/diff_diff/chaisemartin_dhaultfoeuille.py
Lines changed: 625 additions & 82 deletions
diff --git a/diff_diff/chaisemartin_dhaultfoeuille_results.py b/diff_diff/chaisemartin_dhaultfoeuille_results.py
Lines changed: 7 additions & 2 deletions
diff --git a/diff_diff/guides/llms-full.txt b/diff_diff/guides/llms-full.txt
Lines changed: 5 additions & 5 deletions
diff --git a/diff_diff/guides/llms-practitioner.txt b/diff_diff/guides/llms-practitioner.txt
Lines changed: 27 additions & 0 deletions
diff --git a/diff_diff/honest_did.py b/diff_diff/honest_did.py
Lines changed: 11 additions & 1 deletion
diff --git a/docs/choosing_estimator.rst b/docs/choosing_estimator.rst
Lines changed: 6 additions & 6 deletions
@@ -181,7 +181,7 @@ The dynamic companion paper subsumes the AER 2020 paper: `DID_1 = DID_M`. The si
 
 These are referenced by the dCDH papers but live in *separate* efforts or *separate* companion papers we don't yet have:
 
-- **Survey design integration** — deferred to a separate effort after all three phases ship. Phase 1 documents "no survey support" in the compatibility matrix; the separate effort revisits when Phase 3 is complete.
+- **Survey design integration** — shipped. Supports pweight with strata/PSU/FPC via Taylor Series Linearization. Replicate weights and PSU-level bootstrap deferred.
 - **Fuzzy DiD** (within-cell-varying treatment, Web Appendix Section 1.7 of dynamic paper) → de Chaisemartin & D'Haultfœuille (2018), separate paper not yet reviewed
 - **Principled anticipation handling and trimming rules** (footnote 14 of dynamic paper) → de Chaisemartin (2021), separate paper not yet reviewed
 - **2SLS DiD** (referenced in AER appendix Section 3.4) → separate paper
@@ -191,11 +191,11 @@ These remain in **Future Estimators** below if/when we choose to extend.
 ### Architectural notes (for plan and PR reviewers)
 
 - **Single `ChaisemartinDHaultfoeuille` class** (alias `DCDH`). Not a family. New features land as `fit()` parameters or fields on the results dataclass. No `DCDHDynamic`, `DCDHCovariate`, etc. Matches the library's idiomatic pattern: `CallawaySantAnna`, `ImputationDiD`, and `EfficientDiD` are all single classes that evolved across many phases.
-- **Forward-compatible API from Phase 1.** `fit(aggregate=None, controls=None, trends_linear=None, L_max=None, ...)` accepts the Phase 2/3 parameters from day one and raises `NotImplementedError` with a clear pointer to the relevant phase until they are implemented. No signature changes between phases.
+- **Forward-compatible API from Phase 1.** `fit(aggregate=None, controls=None, trends_linear=None, L_max=None, ...)` accepts the Phase 2/3 parameters from day one and raised `NotImplementedError` with a clear pointer to the relevant phase until they were implemented. As of the dCDH work, Phase 2, Phase 3, and `survey_design` are all live; only `aggregate` remains gated with `NotImplementedError`. No signature changes between phases.
 - **Conservative CI** under Assumption 8 (independent groups), exact only under iid sampling. Documented in REGISTRY.md as a `**Note:**` deviation from "default nominal coverage." Theorem 1 of the dynamic paper.
 - **Cohort recentering for variance is essential.** Cohorts are defined by the triple `(D_{g,1}, F_g, S_g)`. The plug-in variance subtracts cohort-conditional means, **NOT a single grand mean**. Test fixtures must catch this — a wrong implementation silently produces a smaller, incorrect variance.
 - **No Rust acceleration is planned for any phase.** The estimator's hot path is groupby + BLAS-accelerated matrix-vector products, where NumPy already operates near-optimally. If profiling on large panels (`G > 100K`) reveals a bottleneck post-ship, the existing `_rust_bootstrap_weights` helper can be reused for the bootstrap loop without writing new Rust code.
-- **No survey design integration in any phase.** Handled as a separate effort after all three phases ship. Phase 1 documents the absence in the compatibility matrix so survey users do not silently apply survey weights and get wrong answers.
+- **Survey design integration shipped.** Supports pweight with strata/PSU/FPC via TSL. Replicate weights and PSU-level bootstrap deferred to a follow-up.
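
Reviewer aside: the "forward-compatible API" note above describes a gating pattern that can be illustrated in a few lines. This is only a sketch under stated assumptions, not the library's code: `fit_sketch`, its return value, and the error message are hypothetical stand-ins, and only the parameter names `aggregate`, `L_max`, and `survey_design` come from the diff.

```python
from typing import Any, Optional


def fit_sketch(
    data: Any,
    aggregate: Optional[str] = None,       # still gated per the note above
    L_max: Optional[int] = None,           # live
    survey_design: Optional[Any] = None,   # live
) -> dict:
    """Hypothetical stand-in for a forward-compatible ``fit()`` signature."""
    # A gated parameter is accepted by the signature but rejected at call
    # time, so the signature never has to change when the feature ships.
    if aggregate is not None:
        raise NotImplementedError(
            "aggregate is not implemented yet; leave it as None."
        )
    # Live parameters pass through to the (omitted) estimation step.
    return {"L_max": L_max, "survey_aware": survey_design is not None}
```

Callers written against the Phase 1 signature keep working as parameters go live; only the set of values that raise shrinks over time.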
 
 ---
 
 
@@ -342,8 +342,13 @@ class ChaisemartinDHaultfoeuilleResults:
         ``compute_honest_did(results)`` post-hoc. Contains identified
         set bounds, robust confidence intervals, and breakdown analysis.
     survey_metadata : Any, optional
-        Always ``None`` in Phase 1 — survey integration is deferred to a
-        separate effort after all phases ship.
+        Populated when ``fit(..., survey_design=sd)`` is called; ``None``
+        otherwise. Carries the resolved survey design summary
+        (``weight_type``, strata/PSU counts, ``df_survey``, weight range,
+        and replicate-method info when applicable). ``df_survey`` is
+        threaded into survey-aware inference (t-distribution at all
+        analytical surfaces) and consumed by ``compute_honest_did()`` to
+        produce survey-aware critical values.
     bootstrap_results : DCDHBootstrapResults, optional
         Bootstrap inference results when ``n_bootstrap > 0``.
     """
 
@@ -265,12 +265,12 @@ est.fit(
     trends_linear: bool | None = None,           # Phase 3: DID^{fd}
     trends_nonparam: Any | None = None,          # Phase 3: DID^s
     honest_did: bool = False,                    # Phase 3: HonestDiD integration
-    # ---- deferred (separate effort) ----
-    survey_design: Any = None,
+    # ---- survey support ----
+    survey_design: SurveyDesign | None = None,    # pweight + strata/PSU/FPC (TSL)
 ) -> ChaisemartinDHaultfoeuilleResults
 ```
 
-`L_max` controls multi-horizon computation. Phase 3 parameters raise `NotImplementedError`.
+`L_max` controls multi-horizon computation. Phase 3 parameters (`controls`, `trends_linear`, `trends_nonparam`, `honest_did`, `heterogeneity`, `design2`) and `survey_design` are implemented; only `aggregate` remains gated with `NotImplementedError`.
 
 **Usage:**
 
@@ -322,7 +322,7 @@ print(f"sigma_fe (sign-flipping threshold): {diagnostic.sigma_fe:.3f}")
 - Validated against R `DIDmultiplegtDYN` v2.3.3 at horizon `l = 1` via `tests/test_chaisemartin_dhaultfoeuille_parity.py`
 - Phase 1 placebo SE is intentionally `NaN` with a warning. The dynamic companion paper Section 3.7.3 derives the cohort-recentered analytical variance for `DID_l` only — not for the placebo `DID_M^pl`. Phase 2 will add multiplier-bootstrap support for the placebo. Until then, the placebo point estimate is meaningful but its inference fields stay NaN-consistent **even when `n_bootstrap > 0`** (bootstrap currently covers `DID_M`, `DID_+`, and `DID_-` only)
 - The analytical CI is conservative under Assumption 8 (independent groups) of the dynamic companion paper, exact only under iid sampling
-- Survey design (`survey_design`) is not yet supported and is deferred to a separate effort after all phases ship
+- Survey design supported: pweight with strata/PSU/FPC via Taylor Series Linearization. Replicate weights and PSU-level bootstrap deferred
 
 ### SunAbraham
 
@@ -1021,7 +1021,7 @@ Returned by `SyntheticDiD.fit()`.
 | `time_weights` | `dict` | Pre-treatment time weights |
 | `pre_periods` | `list` | Pre-treatment periods |
 | `post_periods` | `list` | Post-treatment periods |
-| `variance_method` | `str` | "bootstrap" or "placebo" |
+| `variance_method` | `str` | "bootstrap", "jackknife", or "placebo" |
 | `noise_level` | `float` | Estimated noise level |
 | `zeta_omega` | `float` | Unit weight regularization |
 | `zeta_lambda` | `float` | Time weight regularization |
 
@@ -173,6 +173,12 @@ Is treatment adoption staggered (multiple cohorts, different timing)?
 |-- NO, simple 2x2 design:
 |   \-- DifferenceInDifferences (DiD)
 |
+|-- Treatment switches ON and OFF (reversible / non-absorbing)?
+|   \-- ChaisemartinDHaultfoeuille (dCDH / alias `DCDH`)
+|       -- Only library estimator for non-absorbing treatments; supports
+|          L_max multi-horizon, dynamic placebos, cost-benefit delta,
+|          HonestDiD, and `survey_design=` (pweight + strata/PSU/FPC via TSL)
+|
 |-- Few treated units (< 20)?
 |   \-- SyntheticDiD (SDiD)    -- synthetic control + DiD hybrid
 |
@@ -260,6 +266,27 @@ results = es.fit(data, outcome='y', unit='unit_id', time='period',
 print(results.summary())
 ```
 
+### Reversible (non-absorbing) treatment with survey design
+Use `ChaisemartinDHaultfoeuille` (dCDH) when treatment switches ON and OFF.
+Pass `survey_design=SurveyDesign(...)` for design-based inference via Taylor
+Series Linearization. Only `weight_type='pweight'` is supported; replicate
+weights are deferred. When combined with `n_bootstrap > 0`, dCDH emits a
+`UserWarning` and falls back to group-level multiplier bootstrap — prefer
+the analytical TSL path for survey-aware inference.
+
+```python
+from diff_diff import ChaisemartinDHaultfoeuille, SurveyDesign
+
+sd = SurveyDesign(weights='pw', strata='stratum', psu='cluster', nest=True)
+results = ChaisemartinDHaultfoeuille().fit(
+    data, outcome='y', group='unit_id', time='period',
+    treatment='treated',
+    L_max=3,                  # multi-horizon event study
+    survey_design=sd,         # survey-aware analytical SE (TSL)
+)
+print(results.summary())
+```
+
 ---
 
 ## Step 6: Sensitivity Analysis
 
@@ -967,14 +967,24 @@ def _largest_consecutive_block(times, boundary_val):
                 beta_hat = np.array(effects)
                 sigma = np.diag(np.array(ses) ** 2)
 
+                # Extract survey df. For replicate designs with undefined df
+                # (rank <= 1), use sentinel df=0 so _get_critical_value returns
+                # NaN, matching the safe_inference contract.
+                df_survey = None
+                if hasattr(results, "survey_metadata") and results.survey_metadata is not None:
+                    sm = results.survey_metadata
+                    df_survey = getattr(sm, "df_survey", None)
+                    if df_survey is None and getattr(sm, "replicate_method", None) is not None:
+                        df_survey = 0  # undefined replicate df → NaN inference
+
                 return (
                     beta_hat,
                     sigma,
                     len(pre_times),
                     len(post_times),
                     pre_times,
                     post_times,
-                    None,  # df_survey: dCDH has no survey support
+                    df_survey,
                 )
         except ImportError:
             pass
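
Reviewer aside: the `df_survey` branch added in this hunk can be read as a small pure function with three outcomes. The sketch below assumes the metadata object exposes `df_survey` and `replicate_method` as plain attributes, as the `getattr` calls in the hunk suggest; `resolve_df_survey` and the `SimpleNamespace` stand-ins are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_df_survey(results: Any) -> Optional[int]:
    """Mirror the hunk's logic: None, a real df, or the sentinel 0."""
    sm = getattr(results, "survey_metadata", None)
    if sm is None:
        return None  # no survey design: caller keeps its default inference
    df_survey = getattr(sm, "df_survey", None)
    if df_survey is None and getattr(sm, "replicate_method", None) is not None:
        return 0  # replicate design with undefined df: sentinel value
    return df_survey


# Stand-in results objects covering the three cases
no_survey = SimpleNamespace(survey_metadata=None)
tsl = SimpleNamespace(
    survey_metadata=SimpleNamespace(df_survey=12, replicate_method=None)
)
replicate_undefined = SimpleNamespace(
    survey_metadata=SimpleNamespace(df_survey=None, replicate_method="JK1")
)

print(resolve_df_survey(no_survey))
print(resolve_df_survey(tsl))
print(resolve_df_survey(replicate_undefined))
```

Per the comment in the hunk, the sentinel `0` is what makes `_get_critical_value` return `NaN` downstream; that behavior is taken from the diff's own comment and is not reproduced here.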
 
@@ -293,9 +293,9 @@ Phase 3 will add covariate adjustment.
 
 .. note::
 
-   ``ChaisemartinDHaultfoeuille`` does not yet support ``survey_design``;
-   passing it raises ``NotImplementedError``. Survey integration is
-   deferred to a separate effort after Phases 2 and 3 ship.
+   ``ChaisemartinDHaultfoeuille`` supports ``survey_design`` with pweight
+   and strata/PSU/FPC via Taylor Series Linearization. Replicate weights
+   are not yet supported.
 
 Synthetic DiD
 ~~~~~~~~~~~~~
@@ -726,10 +726,10 @@ estimation. The depth of support varies by estimator:
      - Full
      - Multiplier at PSU
    * - ``ChaisemartinDHaultfoeuille``
+     - pweight only
+     - Full (TSL)
      - --
-     - --
-     - --
-     - --
+     - Group-level (warning)
    * - ``TripleDifference``
      - pweight only
      - Full