Skip to content

Commit afa0f6c

Browse files
igerberclaude
andcommitted
Address PR #346 CI review round 6: P1 panel-only note + P3 theorem refs
**P1 (Methodology): Phase 2a panel-only not documented as deviation** Paper Section 2 defines HAD on panel OR repeated cross-section data, but Phase 2a ships a panel-only implementation. The existing balanced-panel validator rejects RCS inputs (disjoint unit IDs between periods) with a generic "unit(s) do not appear in both periods" error, but the scope restriction was not documented in REGISTRY.md as a `**Note:**` deviation from the paper's documented surface. Fix: - REGISTRY.md: new `**Note (Phase 2a panel-only):**` block under the HAD Phase 2a entry, explaining the restriction and pointing at the follow-up RCS PR. - `docs/methodology/papers/dechaisemartin-2026-review.md`: mirrored implementation note on the panel-or-RCS scope line. - `HeterogeneousAdoptionDiD.fit()` docstring: new preamble paragraph stating the panel-only restriction and why (unit-level first differences required). - `TODO.md`: new row queuing RCS identification-path work. - `tests/test_had.py`: two regression tests (`test_repeated_cross_section_raises` and `test_repeated_cross_section_fit_raises`) that construct RCS-shaped input and assert the validator + fit() raise with the expected error. **P3 (Docs/Tests): theorem/equation references were off** Module docstring, result-class `att` docstring, and `_fit_continuous()` docstring cited the paper's theorems/equations inconsistently with the registry's source map: - Design 1' identification: said "Theorem 3" -> should be Theorem 1 with Equation 3; Equation 7 is the sample estimator. - Design 1 continuous-near-d_lower: said "Proposition 3 / Theorem 4" -> should be Theorem 3 / Equation 11 (Theorem 4 is the QUG null test, not this estimand). Fix: updated all three docstring sites to cite Theorem 1 / Equation 3 (Design 1' identification) + Equation 7 (sample estimator), and Theorem 3 / Equation 11 (Design 1 continuous-near-d_lower / WAS_d_lower under Assumption 6). Targeted regression: 141 HAD tests + 520 total across Phase 1 and adjacent surfaces, all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2884c1d commit afa0f6c

5 files changed

Lines changed: 79 additions & 10 deletions

File tree

TODO.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -100,6 +100,7 @@ Deferred items from PR reviews that were not addressed before merge.
100100
| `HeterogeneousAdoptionDiD` Phase 4: Pierce-Schott (2016) replication harness; reproduce paper Figure 2 values and Table 1 coverage rates. | `benchmarks/`, `tests/` | Phase 2a | Low |
101101
| `HeterogeneousAdoptionDiD` Phase 5: `practitioner_next_steps()` integration, tutorial notebook, and `llms.txt` updates (preserving UTF-8 fingerprint). | `diff_diff/practitioner.py`, `tutorials/`, `diff_diff/guides/` | Phase 2a | Low |
102102
| `HeterogeneousAdoptionDiD` staggered-timing reduction: Phase 2a requires exactly 2 time periods and raises on `>2` periods with or without `first_treat_col`. A "last-cohort subgroup" reduction scheme (slice to max-cohort's 2-period window) could lift this in a targeted follow-up PR before full Phase 2b multi-period aggregation. | `diff_diff/had.py::_validate_had_panel` | Phase 2a | Low |
103+
| `HeterogeneousAdoptionDiD` repeated-cross-section support: paper Section 2 defines HAD on panel OR repeated cross-section, but Phase 2a is panel-only. RCS inputs (disjoint unit IDs between periods) are rejected by the balanced-panel validator with the generic "unit(s) do not appear in both periods" error. A follow-up PR will add an RCS identification path based on pre/post cell means (rather than unit-level first differences), with its own validator and a distinct `data_mode` / API surface. | `diff_diff/had.py::_validate_had_panel`, `diff_diff/had.py::_aggregate_first_difference` | Phase 2a | Medium |
103104

104105
#### Performance
105106

diff_diff/had.py

Lines changed: 24 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,8 @@
99
:func:`diff_diff.utils.safe_inference`.
1010
1111
1. Design 1' (``continuous_at_zero``): ``d_lower = 0``, boundary density
12-
continuous at zero, Assumption 3. Equation 7 / Theorem 3:
12+
continuous at zero, Assumption 3. Theorem 1 / Equation 3
13+
(identification); Equation 7 (sample estimator):
1314
1415
beta = (E[Delta Y] - lim_{d v 0} E[Delta Y | D_2 <= d]) / E[D_2]
1516
@@ -19,7 +20,8 @@
1920
2021
2. Design 1 continuous-near-d_lower (``continuous_near_d_lower``):
2122
``d_lower > 0``, continuous boundary density, Assumption 5 or 6.
22-
Proposition 3 / Theorem 4 (``WAS_{d_lower}`` under Assumption 6):
23+
Theorem 3 / Equation 11 (``WAS_{d_lower}`` under Assumption 6;
24+
Theorem 4 is the QUG null test, not this estimand):
2325
2426
beta = (E[Delta Y] - lim_{d v d_lower} E[Delta Y | D_2 <= d])
2527
/ E[D_2 - d_lower]
@@ -135,13 +137,14 @@ class HeterogeneousAdoptionDiDResults:
135137
att : float
136138
Point estimate of the WAS parameter on the beta-scale.
137139
138-
- Design 1' (paper Equation 7 / Theorem 3):
140+
- Design 1' (paper Theorem 1 / Equation 3 identification;
141+
Equation 7 sample estimator):
139142
``att = (mean(ΔY) - tau_bc) / D_bar``
140143
where ``tau_bc`` is the bias-corrected local-linear estimate
141144
of ``lim_{d v 0} E[ΔY | D_2 <= d]`` and
142145
``D_bar = (1/G) * sum(D_{g,2})``.
143-
- Design 1 continuous-near-d_lower (paper Theorem 4,
144-
``WAS_{d_lower}`` under Assumption 6):
146+
- Design 1 continuous-near-d_lower (paper Theorem 3 /
147+
Equation 11, ``WAS_{d_lower}`` under Assumption 6):
145148
``att = (mean(ΔY) - tau_bc) / mean(D_2 - d_lower)``
146149
where ``tau_bc`` is the bias-corrected local-linear estimate
147150
of ``lim_{d v d_lower} E[ΔY | D_2 <= d]``.
@@ -1006,6 +1009,17 @@ def fit(
10061009
) -> HeterogeneousAdoptionDiDResults:
10071010
"""Fit the HAD estimator on a two-period panel.
10081011
1012+
Phase 2a is **panel-only**: the paper (Section 2) defines HAD on
1013+
panel or repeated-cross-section data, but this implementation
1014+
requires a balanced two-period panel with a unit identifier so
1015+
that unit-level first differences ``ΔY_g = Y_{g,2} - Y_{g,1}``
1016+
can be formed. Repeated-cross-section inputs (disjoint unit IDs
1017+
between periods) are rejected by the balanced-panel validator.
1018+
Repeated-cross-section support is queued for a follow-up PR
1019+
(tracked in ``TODO.md``); it requires a separate identification
1020+
path based on pre/post cell means rather than unit-level
1021+
differences.
1022+
10091023
Parameters
10101024
----------
10111025
data : pd.DataFrame
@@ -1337,16 +1351,17 @@ def _fit_continuous(
13371351
Implements de Chaisemartin, Ciccia, D'Haultfoeuille, and Knau
13381352
(2026) continuous-design estimators:
13391353
1340-
- Design 1' (``continuous_at_zero``), paper Equation 7 /
1341-
Theorem 3:
1354+
- Design 1' (``continuous_at_zero``), paper Theorem 1 /
1355+
Equation 3 (identification); Equation 7 (sample estimator):
13421356
13431357
beta = (E[Delta Y] - lim_{d v 0} E[Delta Y | D_2 <= d]) / E[D_2]
13441358
13451359
Regressor passed to the local-linear boundary fit is
13461360
``d_arr``; the boundary is ``0``.
13471361
1348-
- Design 1 (``continuous_near_d_lower``), paper Proposition 3 /
1349-
Theorem 4 (``WAS_{d_lower}`` under Assumption 6):
1362+
- Design 1 (``continuous_near_d_lower``), paper Theorem 3 /
1363+
Equation 11 (``WAS_{d_lower}`` under Assumption 6; note
1364+
Theorem 4 is the QUG null test, not this estimand):
13501365
13511366
beta = (E[Delta Y] - lim_{d v d_lower} E[Delta Y | D_2 <= d])
13521367
/ E[D_2 - d_lower]

docs/methodology/REGISTRY.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2324,6 +2324,7 @@ Shipped as `did_had_pretest_workflow()` and surfaced via `practitioner_next_step
23242324
- **Note (Design 1 identification):** `continuous_near_d_lower` and `mass_point` fits emit a `UserWarning` surfacing that `WAS_{d̲}` identification requires Assumption 6 (or Assumption 5 for sign identification only) beyond parallel trends, and that neither is testable via pre-trends. `continuous_at_zero` (Design 1', Assumption 3 only) does not emit this warning.
23252325
- **Note (CI endpoints):** Because the continuous-path `att` is `(mean(ΔY) - τ̂_bc) / den`, the beta-scale CI endpoints reverse relative to the Phase 1c boundary-limit CI: `CI_lower(β̂) = (mean(ΔY) - CI_upper(τ̂_bc)) / den` and `CI_upper(β̂) = (mean(ΔY) - CI_lower(τ̂_bc)) / den`. The `HeterogeneousAdoptionDiD.fit()` implementation computes `att ± z · se` directly via `safe_inference`, which handles the reversal naturally from the transformed point estimate.
23262326
- **Note (Phase 2a scope):** Ships single-period only. `aggregate="event_study"` (Appendix B.2 multi-period extension) raises `NotImplementedError` pointing to Phase 2b. `survey=` and `weights=` kwargs raise `NotImplementedError` pointing to the follow-up survey-integration PR.
2327+
- **Note (Phase 2a panel-only):** The paper (Section 2) defines HAD on *panel or repeated cross-section* data, but Phase 2a ships a panel-only implementation: `HeterogeneousAdoptionDiD.fit()` requires a balanced two-period panel with a unit identifier so that unit-level first differences ΔY_g = Y_{g,2} - Y_{g,1} can be formed. Repeated-cross-section inputs (disjoint unit IDs between periods) are rejected by the existing balanced-panel validator with the "unit(s) do not appear in both periods" error. Repeated-cross-section support is queued for a follow-up PR (tracked in `TODO.md`); it will need a separate identification path based on pre/post cell means rather than unit-level differences.
23272328
- [ ] Phase 2b: Multi-period event-study extension (Appendix B.2). Will lift the `aggregate="event_study"` scaffolding in `HeterogeneousAdoptionDiD.fit()` and aggregate per-cohort first-differences into an event-study result.
23282329
- [ ] Phase 3: `qug_test()` (`T = D_{2,(1)} / (D_{2,(2)} - D_{2,(1)})`, rejection `{T > 1/α - 1}`).
23292330
- [ ] Phase 3: `stute_test()` Cramér-von Mises with Mammen wild bootstrap.

docs/methodology/papers/dechaisemartin-2026-review.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@
2020
**Key implementation requirements:**
2121

2222
*Assumption checks / warnings:*
23-
- Data must be panel (or repeated cross-section) with `D_{g,1} = 0` for all `g` (nobody treated in period one).
23+
- Data must be panel (or repeated cross-section) with `D_{g,1} = 0` for all `g` (nobody treated in period one). **Phase 2a implementation note:** `diff_diff.HeterogeneousAdoptionDiD` ships panel-only; repeated-cross-section inputs are rejected by the balanced-panel validator. RCS support is queued for a follow-up PR.
2424
- Treatment dose `D_{g,2} >= 0`. For Design 1' (the QUG case) the support infimum `d̲ := inf Supp(D_{g,2})` must equal 0; for Design 1 (no QUG) `d̲ > 0` and Assumption 5 or 6 must be invoked.
2525
- Assumption 1 (i.i.d. sample): `(Y_{g,1}, Y_{g,2}, D_{g,1}, D_{g,2})_{g=1,...,G}` i.i.d.
2626
- Assumption 2 (parallel trends for the least-treated): `lim_{d ↓ d̲} E[ΔY(0) | D_2 ≤ d] = E[ΔY(0)]`. Testable with pre-trends when a pre-treatment period `t=0` exists. Reduces to standard parallel trends when treatment is binary.

tests/test_had.py

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1546,6 +1546,58 @@ def test_both_all_zero_periods_raises(self):
15461546
with pytest.raises(ValueError, match="variation"):
15471547
_validate_had_panel(panel, "outcome", "dose", "period", "unit", None)
15481548

1549+
def test_repeated_cross_section_raises(self):
1550+
"""Review P1 round 6: Phase 2a is panel-only. An RCS input (disjoint
1551+
unit IDs across periods) must be rejected by the balanced-panel
1552+
validator with the "unit(s) do not appear in both periods" error.
1553+
"""
1554+
rng = np.random.default_rng(0)
1555+
G = 100
1556+
pre = pd.DataFrame(
1557+
{
1558+
"unit": np.arange(G),
1559+
"period": 1,
1560+
"dose": np.zeros(G),
1561+
"outcome": rng.standard_normal(G),
1562+
}
1563+
)
1564+
post = pd.DataFrame(
1565+
{
1566+
"unit": np.arange(G, 2 * G),
1567+
"period": 2,
1568+
"dose": rng.uniform(0, 1, G),
1569+
"outcome": rng.standard_normal(G),
1570+
}
1571+
)
1572+
rcs = pd.concat([pre, post], ignore_index=True)
1573+
with pytest.raises(ValueError, match=r"both periods|[Uu]nbalanced"):
1574+
_validate_had_panel(rcs, "outcome", "dose", "period", "unit", None)
1575+
1576+
def test_repeated_cross_section_fit_raises(self):
1577+
"""End-to-end: fit() on an RCS panel raises ValueError."""
1578+
rng = np.random.default_rng(0)
1579+
G = 100
1580+
pre = pd.DataFrame(
1581+
{
1582+
"unit": np.arange(G),
1583+
"period": 1,
1584+
"dose": np.zeros(G),
1585+
"outcome": rng.standard_normal(G),
1586+
}
1587+
)
1588+
post = pd.DataFrame(
1589+
{
1590+
"unit": np.arange(G, 2 * G),
1591+
"period": 2,
1592+
"dose": rng.uniform(0, 1, G),
1593+
"outcome": rng.standard_normal(G),
1594+
}
1595+
)
1596+
rcs = pd.concat([pre, post], ignore_index=True)
1597+
est = HeterogeneousAdoptionDiD()
1598+
with pytest.raises(ValueError, match=r"both periods|[Uu]nbalanced"):
1599+
est.fit(rcs, "outcome", "dose", "period", "unit")
1600+
15491601

15501602
# =============================================================================
15511603
# Review P1: continuous_near_d_lower on a true mass-point sample rejects

0 commit comments

Comments
 (0)