Wave 4 Tier A drain: EfficientDiD anticipation note, generate_ddd_panel_data, TROP data-setup helper #455
🔁 AI review rerun (requested by @igerber) Head SHA:
Overall Assessment
Executive Summary
Methodology
No findings. The previous DDD generator issue is fixed, and the new generator now enforces the 8-cell contract that
Code Quality
No findings. The TROP extraction is a straightforward consolidation of duplicated setup logic into one helper used by both fit paths (
Performance
No findings.
Maintainability
Tech Debt
Security
No findings.
Documentation/Tests
No blocking findings. Public API exposure and docs for
|
The `control_group="last_cohort"` path in EfficientDiD.fit (line 470) trims periods at `last_g - anticipation`, excluding anticipation-contaminated periods from the pseudo-control's pre-treatment window. REGISTRY.md previously described only the `anticipation=0` case.

Aligns both the Edge Cases bullet and the algorithm Note to the code's actual behavior, cross-references the interaction in the `control_group` and `anticipation` docstring entries, and adds a regression test asserting the trim cuts at `last_g - anticipation` rather than `last_g`.

Closes Tier-A backlog item 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
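The trim described in this commit can be illustrated with a small standalone sketch. It is not the code in `efficient_did.py`; the function name `pseudo_control_pre_periods` is hypothetical, and the strict `<` comparison at the cutoff is an assumption about how "trims at `last_g - anticipation`" is applied.

```python
def pseudo_control_pre_periods(periods, last_g, anticipation=0):
    """Return the periods kept when the last cohort serves as pseudo-control.

    Hypothetical helper for illustration only. Assumption: a period t is kept
    when t < last_g - anticipation, so the `anticipation` periods just before
    the last cohort's treatment date are dropped as well.
    """
    if anticipation < 0:
        raise ValueError("anticipation must be non-negative")
    cutoff = last_g - anticipation
    return [t for t in sorted(periods) if t < cutoff]


periods = list(range(1, 11))  # periods 1..10, last cohort treated at 8

# anticipation=0: only periods at or after last_g are dropped.
kept_no_anticipation = pseudo_control_pre_periods(periods, last_g=8)

# anticipation=2: periods 6 and 7 are also dropped as contaminated.
kept_with_anticipation = pseudo_control_pre_periods(periods, last_g=8, anticipation=2)
```

Under these assumptions the second call keeps two fewer periods than the first, which is the difference the regression test is described as guarding.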
Adds a new public DGP function that generates a balanced panel of n_units observed over n_periods with two unit-level binary dimensions (`group`, `partition`, time-invariant) and a derived `post` indicator. The DDD-CPT identifying assumption holds because `group_partition_interaction` enters only as a unit-level (time-invariant) effect, leaving the triple interaction `treatment_effect * group * partition * post` as the sole source of differential group × partition trend.

The existing cross-sectional `generate_ddd_data` remains unchanged. Compatible with `TripleDifference.fit(..., time="post")` directly; the binary 2×2×2 estimator surface is unchanged. Auto-routing of `power.simulate_power` to the panel DGP for `n_periods > 2` is deferred to a follow-up (TODO.md row added).

Exports: top-level `diff_diff` and `diff_diff.prep` re-export; new autofunction stub in docs/api/prep.rst.

Tests in tests/test_prep.py::TestGenerateDddPanelData (14 tests) including a deterministic recovery test (noise_sd=0, ATT recovery to ~1e-15) and a finite-sample recovery test.

Closes Tier-A backlog item 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
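To make the identification argument concrete, here is a minimal standard-library sketch of a panel DGP with the same structure, plus a cell-means triple difference. It is not `generate_ddd_panel_data`: the function names, the coefficient values, the deterministic cell assignment, and the "second half of periods is post" rule are all assumptions chosen for illustration.

```python
import random


def sketch_ddd_panel(n_units=8, n_periods=4, treatment_effect=2.0,
                     group_partition_interaction=1.5, noise_sd=0.0, seed=0):
    """Hypothetical balanced-panel DDD generator (illustration only)."""
    rng = random.Random(seed)
    first_post = n_periods // 2  # assumption: second half of periods is "post"
    rows = []
    for unit in range(n_units):
        group = unit % 2             # time-invariant unit-level dimension
        partition = (unit // 2) % 2  # time-invariant unit-level dimension
        unit_fe = rng.gauss(0.0, 1.0)
        for period in range(n_periods):
            post = int(period >= first_post)
            y = (unit_fe
                 + 0.3 * period                # common time trend
                 + 0.5 * group * post          # group-specific shift
                 + 0.7 * partition * post      # partition-specific shift
                 + group_partition_interaction * group * partition  # level only
                 + treatment_effect * group * partition * post
                 + rng.gauss(0.0, noise_sd))
            rows.append({"unit": unit, "period": period, "group": group,
                         "partition": partition, "post": post, "y": y})
    return rows


def ddd_from_cell_means(rows):
    """Triple difference computed from the eight (group, partition, post) means."""
    totals = {}
    for r in rows:
        key = (r["group"], r["partition"], r["post"])
        total, count = totals.get(key, (0.0, 0))
        totals[key] = (total + r["y"], count + 1)
    mean = {k: total / count for k, (total, count) in totals.items()}

    def did(g):
        # Difference-in-differences across partitions within one group.
        return ((mean[(g, 1, 1)] - mean[(g, 1, 0)])
                - (mean[(g, 0, 1)] - mean[(g, 0, 0)]))

    return did(1) - did(0)


rows = sketch_ddd_panel(noise_sd=0.0)
ddd_estimate = ddd_from_cell_means(rows)
```

Because the `group * partition` term has no `post` factor, it cancels in every pre/post difference, so with `noise_sd=0` the cell-means estimate equals `treatment_effect` up to floating-point error.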
`TROP.fit()` (local path) and `_fit_global()` previously duplicated ~85 lines of data-setup logic each: panel pivoting, absorbing-state validation, treated/control unit identification, first-treatment-period detection, and pre/post period counting. The duplicated blocks were near-identical line-by-line, differing only in which index mappings the caller built (local built all four; global built only the forward maps).

Extracts `_setup_trop_data(...)` in trop_local.py (alongside the existing `_validate_and_pivot_treatment` helper). Both callers now invoke it and unpack only the fields they consume. The helper returns all four index mappings (`unit_to_idx`, `period_to_idx`, `idx_to_unit`, `idx_to_period`) uniformly, eliminating a class of subtle drift bug; `_fit_global` gains two unused locals as a trade-off.

The global-method-specific staggered-adoption check stays in `_fit_global` as a post-helper validation (it depends on estimator semantics, not data preparation).

Bootstrap-loop dedup (~40 LoC across `_bootstrap_variance` / `_bootstrap_variance_global`) is deferred to a follow-up (TODO.md row added).

Adds a parity regression test `TestTROPModuleSplit::test_setup_trop_data_internal_contract` that round-trip-verifies the index mappings, shape consistency, and treated/control partition disjointness.

Behavior-preserving: TROP test suite (84 non-slow tests) is the safety net.

Closes Tier-A backlog item 4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
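The round-trip property that the parity test is described as checking can be sketched in a few lines. This is not `_setup_trop_data`; the function `build_index_maps` and its dictionary return shape are hypothetical, and only the four mapping names come from the commit message.

```python
def build_index_maps(units, periods):
    """Build forward and inverse index maps for units and periods.

    Hypothetical stand-in for the mapping part of a shared setup helper.
    Always returning all four maps means no caller rebuilds its own copy.
    """
    unit_to_idx = {u: i for i, u in enumerate(units)}
    period_to_idx = {p: i for i, p in enumerate(periods)}
    idx_to_unit = {i: u for u, i in unit_to_idx.items()}
    idx_to_period = {i: p for p, i in period_to_idx.items()}
    return {
        "unit_to_idx": unit_to_idx,
        "period_to_idx": period_to_idx,
        "idx_to_unit": idx_to_unit,
        "idx_to_period": idx_to_period,
    }


maps = build_index_maps(["a", "b", "c"], [2019, 2020, 2021])

# Round trip: label -> index -> label must return the original label.
units_round_trip = all(
    maps["idx_to_unit"][maps["unit_to_idx"][u]] == u for u in ["a", "b", "c"]
)
periods_round_trip = all(
    maps["idx_to_period"][maps["period_to_idx"][p]] == p
    for p in [2019, 2020, 2021]
)
```

A caller that needs only the forward maps can ignore the inverse ones, which matches the trade-off noted above of a couple of unused locals in exchange for a single construction site.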
Removes three Methodology/Correctness rows (EfficientDiD anticipation trim, generate_ddd_panel_data, TROP fit/_fit_global dedup) and the three corresponding Tier A bullets, addressed in this PR. Adds two follow-up rows for the deferred scope (TROP bootstrap-loop dedup, TripleDifference power auto-routing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…l_data

The R1 review identified that independent marginal sampling of `group` and `partition` could leave one of the four (group, partition) cells empty under valid inputs; e.g., `n_units=4, group_frac=partition_frac=0.25` rounds to `n_group_1=1` and `n_p1_g1=0`, so the (1, 1) cell collapses before `TripleDifference.fit`'s 2x2x2 cell-presence check.

Switches to stratified allocation: assign `group` to its requested fraction, then within each group stratum, draw `partition=1` at `partition_frac`. Adds a targeted `ValueError` when the rounded cell counts would leave any (group, partition) cell empty (with the four cell counts in the error message so users can pick a feasible config).

Adds two regression tests:
- `test_infeasible_cell_counts_raise` exercises both the `n_units=4` small-marginal case and an `n_units=10, group_frac=0.1` case.
- `test_smallest_feasible_config_populates_all_cells` verifies the smallest feasible config (`n_units=4, fracs=0.5`) yields one unit per cell and that `TripleDifference.fit(..., time="post")` succeeds on it (the contract the docstring advertises).

Updates the `group_frac` / `partition_frac` docstring entries to describe the stratified allocation guarantee, and the `[Unreleased]` CHANGELOG entry to mention the cell-coverage invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
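The stratified allocation and its feasibility check can be sketched as follows. This is an illustration rather than the library's implementation: the function name `stratified_cell_counts` is hypothetical, and the use of Python's built-in `round` (which rounds halves to even) is an assumption about the rounding rule.

```python
def stratified_cell_counts(n_units, group_frac, partition_frac):
    """Allocate units to the four (group, partition) cells by stratum.

    Hypothetical helper: group sizes are fixed first, then partition=1 is
    allocated within each group stratum. Raises ValueError if rounding
    leaves any cell empty, reporting all four counts.
    """
    n_group_1 = round(n_units * group_frac)
    n_group_0 = n_units - n_group_1
    counts = {}
    for group, n_in_group in ((0, n_group_0), (1, n_group_1)):
        n_partition_1 = round(n_in_group * partition_frac)
        counts[(group, 1)] = n_partition_1
        counts[(group, 0)] = n_in_group - n_partition_1
    if min(counts.values()) < 1:
        raise ValueError(
            f"Infeasible configuration: (group, partition) cell counts {counts}"
        )
    return counts


# Smallest feasible configuration: one unit in each of the four cells.
smallest = stratified_cell_counts(4, 0.5, 0.5)

# The case from the review: the group=1 stratum has one unit, so the
# (1, 1) cell rounds to zero and the check raises.
try:
    stratified_cell_counts(4, 0.25, 0.25)
    infeasible_raised = False
except ValueError:
    infeasible_raised = True
```

Checking the rounded counts up front, rather than after sampling, means the failure surfaces with the four counts in the message instead of as a later cell-presence error in the estimator.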
03fce79 to 4aa3009
|
🔁 AI review rerun (requested by @igerber) Head SHA:
Overall Assessment
One unmitigated P1 remains in the new
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
R3 review flagged that `generate_ddd_panel_data` advertises direct `TripleDifference().fit(..., time="post")` usage, but `TripleDifference` is the repeated-cross-section `panel=FALSE` estimator whose default analytical SE treats each row as iid (df = n_obs - 8). With the new panel-shaped output (unit FE + within-unit serial correlation), unclustered SE understates variance and overstates power.

Updates:
- DGP docstring gains a `.. warning::` block stating the panel + cluster requirement; the `Examples` block now demonstrates `TripleDifference(cluster="unit").fit(...)` as the recommended pattern.
- REGISTRY.md `## TripleDifference` SE section gains a "Note (panel-shaped input, `generate_ddd_panel_data`)" paragraph documenting the repeated-cross-section semantics and the cluster contract.
- CHANGELOG `[Unreleased]` entry tightened to mention the cluster requirement explicitly.
- New test `test_recommended_clustered_panel_path` locks the documented pattern: clustered fit succeeds, `n_clusters == n_units`, ATT point estimate is invariant to clustering, but SE differs between clustered and unclustered fits (within-unit correlation is non-trivial).

Point estimate semantics unchanged. Fix is documentation + invariance test only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
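Why unclustered standard errors understate variance on panel-shaped data can be seen in a much simpler analog than the DDD estimator: the standard error of a sample mean. The sketch below is an assumption-laden illustration, not `TripleDifference`; the function names, the parameter values, and the G/(G-1) small-sample factor are all choices made for the example.

```python
import math
import random


def simulate_panel(n_units=50, n_periods=10, unit_sd=1.0, noise_sd=0.1, seed=0):
    """Rows of (unit, y) where y shares a unit-level effect across periods."""
    rng = random.Random(seed)
    data = []
    for unit in range(n_units):
        unit_effect = rng.gauss(0.0, unit_sd)
        for _ in range(n_periods):
            data.append((unit, unit_effect + rng.gauss(0.0, noise_sd)))
    return data


def se_of_mean(data):
    """Return (iid_se, cluster_se) for the sample mean of y."""
    n_obs = len(data)
    y_bar = sum(y for _, y in data) / n_obs
    residuals = [(unit, y - y_bar) for unit, y in data]

    # iid formula: treats every row as an independent draw.
    iid_var = sum(e * e for _, e in residuals) / (n_obs - 1) / n_obs

    # Cluster-robust formula: sum residuals within unit before squaring,
    # so within-unit correlation is carried into the variance.
    cluster_sums = {}
    for unit, e in residuals:
        cluster_sums[unit] = cluster_sums.get(unit, 0.0) + e
    n_clusters = len(cluster_sums)
    cluster_var = (n_clusters / (n_clusters - 1)
                   * sum(s * s for s in cluster_sums.values()) / n_obs ** 2)
    return math.sqrt(iid_var), math.sqrt(cluster_var)


panel = simulate_panel()
iid_se, cluster_se = se_of_mean(panel)
```

With a dominant unit effect, the effective sample size is closer to the number of units than the number of rows, so the clustered standard error is noticeably larger than the iid one. The same mechanism is what makes unclustered power estimates optimistic on panel-shaped input.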
|
🔁 AI review rerun (requested by @igerber) Head SHA:
Overall Assessment
No unmitigated P0/P1 findings remain in the changed code. The prior rerun P1 is addressed:
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
Summary
- `efficient_did`: align REGISTRY note with `last_cohort × anticipation` trim (`efficient_did.py:470` already trims at `last_g - anticipation`; REGISTRY now describes the `anticipation>0` case; docstring cross-references added; regression test in `TestLastCohortControl`).
- `prep_dgp`: add `generate_ddd_panel_data` for panel-structured DDD power-analysis DGPs (balanced panel of `n_units × n_periods`; unit-level time-invariant `group` / `partition`; derived `post`; DDD-CPT preserved by construction). Exported from `diff_diff` and `diff_diff.prep`; autofunction stub in `docs/api/prep.rst`; 14 new tests including a deterministic recovery test (`noise_sd=0`, ATT recovery to ~1e-15); CHANGELOG entry under `[Unreleased] ### Added`. Cross-sectional `generate_ddd_data` unchanged.
- `trop`: extract shared `_setup_trop_data(...)` helper from `TROP.fit()` (local path) and `_fit_global()`: ~85 LoC of near-identical data-setup logic deduped into one helper in `trop_local.py`. The global-method staggered-adoption check stays in `_fit_global` as a post-helper validation. Helper returns all four index mappings uniformly (`unit_to_idx` / `period_to_idx` / `idx_to_unit` / `idx_to_period`); parity regression test verifies round-trip + shape consistency. Behavior-preserving (84 non-slow TROP tests green).

Methodology references (required if estimator / math changes)
- `generate_ddd_panel_data` is a new utility DGP, not an estimator; TROP refactor is data-setup extraction, not methodology.
- `last_g - anticipation` trim is the existing code's behavior (PR "EfficientDiD: cluster-robust SEs, last-cohort control, Hausman pretest, small cohort warning" #230); this PR aligns REGISTRY to that documented behavior. The DDD-CPT identifying assumption for `generate_ddd_panel_data` follows the standard triple-difference identification (see `generate_ddd_data` for the cross-sectional analog).

Validation
- `tests/test_efficient_did.py::TestLastCohortControl::test_last_cohort_with_anticipation_trims_at_last_g_minus_anticipation` (regression guard for the anticipation trim).
- `tests/test_prep.py::TestGenerateDddPanelData` (14 tests including deterministic ATT recovery, finite-sample ATT recovery, validation, balance, time-invariance).
- `tests/test_trop.py::TestTROPModuleSplit::test_setup_trop_data_internal_contract` (parity regression for the shared helper's return contract).

Security / privacy
Generated with Claude Code