
Commit 4e829a2

igerber and claude committed
Address PR #447 R1 review (2 P2, 1 self-audit)
P2 — SE-consistency rewrite misstated the public API

The rewrite listed cr1, cr2, bootstrap as vcov_type values. The actual validated set in linalg.py::_VALID_VCOV_TYPES is {"classical","hc1","hc2","hc2_bm","conley"}; cluster-robust variance is obtained via cluster= alongside the heteroscedasticity kind (hc1+cluster ⇒ CR1, hc2_bm+cluster ⇒ CR2 Bell-McCaffrey), and wild cluster bootstrap is a separate inference="wild_bootstrap" path on the same estimator. Rewrote the SE Consistency paragraph to match the actual API.

P2 — Large Module Files table omitted 8 modules already ≥1000 lines

The refreshed inventory in section 24-56 missed: _nprobust_port.py (1412), practitioner.py (1402), trop_global.py (1350), trop_local.py (1339), local_linear.py (1332), wooldridge.py (1305), chaisemartin_dhaultfoeuille_bootstrap.py (1175), stacked_did.py (1050). Mechanically regenerated from wc -l diff_diff/*.py >= 1000; all 35 current ≥1000-line modules now listed (verified via comm).

Self-audit fix

linalg.py Action cell read "Consider splitting (vcov surfaces) — unified backend, splitting would hurt cohesion" — self-contradictory. Reworded to "Consider splitting only if cohesion can be preserved" so the threshold rule and the cohesion constraint can both be honored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
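The mechanical regeneration mentioned in the commit message (`wc -l diff_diff/*.py >= 1000`) could be approximated in Python roughly as follows. This is a minimal sketch, not the script used for the commit: the function names `large_modules` and `to_table_rows` are hypothetical, and the Action column is left as a placeholder because the commit assigns those labels by hand.

```python
from pathlib import Path


def large_modules(package_dir, threshold=1000):
    """Return (filename, line_count) pairs for top-level *.py files at or above threshold.

    Line counts follow `wc -l` semantics: the number of newline characters.
    """
    rows = []
    for path in Path(package_dir).glob("*.py"):
        n_lines = path.read_bytes().count(b"\n")
        if n_lines >= threshold:
            rows.append((path.name, n_lines))
    # Largest modules first; ties broken by filename for a stable order.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


def to_table_rows(rows, action="TODO"):
    """Render the pairs as Markdown table rows with a placeholder Action cell."""
    return [f"| `{name}` | {count} | {action} |" for name, count in rows]


if __name__ == "__main__":
    # "diff_diff" is the package directory named in the commit message.
    for line in to_table_rows(large_modules("diff_diff")):
        print(line)
```

Comparing the printed filenames against the names already in the table (the commit mentions `comm` for this) would surface any module that crossed the threshold since the last refresh.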
1 parent fe3ab98 commit 4e829a2

1 file changed

Lines changed: 10 additions & 2 deletions


TODO.md

@@ -31,7 +31,7 @@ Target: ideally < 1000 lines per module; modules ≥3000 lines are candidates fo
 | `had_pretests.py` | 4951 | Consider splitting (Stute / Yatchew / QUG / joint pretests) |
 | `had.py` | 4593 | Consider splitting (continuous / mass-point / event-study / survey paths) |
 | `staggered.py` | 3963 | Consider splitting — grew through survey + aggregation features |
-| `linalg.py` | 3601 | Consider splitting (vcov surfaces) — unified backend, splitting would hurt cohesion |
+| `linalg.py` | 3601 | Consider splitting (vcov surfaces) only if cohesion can be preserved — unified backend; vcov / solver paths are tightly coupled |
 | `diagnostic_report.py` | 3380 | Consider splitting (per-method renderers + provenance) |
 | `power.py` | 3196 | Consider splitting (power analysis + MDE + sample size) |
 | `synthetic_did.py` | 2819 | Monitor — variance methods + survey paths |
@@ -51,8 +51,16 @@ Target: ideally < 1000 lines per module; modules ≥3000 lines are candidates fo
 | `continuous_did.py` | 1682 | Acceptable |
 | `results.py` | 1676 | Acceptable |
 | `staggered_triple_diff.py` | 1619 | Acceptable |
+| `_nprobust_port.py` | 1412 | Acceptable |
+| `practitioner.py` | 1402 | Acceptable |
+| `trop_global.py` | 1350 | Acceptable |
+| `trop_local.py` | 1339 | Acceptable |
+| `local_linear.py` | 1332 | Acceptable |
+| `wooldridge.py` | 1305 | Acceptable |
+| `chaisemartin_dhaultfoeuille_bootstrap.py` | 1175 | Acceptable |
 | `bacon.py` | 1144 | Acceptable |
 | `pretrends.py` | 1133 | Acceptable |
+| `stacked_did.py` | 1050 | Acceptable |
 | `conley.py` | 1006 | Acceptable |
 | `visualization/` | 4316 | Subpackage (split across 7 files) — OK |

@@ -211,7 +219,7 @@ Ordered paydown view across the tables above. Tier A → D is by effort × risk,

 ### Standard Error Consistency

-`vcov_type` has subsumed the previously-proposed `se_type` knob`DifferenceInDifferences` and `TwoWayFixedEffects` expose the full surface (`hc1`, `hc2`, `hc2_bm`, `cr1`, `cr2`, `conley`, `bootstrap`). Threading `vcov_type` through the 8 standalone estimators (`CallawaySantAnna`, `SunAbraham`, `ImputationDiD`, `TwoStageDiD`, `TripleDifference`, `StackedDiD`, `WooldridgeDiD`, `EfficientDiD`) remains open and is tracked as a single methodology row in the table above (Phase 1a row).
+`vcov_type` has subsumed the previously-proposed `se_type` knob. `DifferenceInDifferences` and `TwoWayFixedEffects` accept `vcov_type ∈ {"classical", "hc1", "hc2", "hc2_bm", "conley"}` (the validated set in `linalg.py::_VALID_VCOV_TYPES`); cluster-robust variance is obtained by passing `cluster=` alongside the heteroscedasticity kind (`hc1 + cluster` ⇒ CR1 Liang-Zeger; `hc2_bm + cluster` ⇒ CR2 Bell-McCaffrey, gated by the open weighted-CR2 / absorbed-FE rows in the table above); wild cluster bootstrap is a separate `inference="wild_bootstrap"` path on the same estimator. Threading `vcov_type` through the 8 standalone estimators (`CallawaySantAnna`, `SunAbraham`, `ImputationDiD`, `TwoStageDiD`, `TripleDifference`, `StackedDiD`, `WooldridgeDiD`, `EfficientDiD`) remains open and is tracked as a single methodology row in the table above (Phase 1a row).

 ### Type Annotations

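The corrected Standard Error Consistency paragraph describes how `vcov_type`, `cluster=`, and `inference="wild_bootstrap"` combine. The standalone sketch below encodes only that description as a lookup; it does not call or reproduce the library. The helper `describe_variance`, the constant names, and the returned label strings are hypothetical, and combinations the paragraph does not mention (for example `classical` with a cluster) are deliberately reported as unspecified rather than guessed.

```python
# Validated set as quoted in the commit message for linalg.py::_VALID_VCOV_TYPES.
VALID_VCOV_TYPES = frozenset({"classical", "hc1", "hc2", "hc2_bm", "conley"})

# Only the two clustered combinations the paragraph names explicitly.
CLUSTERED_LABELS = {
    "hc1": "CR1 (Liang-Zeger)",
    "hc2_bm": "CR2 (Bell-McCaffrey)",
}


def describe_variance(vcov_type, cluster=None, inference=None):
    """Label the variance estimator implied by the arguments (illustrative only)."""
    if vcov_type not in VALID_VCOV_TYPES:
        # "cr1", "cr2" and "bootstrap" land here: they are not vcov_type values.
        raise ValueError(
            f"unknown vcov_type {vcov_type!r}; expected one of {sorted(VALID_VCOV_TYPES)}"
        )
    if inference == "wild_bootstrap":
        # Described as a separate inference path on the same estimator.
        return "wild cluster bootstrap"
    if cluster is None:
        return vcov_type
    return CLUSTERED_LABELS.get(
        vcov_type, f"{vcov_type} with cluster (not specified in the paragraph)"
    )


if __name__ == "__main__":
    print(describe_variance("hc1", cluster="state"))
    print(describe_variance("hc2_bm", cluster="state"))
```

Keeping the clustered labels in a separate table from the validated set mirrors the point of the review fix: CR1 and CR2 are outcomes of a combination of arguments, not values a caller passes directly.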