Commit bf13fe5

igerber and claude committed
Update METHODOLOGY_REVIEW.md: HonestDiD review complete
Document all 6 corrections (DeltaRM first-diffs, LP equality constraints, DeltaSD boundary, optimal FLCI, REGISTRY equations, performance). Note outstanding ARP calibration work and R benchmark comparison.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 8a9bea6 commit bf13fe5

1 file changed

Lines changed: 69 additions & 5 deletions

METHODOLOGY_REVIEW.md

@@ -30,7 +30,7 @@ Each estimator in diff-diff should be periodically reviewed to ensure:
 | StackedDiD | `stacked_did.py` | `stacked-did-weights` | **Complete** | 2026-02-19 |
 | TROP | `trop.py` | (forthcoming) | Not Started | - |
 | BaconDecomposition | `bacon.py` | `bacondecomp::bacon()` | Not Started | - |
-| HonestDiD | `honest_did.py` | `HonestDiD` package | Not Started | - |
+| HonestDiD | `honest_did.py` | `HonestDiD` package | **Complete** | 2026-03-31 |
 | PreTrendsPower | `pretrends.py` | `pretrends` package | Not Started | - |
 | PowerAnalysis | `power.py` | `pwr` / `DeclareDesign` | Not Started | - |
 
@@ -618,14 +618,78 @@ variables appear to the left of the `|` separator.
 | Module | `honest_did.py` |
 | Primary Reference | Rambachan & Roth (2023) |
 | R Reference | `HonestDiD` package |
-| Status | Not Started |
-| Last Review | - |
+| Status | **Complete** |
+| Last Review | 2026-03-31 |
+
+**Verified Components:**
+- [x] Delta^SD: second-difference constraints [1,-2,1] with delta_0=0 boundary handling
+- [x] Delta^SD: T+Tbar-1 constraint rows (bridge constraint at t=0)
+- [x] Delta^RM: constrains first differences (not levels), union of polyhedra per Lemma 2.2
+- [x] Identified set LP: pins delta_pre = beta_pre via equality constraints (Equations 5-6)
+- [x] M=0 for Delta^SD: linear extrapolation gives finite point-identified bounds
+- [x] Mbar=0 for Delta^RM: point identification (all post first-diffs = 0)
+- [x] Optimal FLCI for Delta^SD: folded normal cv_alpha, Nelder-Mead over pre-period weights
+- [x] Sensitivity grid: bounds computed for each M in grid, breakdown value via binary search
+- [x] Survey variance: t-distribution critical values from df_survey
+- [x] CallawaySantAnna integration: universal base period, reference period filtering
+- [x] Three-period analytical case matches paper Section 2.3
+- [ ] ARP hybrid for Delta^RM: infrastructure implemented, moment inequality transformation needs calibration
+- [ ] R comparison: pending (benchmark scripts need updating)
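
The three-period case and the binary-search breakdown value listed above can be illustrated together. The following is a minimal sketch, not the code in `honest_did.py`: the function names are hypothetical, it assumes periods t = -1, 0, 1 with delta_0 = 0 and delta_{-1} pinned to the observed pre-period coefficient, and it searches on the identified set only, ignoring sampling uncertainty. Under those assumptions the single second-difference constraint is |delta_{-1} + delta_1| <= M, so the bounds are a linear extrapolation plus or minus M.

```python
def sd_bounds_three_period(beta_pre, beta_post, M):
    """Identified set for the period-1 effect under Delta^SD(M), three periods.

    Assumes t = -1, 0, 1 with delta_0 = 0 and delta_{-1} = beta_pre, so
    delta_1 lies in [-beta_pre - M, -beta_pre + M] and the effect is
    beta_post - delta_1.
    """
    if M < 0:
        raise ValueError("M must be non-negative")
    center = beta_post + beta_pre  # beta_1 minus the extrapolated trend -beta_{-1}
    return center - M, center + M


def breakdown_value(beta_pre, beta_post, m_max=10.0, tol=1e-8):
    """Smallest M whose identified set contains zero, found by bisection."""
    def contains_zero(M):
        lb, ub = sd_bounds_three_period(beta_pre, beta_post, M)
        return lb <= 0.0 <= ub

    if contains_zero(0.0):
        return 0.0
    if not contains_zero(m_max):
        return None  # no breakdown inside the searched range
    lo, hi = 0.0, m_max  # invariant: zero excluded at lo, included at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains_zero(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

At M = 0 the lower and upper bounds coincide, which is the point-identification property the checklist records. Bisection is valid here because the identified set only grows as M increases.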
+
+**Test Coverage:**
+- 63 existing tests in `tests/test_honest_did.py` (14 classes), all passing
+- 17 new methodology verification tests in `tests/test_methodology_honest_did.py`
+- R benchmark tests (pending)
 
 **Corrections Made:**
-- (None yet)
+1. **DeltaRM: first differences, not levels** (`honest_did.py`, `_construct_constraints_rm_component`):
+The paper's Delta^RM constrains `|delta_{t+1} - delta_t|` (consecutive first differences)
+bounded by Mbar × max pre-treatment first difference. The code constrained `|delta_post|`
+(absolute levels) bounded by Mbar × max `|beta_pre|`. Completely rewritten using
+union-of-polyhedra decomposition per Lemma 2.2.
+
+2. **LP pins delta_pre = beta_pre** (`honest_did.py`, `_solve_bounds_lp`):
+The paper's identified set LP (Equations 5-6) fixes pre-treatment violations to the observed
+pre-treatment coefficients. The code had no equality constraint, so delta_pre was unconstrained.
+For Delta^SD(M=0), this made the LP unbounded. Added A_eq/b_eq equality constraints.
+
+3. **DeltaSD constraint matrix: delta_0=0 boundary** (`honest_did.py`, `_construct_A_sd`):
+The code built second-difference matrices treating [delta_{-T},...,delta_{-1},delta_1,...,delta_{Tbar}]
+as consecutive, missing delta_0=0 at the boundary. Three boundary rows were wrong:
+- t=-1: `d_{-2} - 2*d_{-1} + 0` (uses delta_0=0)
+- t=0: `d_{-1} + d_1` (bridge constraint, was missing)
+- t=1: `0 - 2*d_1 + d_2` (uses delta_0=0)
+Now produces T+Tbar-1 rows (was T+Tbar-2).
+
+4. **Optimal FLCI for Delta^SD** (`honest_did.py`, `_compute_optimal_flci`):
+Replaced naive FLCI (`lb - z*se, ub + z*se`) with the paper's optimal FLCI (Section 4.1):
+jointly optimizes affine estimator direction v and half-length chi using folded normal
+critical values cv_alpha(bias/se). Significantly narrower CIs.
+
+5. **REGISTRY.md equations** (`docs/methodology/REGISTRY.md`):
+DeltaSD equation was first differences (should be second differences). DeltaRM equation
+was absolute levels (should be first differences). Both corrected with full formulations.
+
+6. **Performance** (`honest_did.py`):
+Sensitivity grid reduced from ~9 minutes to 0.1 seconds via: Newton's method for cv_alpha
+(5 iterations vs 100), centrosymmetric bias LP (1 solve vs 2), M=0 short-circuit,
+looser Nelder-Mead tolerances.
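
To make the boundary handling in correction 3 concrete, here is a minimal sketch of a second-difference matrix that treats delta_0 = 0 as a known reference point. It is an illustration under stated assumptions rather than the repository's `_construct_A_sd`: the function name and the pure-Python list representation are hypothetical, and columns are assumed to be ordered [delta_{-T}, ..., delta_{-1}, delta_1, ..., delta_{Tbar}].

```python
def second_difference_matrix(num_pre, num_post):
    """Rows of delta_{t-1} - 2*delta_t + delta_{t+1}, with delta_0 = 0 dropped.

    Columns are the non-reference periods -num_pre..-1, 1..num_post.
    One row per interior period, giving num_pre + num_post - 1 rows.
    """
    periods = list(range(-num_pre, num_post + 1))  # includes reference period 0
    kept = [t for t in periods if t != 0]
    col = {t: j for j, t in enumerate(kept)}

    rows = []
    for t in periods[1:-1]:  # interior periods only
        row = [0.0] * len(kept)
        for s, weight in ((t - 1, 1.0), (t, -2.0), (t + 1, 1.0)):
            if s != 0:  # the delta_0 term is zero, so it has no column
                row[col[s]] += weight
        rows.append(row)
    return rows


# Two pre-periods and two post-periods: rows for t = -1, 0, 1.
for row in second_difference_matrix(2, 2):
    print(row)
```

With two pre-periods and two post-periods the three rows are `[1, -2, 0, 0]`, `[0, 1, 1, 0]`, and `[0, 0, -2, 1]`, matching the t=-1, t=0, and t=1 rows listed in the correction. Iterating over the full period grid, including 0, is what yields the bridge row that a naive stencil over the stacked vector skips.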
 
 **Outstanding Concerns:**
-- (None yet)
+- ARP hybrid confidence sets for Delta^RM: infrastructure implemented (`_arp_confidence_set`,
+`_enumerate_vertices`, `_compute_arp_test`) but disabled pending calibration of the moment
+inequality transformation. Currently uses conservative naive FLCI for RM CIs.
+- R benchmark comparison not yet run (Python benchmark needs API update)
+- Combined method uses single M for both SD and RM (DeltaSDRM dataclass has separate M/Mbar)
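
As background for the Delta^RM items above, the first-difference restriction described in correction 1 can be written as a simple membership check. This is a hedged sketch, not the union-of-polyhedra LP construction in `honest_did.py`: the function name is hypothetical, delta_0 = 0 is assumed as the reference period, and a small numerical tolerance is added for floating-point comparisons.

```python
def in_delta_rm(delta_pre, delta_post, mbar, tol=1e-12):
    """Check whether a trend path lies in Delta^RM(Mbar).

    delta_pre  : [delta_{-T}, ..., delta_{-1}], at least one entry
    delta_post : [delta_1, ..., delta_{Tbar}]
    Every post-treatment first difference must be at most Mbar times the
    largest pre-treatment first difference in absolute value.
    """
    if not delta_pre:
        raise ValueError("need at least one pre-treatment period")
    path = list(delta_pre) + [0.0] + list(delta_post)  # insert delta_0 = 0
    diffs = [b - a for a, b in zip(path, path[1:])]
    n_pre = len(delta_pre)
    pre_diffs, post_diffs = diffs[:n_pre], diffs[n_pre:]
    max_pre = max(abs(d) for d in pre_diffs)
    return all(abs(d) <= mbar * max_pre + tol for d in post_diffs)


# A path that drifts by 0.1 per period satisfies the first-difference bound
# with Mbar = 1, even though its level at t = 3 (0.3) exceeds max |delta_pre| (0.2).
print(in_delta_rm([0.2, 0.1], [0.1, 0.2, 0.3], mbar=1.0))
```

The example path is one that a levels-based bound of the kind the old code used would reject, which is why the two formulations are not interchangeable. With `mbar=0.0` the check passes only when every post-treatment first difference is zero, consistent with the point-identification item in the checklist.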
+
+**Deviations from R's HonestDiD:**
+1. **Delta^RM CI**: R uses full ARP conditional/hybrid confidence sets. Python uses naive FLCI
+(conservative: wider CIs, valid coverage). ARP implementation exists but needs calibration.
+2. **Optimal FLCI**: R uses the same approach (Armstrong & Kolesar 2018). Python implementation
+matches the methodology but uses Nelder-Mead optimization vs R's custom solver. Numerical
+differences expected at tolerance level.
+3. **Base period handling**: Python warns (doesn't error) when CallawaySantAnna results use
+`base_period != "universal"`. R's HonestDiD requires universal base period.
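
The folded normal critical value cv_alpha used by the optimal FLCI is the (1 - alpha) quantile of |Z + t| for Z ~ N(0, 1), where t is the ratio bias/se. The sketch below shows one way to compute it with Newton's method using only the standard library. It is an assumption-laden illustration, not the routine in `honest_did.py`: the function name, starting point, iteration cap, and tolerance are all hypothetical choices.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def folded_normal_cv(t, alpha=0.05, max_iter=20, tol=1e-12):
    """(1 - alpha) quantile of |Z + t| with Z ~ N(0, 1), via Newton's method."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5)")
    t = abs(t)  # the distribution of |Z + t| depends only on |t|
    target = 1.0 - alpha
    # t + z_{1-alpha} is a lower bound on the quantile; the CDF is increasing
    # and concave to the right of t, so Newton steps rise monotonically.
    c = t + _STD_NORMAL.inv_cdf(target)
    for _ in range(max_iter):
        f = _STD_NORMAL.cdf(c - t) - _STD_NORMAL.cdf(-c - t) - target
        fprime = _STD_NORMAL.pdf(c - t) + _STD_NORMAL.pdf(c + t)
        step = f / fprime
        c -= step
        if abs(step) < tol:
            break
    return c


# With zero bias this reduces to the usual two-sided normal critical value.
print(round(folded_normal_cv(0.0), 4))
```

A fixed-length interval would then use `se * folded_normal_cv(bias / se)` as its half-length. As the bias ratio grows, the critical value approaches t plus the one-sided normal quantile.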
 
 ---
 