Add base_period parameter to CallawaySantAnna for pre-treatment effects #97
Implement the base_period parameter matching R's did::att_gt() API to enable computation of pre-treatment ATT(g,t) values for parallel trends assessment.

Two modes are supported:
- "varying" (default): Pre-treatment uses t-1 as base (consecutive comparisons)
- "universal": All comparisons use g-anticipation-1 as base

Both modes produce identical post-treatment ATT(g,t) values. They differ only in how pre-treatment effects are computed. The overall ATT aggregation only includes post-treatment effects, matching R's behavior.

Changes:
- Add base_period parameter to CallawaySantAnna.__init__ with validation
- Modify _compute_att_gt_fast to select base period based on mode
- Update fit() to compute pre-treatment ATT(g,t) where t < g - anticipation
- Filter _aggregate_simple and bootstrap to only aggregate post-treatment effects
- Add base_period to CallawaySantAnnaResults and display in summary()
- Update methodology registry with base_period edge case documentation
- Add 11 new tests for pre-treatment effects

Validated against R's did package v2.3.0 with max numerical difference of 4.91e-05.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
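The two modes described above can be illustrated with a minimal sketch. The helper name `select_base_period` is hypothetical and is not the library's `_compute_att_gt_fast`; it also assumes periods are consecutive integers, so `t - 1` always exists.

```python
def select_base_period(g, t, anticipation=0, base_period="varying"):
    """Return the base period used for ATT(g, t).

    Hypothetical helper for illustration only. Assumes integer,
    consecutive time periods.
    """
    if base_period not in ("varying", "universal"):
        raise ValueError(
            f"base_period must be 'varying' or 'universal', got {base_period!r}"
        )

    # Reference period shared by all post-treatment comparisons.
    reference = g - anticipation - 1

    # Post-treatment effects use the same base in both modes, which is
    # why the two modes agree on post-treatment ATT(g, t).
    is_post = t >= g - anticipation
    if is_post or base_period == "universal":
        return reference

    # "varying" mode, pre-treatment: compare consecutive periods.
    return t - 1
```

For a cohort first treated at `g = 5`, the pre-treatment period `t = 2` is compared against period 1 in "varying" mode and against period 4 in "universal" mode, while `t = 6` is compared against period 4 in both.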
Overall assessment: Executive summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
- Remove fallback to non-consecutive base periods in varying mode
- Add base_period parameter to CallawaySantAnna docstring
- Add validation and warning for empty post-treatment effect sets
- Add tests for no-fallback behavior and NaN result with warning

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
- Fix bootstrap to return early with NaN when no post-treatment effects
- Fix t-stat/p-value to use NaN (not 0/1) when SE is undefined
- Add comprehensive tests for bootstrap edge case
- Document edge case in Methodology Registry

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
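The t-stat/p-value rule in this commit can be sketched as follows. The function name `safe_inference` is hypothetical, and the two-sided p-value uses a standard normal approximation, which is an assumption here rather than something the commit message states.

```python
import math


def safe_inference(effect, se):
    """Return (t_stat, p_value), or (NaN, NaN) when the SE is undefined.

    Hypothetical helper. The p-value uses a normal approximation
    (an assumption for this sketch).
    """
    # An undefined SE (NaN, infinite, or zero) gives no basis for
    # inference, so report NaN rather than a misleading t=0 / p=1.
    if not math.isfinite(se) or se <= 0:
        return math.nan, math.nan

    t_stat = effect / se
    # Two-sided p-value: P(|Z| > |t|) = erfc(|t| / sqrt(2)).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value
```

Returning NaN keeps an undefined estimate from being read as "no effect, clearly insignificant", which is what the earlier 0/1 defaults implied.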
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Open Questions
Executive Summary
Fixes:
- P1 (Critical): Group aggregation now uses t >= g - anticipation instead of t >= g for post-treatment filtering (staggered_aggregation.py, staggered_bootstrap.py)
- P2: Per-effect t_stat now returns NaN (not 0.0) when SE is non-finite or zero, consistent with overall_t_stat behavior (4 locations in staggered.py)

Tests:
- Added TestCallawaySantAnnaAnticipation with 2 tests for anticipation boundaries
- Added TestCallawaySantAnnaTStatNaN with 2 tests for NaN t_stat consistency

Documentation:
- Updated REGISTRY.md to document anticipation handling in group aggregation
- Updated REGISTRY.md to document per-effect t_stat NaN behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
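A minimal sketch of the P1 filter, shown on a simple overall average rather than the library's group aggregation. The function name, the dictionary layout, and the group-size weighting are all assumptions for illustration. It also includes the warn-and-return-NaN behavior for an empty post-treatment set introduced in an earlier commit.

```python
import math
import warnings


def aggregate_overall_att(att_gt, group_sizes, anticipation=0):
    """Average post-treatment ATT(g, t) values, weighted by group size.

    Hypothetical helper. `att_gt` maps (g, t) -> effect and
    `group_sizes` maps g -> number of units (both assumed layouts).
    """
    # Post-treatment starts at g - anticipation, not at g.
    post = [(g, t) for (g, t) in att_gt if t >= g - anticipation]

    if not post:
        warnings.warn(
            "No post-treatment effects available; overall ATT is NaN.",
            UserWarning,
        )
        return math.nan

    total_weight = sum(group_sizes[g] for g, _ in post)
    weighted_sum = sum(att_gt[(g, t)] * group_sizes[g] for g, t in post)
    return weighted_sum / total_weight
```

With `anticipation=1`, the effect at `t = g - 1` counts as post-treatment; filtering on `t >= g` would silently drop it from the aggregate.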
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment:
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
…reatment

- Fix aggregated t_stat in _aggregate_event_study and _aggregate_by_group to use NaN when SE is non-finite or zero (was defaulting to 0.0)
- Refactor bootstrap to continue computing per-effect SEs when no post-treatment effects exist (only overall ATT stats are NaN)
- Add tests for aggregated t_stat NaN behavior
- Add test verifying bootstrap runs for pre-treatment effects
- Update methodology REGISTRY.md documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
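The refactored bootstrap behavior can be sketched roughly as below. This is not the library's bootstrap: the function name is hypothetical, the SE is taken as the sample standard deviation of the draws, and the overall ATT is an unweighted mean of post-treatment effects, all of which are simplifying assumptions.

```python
import math
import statistics


def summarize_bootstrap(draws, anticipation=0):
    """Return (per_effect_se, overall_se) from bootstrap draws.

    Hypothetical helper. `draws` maps (g, t) -> list of bootstrap
    draws, with the same number of draws (at least 2) for every key.
    """
    # Per-effect SEs are always computed, including pre-treatment cells.
    per_effect_se = {key: statistics.stdev(vals) for key, vals in draws.items()}

    post_keys = [(g, t) for (g, t) in draws if t >= g - anticipation]
    if not post_keys:
        # Only the overall ATT statistic is undefined.
        return per_effect_se, math.nan

    n_draws = len(draws[post_keys[0]])
    overall_draws = [
        sum(draws[key][b] for key in post_keys) / len(post_keys)
        for b in range(n_draws)
    ]
    return per_effect_se, statistics.stdev(overall_draws)
```

The point of the design is that a missing overall ATT no longer short-circuits inference for the pre-treatment effects used in parallel trends checks.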
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Fix critical bug where control_group="not_yet_treated" incorrectly included treated cohort g in controls for pre-treatment periods (t < g).

When computing ATT(g,t), cohort g should never be in the control group, regardless of whether t < g or t >= g. The fix adds explicit exclusion of cohort g from the not-yet-treated control mask.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
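A minimal sketch of the corrected mask. The function name, the coding of never-treated units as `0`, and the exact not-yet-treated condition `first_treat > t + anticipation` are assumptions for illustration, not the library's code.

```python
def not_yet_treated_mask(first_treat, g, t, anticipation=0, never_treated=0):
    """Return a boolean list marking valid control units for ATT(g, t).

    Hypothetical helper. `first_treat` holds each unit's first
    treatment period; `never_treated` is the assumed sentinel value.
    """
    mask = []
    for ft in first_treat:
        is_never_treated = ft == never_treated
        is_not_yet_treated = ft > t + anticipation
        # Explicitly exclude cohort g. For t < g, cohort g satisfies
        # ft > t and would otherwise be counted as its own control.
        is_target_cohort = ft == g
        mask.append((is_never_treated or is_not_yet_treated) and not is_target_cohort)
    return mask
```

For units with `first_treat = [0, 3, 5, 7]`, cohort `g = 5`, and pre-treatment period `t = 2`, the unit with `first_treat == 5` passes the not-yet-treated test but is removed by the explicit exclusion.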
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: ✅ Looks good
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Add notes explaining that the canonical ATT(g,t) equation (using g-1 as base) applies to post-treatment effects and to base_period="universal". With base_period="varying" (default), pre-treatment effects use t-1 as the base, giving consecutive comparisons that are useful in parallel trends diagnostics.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
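For reference, the distinction in these notes can be written out as below. This sketch assumes the unconditional form with a generic comparison group $C$ and zero anticipation; the notation is illustrative and not copied from the registry. With anticipation, the PR description gives the base as g - anticipation - 1 instead of g - 1.

```latex
% Post-treatment effects, and all periods under base_period="universal":
\begin{align}
ATT(g,t) &= \mathbb{E}\left[ Y_t - Y_{g-1} \mid G_g = 1 \right]
          - \mathbb{E}\left[ Y_t - Y_{g-1} \mid C = 1 \right]
\end{align}

% Pre-treatment periods (t < g) under base_period="varying":
\begin{align}
ATT(g,t) &= \mathbb{E}\left[ Y_t - Y_{t-1} \mid G_g = 1 \right]
          - \mathbb{E}\left[ Y_t - Y_{t-1} \mid C = 1 \right]
\end{align}
```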
- Create /pre-merge-check skill for automated checks before submitting PRs
  - Pattern checks for NaN handling issues
  - Test coverage verification
  - Context-specific checklists based on changed files
- Update PR review prompt with edge case checklist from PR #97 analysis
  - Empty result set handling
  - NaN/Inf propagation checks
  - Parameter interaction verification
  - Control group logic validation
  - Pattern consistency checks
- Add Task Implementation Workflow section to CLAUDE.md
  - 4-phase workflow: Planning, Implementation, Pre-Merge Review, Submit
  - Quick reference commands for common pattern checks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…rence

Replace `else 0.0` with `else np.nan` when SE is non-finite or zero in t-stat calculations across sun_abraham.py, triple_diff.py, and diagnostics.py. Add CI guards returning (NaN, NaN) for 4 downstream confidence interval computations. Matches the CallawaySantAnna pattern established in PR #97.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
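The CI guard described here might look like the following sketch. The function name `safe_conf_int` and the normal-based critical value are assumptions; the actual computations in sun_abraham.py, triple_diff.py, and diagnostics.py may use a different distribution or signature.

```python
import math


def safe_conf_int(effect, se, critical_value=1.959963984540054):
    """Return (lower, upper), or (NaN, NaN) when the SE is undefined.

    Hypothetical helper. The default critical value is the 97.5%
    standard normal quantile (an assumption for this sketch).
    """
    # Guard first: a non-finite or zero SE would otherwise yield a
    # degenerate or misleading interval.
    if not math.isfinite(se) or se <= 0:
        return math.nan, math.nan

    margin = critical_value * se
    return effect - margin, effect + margin
```

Pairing this guard with the NaN t-stat keeps every inference output for an effect consistent: either all are defined or all are NaN.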
Summary
- Add `base_period` parameter to `CallawaySantAnna` matching R's `did::att_gt()` API

Methodology references (required if estimator / math changes)
- `did::att_gt()` `base_period` parameter exactly

Validation
- `tests/test_staggered.py` - added `TestCallawaySantAnnaPreTreatment` class with 11 tests
- Validated against R's `did` package v2.3.0 with max numerical difference of 4.91e-05 across all ATT(g,t), event study effects, and overall ATT

Security / privacy

Generated with Claude Code