Commit 3ab7a86
Address PR #409 R7 review (P2 D1) — bounded p-value drift bands
Two bootstrap p-value drift tests had lower-bound-only assertions:
- `test_overall_stute_fails_to_reject`: was `p > 0.50`, tutorial quotes
~0.686 → would silently pass if p drifted to 0.99
- `test_event_study_homogeneity_fails_to_reject`: was `p > 0.50`,
tutorial quotes ~0.763 → same silent-stale risk
The third bootstrap test (`test_event_study_pretrends_fails_to_reject`)
already used a bounded band `0.0 <= p <= 0.25`. Mirror that pattern on
the other two with bounded bands per
`feedback_bootstrap_drift_tests_need_backend_tolerance` (>= 0.15
width):
- Stute: 0.53 <= p <= 0.84 (band ~0.31 around 0.686)
- Homogeneity: 0.61 <= p <= 0.92 (band ~0.31 around 0.763)
Both bands wide enough for Rust ↔ pure-Python RNG path differences;
both narrow enough that drift in either direction (toward rejection
or toward an even cleaner pass) flags the prose as stale.
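
A minimal sketch of the bounded-band pattern described above, since the changed test bodies are not shown here. The helper names `in_band` and `check_drift` and the band constants are hypothetical and are not taken from the test file; only the numeric bounds and tutorial values come from this commit message.

```python
# Hypothetical constants: bounds copied from the commit message.
STUTE_BAND = (0.53, 0.84)        # tutorial quotes ~0.686
HOMOGENEITY_BAND = (0.61, 0.92)  # tutorial quotes ~0.763
PRETRENDS_BAND = (0.0, 0.25)     # band the third test already used
MIN_WIDTH = 0.15                 # minimum band width cited in the message


def in_band(p, band):
    """Return True when p lies inside the inclusive band (lo, hi)."""
    lo, hi = band
    return lo <= p <= hi


def check_drift(name, p, band):
    """Raise AssertionError when p leaves the band in either direction."""
    lo, hi = band
    if not in_band(p, band):
        raise AssertionError(
            f"{name}: p={p:.3f} is outside [{lo}, {hi}]; tutorial prose may be stale"
        )


# The tutorial-quoted values sit inside their bands.
check_drift("stute", 0.686, STUTE_BAND)
check_drift("homogeneity", 0.763, HOMOGENEITY_BAND)

# A drift to 0.99 satisfied the old lower-bound-only check (p > 0.50)
# but falls outside both new bands.
drifted = 0.99
old_check_passes = drifted > 0.50
new_check_passes = in_band(drifted, STUTE_BAND)
```

With these definitions, `old_check_passes` is `True` and `new_check_passes` is `False`, which is the silent-stale case the bounded bands are meant to catch.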
All 16 drift tests pass on both backends within the new bands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent f9f951f commit 3ab7a86
1 file changed
Lines changed: 14 additions & 9 deletions
@@ -204,14 +204,14 @@ (7 additions, 7 deletions; diff content not captured)
@@ -292,11 +292,16 @@ (7 additions, 2 deletions; diff content not captured)