Commit 7eeb42a

igerber and claude committed
Align Phase 1c parity-contract narrative across wrapper, JSON, tests
P3 follow-up from AI review. Three small inconsistencies to resolve:

1. `bias_corrected_local_linear` docstring still described tau_cl/se_cl as bit-parity and said Python consumes R's z directly. The actual contract is atol=1e-12 on all four scalars (DGP 1-3) and the wrapper computes its own z via scipy.stats.norm.ppf; R's qnorm is stored in the JSON for audit only. Docstring updated to match.

2. Committed golden JSON metadata still had the old "consume R's critical value directly" string because the generator was edited without regenerating. Regenerated so JSON metadata matches the corrected audit-export wording in the R script.

3. Parity tests for DGP 4 and DGP 5 did not assert CI bounds. Added ci_low / ci_high assertions at the same tolerance as the corresponding se_rb assertion (bit-parity for DGP 4, 1e-12 for DGP 5), so the audit surface matches what the registry states.

Behavior unchanged; tests strengthened and docs aligned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 12a8efe commit 7eeb42a

3 files changed

Lines changed: 25 additions & 7 deletions

benchmarks/data/nprobust_lprobust_golden.json

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
       "dgp5": 20260421
     },
     "generator": "benchmarks/R/generate_nprobust_lprobust_golden.R",
-    "algorithm": "nprobust::lprobust(..., bwselect='mse-dpi') at a single eval point, p=1, deriv=0, kernel='epa', vce='nn' unless noted. z = qnorm(1 - alpha/2) exported so the Python side consumes R's critical value directly."
+    "algorithm": "nprobust::lprobust(..., bwselect='mse-dpi') at a single eval point, p=1, deriv=0, kernel='epa', vce='nn' unless noted. The Python wrapper computes its own z_{1-alpha/2} via scipy.stats.norm.ppf inside safe_inference(); R's z is exported here for audit so a reviewer can verify the two critical values agree to machine precision."
   },
   "dgp1": {
     "n": 2000,

diff_diff/local_linear.py

Lines changed: 11 additions & 6 deletions
@@ -1020,12 +1020,17 @@ def bias_corrected_local_linear(
     Notes
     -----
     Parity against ``nprobust::lprobust(..., bwselect="mse-dpi")`` is tiered
-    (see ``docs/methodology/REGISTRY.md``): bit-parity on ``tau_cl``/``se_cl``
-    (same arithmetic path as Phase 1b's bit-parity-verified primitives);
-    ``atol=1e-12`` on ``tau_bc``/``se_rb`` (new outer-product step); and
-    ``atol=1e-13`` on CI bounds (R's ``z_{1-alpha/2}`` is stored in the
-    golden JSON so Python consumes it directly rather than calling
-    ``scipy.stats.norm.ppf``).
+    (see ``docs/methodology/REGISTRY.md``): ``atol=1e-12`` on ``tau_cl``,
+    ``tau_bc``, ``se_cl``, and ``se_rb`` across the three unclustered
+    golden DGPs; ``atol=1e-13`` on CI bounds. The Python wrapper computes
+    its own ``z_{1-alpha/2}`` via ``scipy.stats.norm.ppf`` inside
+    ``safe_inference()``; R's ``qnorm`` value is stored in the golden JSON
+    for audit, and the parity harness compares Python's CI bounds to R's
+    pre-computed CI bounds, so any residual drift is purely the
+    floating-point arithmetic in ``tau.bc +/- z * se.rb``, not a
+    critical-value disagreement. Clustered DGP 4 achieves bit-parity
+    (``atol=1e-14``) when cluster IDs happen to be in first-appearance
+    order; otherwise BLAS reduction ordering can drift to ``atol=1e-10``.
     """
     if weights is not None:
         raise NotImplementedError(
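
The contract described in the updated docstring (compute the critical value locally, keep R's value only for audit, form the CI as `tau.bc +/- z * se.rb`) can be sketched roughly as follows. This is an illustration, not the wrapper's code: `critical_value`, `ci_bounds`, and the `audit` dictionary with its keys are hypothetical names, the `tau_bc` / `se_rb` numbers are made up, and the standard library's `statistics.NormalDist().inv_cdf` stands in for `scipy.stats.norm.ppf` so the sketch runs without third-party packages.

```python
from statistics import NormalDist


def critical_value(alpha: float) -> float:
    """Two-sided standard-normal critical value z_{1-alpha/2}.

    Stand-in for scipy.stats.norm.ppf(1 - alpha / 2); swapped for the
    stdlib equivalent only to keep this sketch dependency-free.
    """
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def ci_bounds(tau_bc: float, se_rb: float, alpha: float = 0.05) -> tuple[float, float]:
    """Robust CI as tau_bc -/+ z * se_rb, with z computed locally."""
    z = critical_value(alpha)
    return tau_bc - z * se_rb, tau_bc + z * se_rb


# Hypothetical audit record: "z" plays the role of R's exported qnorm value.
# The tau_bc and se_rb values are illustrative, not taken from the golden file.
audit = {"alpha": 0.05, "z": 1.959963984540054, "tau_bc": 0.5, "se_rb": 0.1}

z_local = critical_value(audit["alpha"])
# Audit step: the stored z is compared against the local one, never consumed.
z_gap = abs(z_local - audit["z"])

ci_low, ci_high = ci_bounds(audit["tau_bc"], audit["se_rb"], audit["alpha"])
print(f"z gap = {z_gap:.3e}, CI = [{ci_low:.6f}, {ci_high:.6f}]")
```

Keeping the stored critical value out of the computation path means a disagreement between the two quantile implementations would surface as a failed audit comparison rather than being silently absorbed into the CI bounds.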

tests/test_bias_corrected_lprobust.py

Lines changed: 13 additions & 0 deletions
@@ -192,6 +192,13 @@ def test_clustered_parity_dgp_4(self, golden):
                                    atol=1e-14, rtol=1e-14)
         np.testing.assert_allclose(fit.se_robust, g["se_rb"],
                                    atol=1e-14, rtol=1e-14)
+        # CI bounds at bit-parity too (Python's scipy ppf and R's qnorm
+        # agree to ULP; the remaining drift is pure tau.bc +/- z * se.rb
+        # floating-point arithmetic).
+        np.testing.assert_allclose(fit.ci_low, g["ci_low"],
+                                   atol=1e-14, rtol=1e-14)
+        np.testing.assert_allclose(fit.ci_high, g["ci_high"],
+                                   atol=1e-14, rtol=1e-14)
 
     def test_shifted_boundary_parity_dgp_5(self, golden):
         """Design 1 continuous-near-d_lower: boundary = d.min() > 0."""
@@ -210,6 +217,12 @@ def test_shifted_boundary_parity_dgp_5(self, golden):
                                    atol=1e-12, rtol=1e-12)
         np.testing.assert_allclose(fit.se_robust, g["se_rb"],
                                    atol=1e-12, rtol=1e-12)
+        # CI bounds at the same tolerance (Python scipy ppf vs R qnorm
+        # agree to ULP; tau.bc +/- z * se.rb inherits se_rb's drift).
+        np.testing.assert_allclose(fit.ci_low, g["ci_low"],
+                                   atol=1e-12, rtol=1e-12)
+        np.testing.assert_allclose(fit.ci_high, g["ci_high"],
+                                   atol=1e-12, rtol=1e-12)
 
 
 # =============================================================================
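
For readers checking what the paired `atol` / `rtol` arguments in these assertions permit: `np.testing.assert_allclose` accepts a value when `abs(actual - desired) <= atol + rtol * abs(desired)`, so the two tolerances add rather than apply separately. The pure-Python sketch below mirrors that criterion for scalars; `within_tolerance` is a hypothetical helper and the numbers are made up for illustration.

```python
def within_tolerance(actual: float, desired: float, atol: float, rtol: float) -> bool:
    """Scalar version of the criterion used by np.testing.assert_allclose:
    |actual - desired| <= atol + rtol * |desired|.
    """
    return abs(actual - desired) <= atol + rtol * abs(desired)


drift = 5e-13  # illustrative discrepancy, not a measured value

# Large reference value: the rtol term contributes 1e-14 * 100 = 1e-12,
# so a 5e-13 discrepancy is accepted even though atol alone is 1e-14.
large_ok = within_tolerance(100.0 + drift, 100.0, atol=1e-14, rtol=1e-14)

# Same discrepancy with rtol switched off: only atol remains, so it is rejected.
large_atol_only = within_tolerance(100.0 + drift, 100.0, atol=1e-14, rtol=0.0)

# Small reference value: the budget is about 1.01e-14, so 5e-13 is rejected.
small_ok = within_tolerance(0.01 + drift, 0.01, atol=1e-14, rtol=1e-14)

print(large_ok, large_atol_only, small_ok)
```

The effective budget therefore scales with the magnitude of the golden value, which is worth keeping in mind when reading a tier label such as `atol=1e-14` next to a matching `rtol`.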
