Add TripleDifferenceResults exemplar + lock alias asdict-exclusion (re-audit of #406) by igerber · Pull Request #428 · igerber/diff-diff

igerber · 2026-05-14T00:58:55Z

Summary

Audit follow-up to PR #406. Restored CI reviewer's two actionable P3s:

REGISTRY + CHANGELOG: The flat-schema exemplar lists omit `TripleDifferenceResults`, even though `diff_diff/triple_diff.py:L43-L86` already uses the same native flat schema (`att` / `se` / `t_stat` / `p_value` / `conf_int`). Add it to both lists.
Alias regression suite: The existing `tests/test_result_aliases.py` locks read-through and read-only semantics but doesn't assert that aliases stay out of `dataclasses.fields()` / `asdict()` output, a contract the registry explicitly documents. If a future refactor converted an `@property` alias to a real field, serializers and field-walkers would silently start surfacing duplicate keys with no test catching it. Add a parametrized regression covering one Pattern B class (`CallawaySantAnnaResults`), the double-alias case (`ContinuousDiDResults`), and the `avg_*` mapping (`MultiPeriodDiDResults`). It asserts that the five flat-alias names never appear in `fields(res)` or `asdict(res).keys()`.
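For readers unfamiliar with the contract, here is a minimal sketch of why an `@property` alias stays out of `dataclasses.fields()` and `asdict()`. `ToyStaggeredResults` and its field names are hypothetical stand-ins, not the library's actual classes, and the real test in `tests/test_result_aliases.py` may be structured differently.

```python
from dataclasses import asdict, dataclass, fields
from typing import Tuple


@dataclass
class ToyStaggeredResults:
    """Hypothetical result class with prefixed canonical fields."""

    overall_att: float
    overall_se: float
    overall_conf_int: Tuple[float, float]

    # Flat aliases are read-only properties, so they are not dataclass fields.
    @property
    def att(self) -> float:
        return self.overall_att

    @property
    def se(self) -> float:
        return self.overall_se

    @property
    def conf_int(self) -> Tuple[float, float]:
        return self.overall_conf_int


FLAT_ALIASES = {"att", "se", "conf_int"}

res = ToyStaggeredResults(overall_att=1.5, overall_se=0.25, overall_conf_int=(1.0, 2.0))

field_names = {f.name for f in fields(res)}
dict_keys = set(asdict(res).keys())

# Read-through: the alias returns the canonical value.
assert res.att == res.overall_att

# Exclusion: aliases never show up for serializers or field-walkers.
assert FLAT_ALIASES.isdisjoint(field_names)
assert FLAT_ALIASES.isdisjoint(dict_keys)

# Read-only: a property without a setter rejects assignment.
try:
    res.att = 2.0
    read_only = False
except AttributeError:
    read_only = True
assert read_only
```

If one of the aliases were later declared as an annotated class attribute instead of a property, it would become a field and the two `isdisjoint` assertions would fail, which is the regression this PR aims to catch.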

No runtime behavior change.

Test plan

CI - `pytest tests/test_result_aliases.py::test_aliases_excluded_from_dataclass_fields_and_asdict` (3 parametrized cases) all pass locally.

🤖 Generated with Claude Code

…ct-exclusion

Restored CI reviewer's two P3s on PR #406:

1. The flat-schema exemplar lists in REGISTRY.md and CHANGELOG.md omit `TripleDifferenceResults`, even though `diff_diff/triple_diff.py:43-86` already uses the same native flat schema (`att` / `se` / `t_stat` / `p_value` / `conf_int`). Add it to both lists so the documented exemplar set is complete.
2. The new alias regression suite at `tests/test_result_aliases.py` locks read-through and read-only semantics, but does not assert that aliases stay out of `dataclasses.fields()` / `asdict()` output. The registry note explicitly documents that contract; if a future refactor converted an `@property` alias to a real field, serializers and field-walkers would silently start surfacing duplicate keys with no test catching it. Add a parametrized regression covering one Pattern B class (`CallawaySantAnnaResults`), the double-alias case (`ContinuousDiDResults`), and the `avg_*` mapping (`MultiPeriodDiDResults`) - asserts the five flat-alias names never appear in `fields(res)` or `asdict(res).keys()`.

No runtime behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-14T01:02:23Z

Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

This is a documentation + regression-test PR; I did not see any estimator, weighting, variance/SE, identification, or default-behavior changes.
The REGISTRY.md / CHANGELOG.md update correctly adds TripleDifferenceResults to the flat-schema exemplar list, matching the existing dataclass fields in diff_diff/triple_diff.py:L43-L86.
The new regression in tests/test_result_aliases.py:L314-L356 correctly locks flat-alias exclusion from dataclasses.fields() / dataclasses.asdict() for representative Pattern B/C/D classes.
P3 informational: the ContinuousDiDResults "double-alias" case is only partially pinned, because the new test does not also assert exclusion of the documented overall_* alias properties in diff_diff/continuous_did_results.py:L168-L183.

Methodology

No findings. Affected methods: none. The diff changes only documentation and tests, and the registry text at docs/methodology/REGISTRY.md:L5-L5 is consistent with the existing flat-schema TripleDifferenceResults implementation at diff_diff/triple_diff.py:L43-L86.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings beyond the test-gap note below.

Tech Debt

No findings. The PR does not introduce new deferred-work items, and none are needed for correctness.

Security

No findings.

Documentation/Tests

P3 The new exclusion regression only checks the flat aliases att / se / conf_int / p_value / t_stat for ContinuousDiDResults in tests/test_result_aliases.py:L314-L356. The class also has documented overall_se / overall_conf_int / overall_p_value / overall_t_stat alias properties in diff_diff/continuous_did_results.py:L168-L183, and the existing suite already treats that as part of the double-alias contract via tests/test_result_aliases.py:L216-L236. Impact: a future refactor could accidentally turn one of those overall_* aliases into a dataclass field and surface it in dataclasses.fields() / asdict() without this new regression failing, so the registry contract in docs/methodology/REGISTRY.md:L5-L5 remains only partially locked. Concrete fix: extend the ContinuousDiDResults branch of the new test to also assert those four overall_* alias names are absent from both fields(res) and asdict(res).keys().
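One way the suggested extension could look is sketched below: a small helper that reports any alias name exposed as a field or `asdict()` key, applied to both the flat aliases and the extra `overall_*` layer. The toy classes, the canonical field name `overall_att_se`, and the helper `leaked_aliases` are illustrative assumptions rather than code from the repository.

```python
from dataclasses import asdict, dataclass, fields
from typing import Iterable, Set


@dataclass
class ToyDoubleAliasResults:
    """Hypothetical class: canonical fields plus two layers of property aliases."""

    overall_att: float
    overall_att_se: float  # canonical name is an assumption for this sketch

    @property
    def overall_se(self) -> float:  # first alias layer (overall_*)
        return self.overall_att_se

    @property
    def se(self) -> float:  # second alias layer (flat)
        return self.overall_att_se

    @property
    def att(self) -> float:
        return self.overall_att


@dataclass
class ToyRegressedResults:
    """Same shape, but one alias was accidentally promoted to a real field."""

    overall_att: float
    overall_att_se: float
    overall_se: float


def leaked_aliases(res, alias_names: Iterable[str]) -> Set[str]:
    """Return alias names that appear as dataclass fields or asdict() keys."""
    exposed = {f.name for f in fields(res)} | set(asdict(res).keys())
    return set(alias_names) & exposed


FLAT = ["att", "se"]
EXTRA_OVERALL = ["overall_se"]

good = ToyDoubleAliasResults(overall_att=0.8, overall_att_se=0.1)
bad = ToyRegressedResults(overall_att=0.8, overall_att_se=0.1, overall_se=0.1)

# Checking only FLAT would pass for both objects; adding EXTRA_OVERALL
# is what exposes the regression.
assert leaked_aliases(good, FLAT + EXTRA_OVERALL) == set()
assert leaked_aliases(bad, FLAT) == set()
assert leaked_aliases(bad, FLAT + EXTRA_OVERALL) == {"overall_se"}
```

The middle assertion illustrates the gap described above: a test limited to the flat names cannot see an `overall_*` alias turning into a field.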

Validation note: I could not run the targeted pytest in this sandbox because pytest is unavailable, and direct Python import also failed because numpy is not installed, so this review is static.

…aliases

R0 review on the prior commit noted the new exclusion test covered the five flat aliases (att / se / conf_int / p_value / t_stat) but omitted the ContinuousDiDResults `overall_se` / `overall_conf_int` / `overall_p_value` / `overall_t_stat` aliases that the double-alias contract documents. Extend the ContinuousDiDResults branch to also lock those four out of `fields()` / `asdict()` output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-14T01:34:43Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 5431a4c08ad0e43556d404405d0564891badc441

Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

This re-review is doc/test-only; I did not find any estimator, weighting, variance/SE, identification, or default-behavior changes in the diff.
The prior P3 is resolved: the new regression now asserts that ContinuousDiDResults’ extra overall_* aliases stay out of dataclasses.fields() and dataclasses.asdict() in tests/test_result_aliases.py:L314-L366, matching the alias properties implemented in diff_diff/continuous_did_results.py:L146-L183.
docs/methodology/REGISTRY.md:L5 and CHANGELOG.md:L14 now correctly include TripleDifferenceResults in the flat-schema exemplar list, consistent with the actual result dataclass in diff_diff/triple_diff.py:L43-L86.
No new unmitigated P0/P1/P2/P3 issues identified in the changed files.
Validation was static only: pytest is unavailable and numpy is not installed in this sandbox.

Methodology

No findings. Affected methods: none. The diff changes documentation and tests only, and the updated flat-schema exemplar text in docs/methodology/REGISTRY.md:L5 / CHANGELOG.md:L14 matches TripleDifferenceResults as implemented in diff_diff/triple_diff.py:L43-L86.

Code Quality

No findings. The new regression cleanly reuses the existing helper-driven construction pattern in tests/test_result_aliases.py:L56-L93 and scopes the new exclusion check to the alias contract in tests/test_result_aliases.py:L314-L366.

Performance

No findings. The added fields() / asdict() assertions are test-only and have no runtime impact on library code.

Maintainability

No findings. The prior re-review gap is now explicitly pinned for the ContinuousDiDResults double-alias surface in tests/test_result_aliases.py:L349-L357.

Tech Debt

No findings. I did not see a newly introduced deferred-work item that requires TODO.md tracking for this PR.

Security

No findings.

Documentation/Tests

No findings. The registry/changelog documentation is now internally consistent with the implementation, and the previous alias-exclusion gap for ContinuousDiDResults is closed by tests/test_result_aliases.py:L314-L366.

…st helper bugs

Holistic re-audit of merged igerber#406 (inference-field aliases on staggered result classes) + igerber#428 (post-merge cleanup adding TripleDifferenceResults to alias mapping examples). Per-PR CI on igerber#428 couldn't see the combined post-PR holistic state. Local agentic codex review surfaced residuals across 4 rounds (R1-R4); a 5th round flagged Sphinx autosummary regen which is auto-handled on next docs build (not addressed here).

**Test helper (R2)**: `_required_init_kwargs()` in `tests/test_result_aliases.py` had two bugs that are masked today but brittle as result dataclasses evolve:

- Default-factory not honored: the `f.default is not f.default_factory and f.default is not MISSING` check returns False for factory-only fields (where default is MISSING and default_factory is a real callable), so the helper pre-filled those fields with a sentinel and the factory never ran. Replaced with explicit `_dc.MISSING` checks on both default and default_factory.
- Dispatch order: `"float"` matched before `"Tuple"`, so `Tuple[float, float]` annotations were classified as scalar and got `0.0` instead of `(0.0, 0.0)`. Reordered the synthetic-value dispatch so container annotations (Tuple/List/Dict/DataFrame/ndarray) are checked before the scalar fallback.

**Read-only assertion (R1)**: `test_aliases_are_read_only` checked `setattr`-fails on the flat aliases but not on the new `ContinuousDiDResults.overall_*` aliases. Extended the Continuous-specific branch.

**Bundled guide documentation (R3)**: REGISTRY and CHANGELOG made the flat alias contract official, but bundled `practitioner.py`, `llms-practitioner.txt`, and `llms-full.txt` documented only the canonical `overall_*` / `overall_att_*` / `avg_*` names. Added a single "Flat-alias compatibility note" under the `## Results Objects` header in `llms-full.txt` (avoids per-class table bloat); extended the result-class snippet in `llms-practitioner.txt`; clarified the Step-4 covariate-comparison snippet in `practitioner.py`.

**Registry scope correction (R4)**: the top-level REGISTRY note said "every scalar treatment-effect result class" exposes flat aliases, but flat-native classes (`DiDResults`, `SyntheticDiDResults`, `TROPResults`, `TripleDifferenceResults`, `HeterogeneousAdoptionDiDResults`) already carry these as native dataclass fields, so they're unchanged by the alias contract. Narrowed the note to scope only the prefixed families.

**Typo fix (R4)**: the R3 `llms-practitioner.txt` note named a nonexistent `overall_att_att` field. Replaced with the actual canonical field names (`overall_att` is the point estimate; `overall_att_se` etc. are the inference fields).

5 files, +51/-13. No behavior change; all edits are documentation alignment and test helper / test coverage hardening on the surface igerber#406 + igerber#428 already established.

igerber added the ready-for-ci label (Triggers CI test workflows) May 14, 2026

igerber merged commit c3a732c into main May 14, 2026
29 of 30 checks passed

igerber deleted the fix-audit-406 branch May 14, 2026 11:28

igerber mentioned this pull request May 14, 2026

Fix #406 holistic audit residuals: alias-doc completeness + test helper bugs #437

Merged

2 tasks

igerber merged 2 commits into main from fix-audit-406
