Fix #406 holistic audit residuals: alias-doc completeness + test helper bugs

igerber · igerber · commit e258ecdd7b3c · 2026-05-14T18:37:12.000-04:00
Holistic re-audit of merged #406 (inference-field aliases on staggered result classes) + #428 (post-merge cleanup adding TripleDifferenceResults to alias mapping examples). Per-PR CI on #428 couldn't see the combined post-PR holistic state. Local agentic codex review surfaced residuals across 4 rounds (R1-R4); a 5th round flagged Sphinx autosummary regen, which is auto-handled on the next docs build (not addressed here).

**Test helper (R2)**: `_required_init_kwargs()` in `tests/test_result_aliases.py` had two bugs that are masked today but brittle as result dataclasses evolve:

- Default-factory not honored: the `f.default is not f.default_factory and f.default is not MISSING` check returns False for factory-only fields (where default is MISSING and default_factory is a real callable), so the helper pre-filled those fields with a sentinel and the factory never ran. Replaced with explicit `_dc.MISSING` checks on both default and default_factory.
- Dispatch order: `"float"` matched before `"Tuple"`, so `Tuple[float, float]` annotations were classified as scalar and got `0.0` instead of `(0.0, 0.0)`. Reordered the synthetic-value dispatch so container annotations (Tuple/List/Dict/DataFrame/ndarray) are checked before the scalar fallback.

**Read-only assertion (R1)**: `test_aliases_are_read_only` checked that `setattr` fails on the flat aliases but not on the new `ContinuousDiDResults.overall_*` aliases. Extended the Continuous-specific branch.

**Bundled guide documentation (R3)**: REGISTRY and CHANGELOG made the flat alias contract official, but bundled `practitioner.py`, `llms-practitioner.txt`, and `llms-full.txt` documented only the canonical `overall_*` / `overall_att_*` / `avg_*` names. Added a single "Flat-alias compatibility note" under the `## Results Objects` header in `llms-full.txt` (avoids per-class table bloat); extended the result-class snippet in `llms-practitioner.txt`; clarified the Step-4 covariate-comparison snippet in `practitioner.py`.

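The two `_required_init_kwargs()` bugs described under R2 can be reproduced on a toy dataclass. This is a minimal sketch, not the helper itself: `ToyResults`, `is_required`, `old_check_skips`, and `sentinel_for` are hypothetical names introduced here for illustration, and only the Tuple/List/float/int branches of the dispatch are shown.

```python
import dataclasses
from dataclasses import dataclass, field, fields
from typing import List, Tuple


@dataclass
class ToyResults:
    """Hypothetical stand-in for a result dataclass (not a diff-diff class)."""
    overall_att: float                                   # required scalar
    overall_conf_int: Tuple[float, float]                # required container of floats
    n_obs: int = 0                                       # plain default
    warnings_: List[str] = field(default_factory=list)   # factory-only default


def is_required(f):
    """Post-fix rule: required iff both default and default_factory are MISSING."""
    return f.default is dataclasses.MISSING and f.default_factory is dataclasses.MISSING


def old_check_skips(f):
    """Pre-fix condition quoted in R2; True means 'has a default, skip it'."""
    return f.default is not f.default_factory and f.default is not dataclasses.MISSING


def sentinel_for(annotation):
    """Check container annotations before the scalar substring matches."""
    ann = str(annotation)
    if "Tuple" in ann or "tuple" in ann:
        return (0.0, 0.0)
    if "List" in ann or "list" in ann:
        return []
    if "float" in ann:   # would also match "Tuple[float, float]" if tested first
        return 0.0
    if "int" in ann:
        return 0
    return None


by_name = {f.name: f for f in fields(ToyResults)}
required = [name for name, f in by_name.items() if is_required(f)]

# The old condition fails to skip the factory-only field, so it got a sentinel.
factory_field_skipped_by_old_check = old_check_skips(by_name["warnings_"])

conf_int_sentinel = sentinel_for(by_name["overall_conf_int"].type)

# Filling only the required fields lets the dataclass run the list factory.
instance = ToyResults(**{n: sentinel_for(by_name[n].type) for n in required})
```

Here `required` contains only the two fields without any default, and `instance.warnings_` comes from the factory rather than from a pre-filled sentinel.
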
**Registry scope correction (R4)**: the top-level REGISTRY note said "every scalar treatment-effect result class" exposes flat aliases, but flat-native classes (`DiDResults`, `SyntheticDiDResults`, `TROPResults`, `TripleDifferenceResults`, `HeterogeneousAdoptionDiDResults`) already carry these as native dataclass fields, so they are unchanged by the alias contract. Narrowed the note to scope only the prefixed families.

**Typo fix (R4)**: the R3 `llms-practitioner.txt` note named a nonexistent `overall_att_att` field. Replaced with the actual canonical field names (`overall_att` is the point estimate; `overall_att_se` etc. are the inference fields).

5 files, +51/-13. No behavior change; all edits are documentation alignment and test helper / test coverage hardening on the surface #406 + #428 already established.
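
The alias contract that the documentation edits describe (read-only `@property` read-throughs over canonical prefixed fields, invisible to `dataclasses.fields()` / `dataclasses.asdict()`) can be illustrated as follows. This is a sketch under stated assumptions: `ToyStaggeredResults` is a hypothetical class written for this example and is not how diff-diff defines its result classes.

```python
import dataclasses
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ToyStaggeredResults:
    """Hypothetical prefixed-family result class (illustration only)."""
    overall_att: float
    overall_se: float
    overall_conf_int: Tuple[float, float]

    # Flat aliases: properties without setters, so they are read-only and
    # are not dataclass fields.
    @property
    def att(self) -> float:
        return self.overall_att

    @property
    def se(self) -> float:
        return self.overall_se

    @property
    def conf_int(self) -> Tuple[float, float]:
        return self.overall_conf_int


res = ToyStaggeredResults(overall_att=1.5, overall_se=0.25, overall_conf_int=(1.0, 2.0))

# Field-walkers and serializers see only the canonical names.
field_names = [f.name for f in dataclasses.fields(res)]
as_dict = dataclasses.asdict(res)

# An adapter that probes with getattr(...) finds the flat name.
adapter_se = getattr(res, "se", None)

# Assignment to an alias is rejected.
try:
    res.se = 0.0
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True
```

A flat-native class would instead declare `att` / `se` / `conf_int` directly as dataclass fields, which is why the narrowed REGISTRY note scopes the alias contract to the prefixed families only.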
diff --git a/diff_diff/guides/llms-full.txt b/diff_diff/guides/llms-full.txt
@@ -967,6 +967,18 @@ results = bacon_decompose(data, outcome='y', unit='id', time='t', first_treat='f
 
 ## Results Objects
 
+**Flat-alias compatibility note.** Every staggered result class in this
+section (those with canonical `overall_*` / `overall_att_*` / `avg_*`
+prefixed inference fields) ALSO exposes the unprefixed flat names
+`att` / `se` / `conf_int` / `p_value` / `t_stat` as read-only `@property`
+aliases over the canonical fields. The canonical prefixed fields remain
+the documented and computed surface; the flat aliases are pure
+read-throughs for compatibility with external adapters that
+`getattr(res, "se", None)`-style query the inference surface (e.g.
+`balance.interop.diff_diff.as_balance_diagnostic()`). Tables below list
+the canonical names; assume the flat aliases are present on every
+staggered class unless explicitly noted otherwise.
+
 ### DiDResults
 
 Returned by `DifferenceInDifferences.fit()` and `TwoWayFixedEffects.fit()`.
diff --git a/diff_diff/guides/llms-practitioner.txt b/diff_diff/guides/llms-practitioner.txt
@@ -70,6 +70,16 @@ results.cohort_effects        # Per-cohort effects (via to_dataframe(level='coho
 # Other staggered (ImputationDiD, TwoStageDiD, etc.):
 results.overall_att       # Overall ATT
 results.overall_se        # Standard error
+
+# Flat-alias compatibility: every staggered class above (canonical
+# `overall_*` / `overall_att_*` / `avg_*` prefixed) also exposes the
+# unprefixed flat names `att` / `se` / `conf_int` / `p_value` /
+# `t_stat` as read-only `@property` aliases over the canonical fields.
+# The canonical prefixed names remain the documented and computed
+# surface; the flat aliases are pure read-throughs for compatibility
+# with adapters that `getattr(res, "se", None)`-style query inference.
+results.att               # alias of overall_att / avg_att (or overall_att for ContinuousDiD ATT side)
+results.se                # alias of overall_se / avg_se (or overall_att_se for ContinuousDiD ATT side)
 ```
 
 ---
diff --git a/diff_diff/practitioner.py b/diff_diff/practitioner.py
@@ -271,7 +271,8 @@ def _covariates_step() -> Dict[str, Any]:
             "# Re-estimate without covariates and compare:\n"
             "result_no_cov = estimator.fit(data, ..., covariates=None)\n"
             "# Compare ATT with and without covariates.\n"
-            "# Use .att (basic DiD) or .overall_att (staggered estimators)."
+            "# Use .att (basic DiD; also a read-only flat-alias on staggered\n"
+            "# classes) or .overall_att (canonical name on staggered results)."
         ),
         priority="medium",
         step_name="robustness",
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -2,7 +2,7 @@
 
 This document provides the academic foundations and key implementation requirements for each estimator in diff-diff. It serves as a reference for contributors and users who want to understand the theoretical basis of the methods.
 
-**Result-class field naming.** Headline scalar inference fields appear under one of four native naming patterns: flat `att` / `se` / `conf_int` / `p_value` / `t_stat` (`DiDResults`, `SyntheticDiDResults`, `TROPResults`, `TripleDifferenceResults`, `HeterogeneousAdoptionDiDResults`); `overall_*` (`CallawaySantAnnaResults` and the rest of the staggered family); `overall_att_*` (`ContinuousDiDResults`, where `att` and `acrt` are parallel response curves); and `avg_*` (`MultiPeriodDiDResults`). Every scalar treatment-effect result class covered by this naming contract additionally exposes the flat `att` / `se` / `conf_int` / `p_value` / `t_stat` names as read-only `@property` aliases for adapter / external-consumer compatibility (see PR for v3.3.3, motivated by `balance.interop.diff_diff`); `ContinuousDiDResults` further exposes `overall_*` aliases pointing at the ATT side. The native field is canonical for documentation, semantics, and computation — aliases are pure read-throughs and inherit the `safe_inference()` joint-NaN consistency contract automatically. Because aliases are `@property` descriptors (not dataclass fields), they do NOT appear in `dataclasses.fields()` or `dataclasses.asdict()` output, and assignment to an alias raises `AttributeError`; serializers and field-walkers continue to see only the canonical field set.
+**Result-class field naming.** Headline scalar inference fields appear under one of four native naming patterns: flat `att` / `se` / `conf_int` / `p_value` / `t_stat` (`DiDResults`, `SyntheticDiDResults`, `TROPResults`, `TripleDifferenceResults`, `HeterogeneousAdoptionDiDResults`); `overall_*` (`CallawaySantAnnaResults` and the rest of the staggered family); `overall_att_*` (`ContinuousDiDResults`, where `att` and `acrt` are parallel response curves); and `avg_*` (`MultiPeriodDiDResults`). Result classes in the prefixed `overall_*` / `overall_att_*` / `avg_*` families additionally expose the flat `att` / `se` / `conf_int` / `p_value` / `t_stat` names as read-only `@property` aliases over their canonical fields, for adapter / external-consumer compatibility (see PR for v3.3.3, motivated by `balance.interop.diff_diff`). The flat-native classes (`DiDResults`, `SyntheticDiDResults`, `TROPResults`, `TripleDifferenceResults`, `HeterogeneousAdoptionDiDResults`) already carry these names as native dataclass fields and are unchanged by this contract. `ContinuousDiDResults` further exposes `overall_*` aliases pointing at the ATT side (so `result.overall_se` reads `result.overall_att_se`, etc.). The native field is canonical for documentation, semantics, and computation — aliases are pure read-throughs and inherit the `safe_inference()` joint-NaN consistency contract automatically. Because aliases are `@property` descriptors (not dataclass fields), they do NOT appear in `dataclasses.fields()` or `dataclasses.asdict()` output, and assignment to an alias raises `AttributeError`; serializers and field-walkers continue to see only the canonical field set.
 
 ## Table of Contents
 
diff --git a/tests/test_result_aliases.py b/tests/test_result_aliases.py
@@ -60,24 +60,26 @@ def _required_init_kwargs(cls, overrides):
     Lets us build a minimal result instance for alias-mechanic tests without
     having to enumerate every estimator-specific field. Sentinel values for
     untouched fields are deliberately uninteresting (empty containers, zeros)
-    -- they are not exercised by these tests."""
+    -- they are not exercised by these tests.
+    """
+    import dataclasses as _dc
+
     kwargs = {}
     for f in fields(cls):
         if f.name in overrides:
             continue
-        # Skip fields with defaults; we only need to fill required positionals.
-        if f.default is not f.default_factory and f.default is not getattr(
-            __import__("dataclasses"), "MISSING", None
-        ):
-            # Field has a default value; let the dataclass apply it.
+        # A field is REQUIRED iff both default and default_factory are MISSING.
+        # When default_factory is set (e.g. list/dict factory), the dataclass
+        # will apply it; we must NOT pre-fill the field with a sentinel or we
+        # block the factory.
+        if f.default is not _dc.MISSING or f.default_factory is not _dc.MISSING:
             continue
         # Required field — supply a type-compatible sentinel.
+        # Order container annotations BEFORE the scalar `"float"` / `"int"`
+        # branches so that ``Tuple[float, float]`` is not mis-classified as
+        # scalar (``"float" in "Tuple[float, float]"`` is True).
         ann = str(f.type)
-        if "float" in ann:
-            kwargs[f.name] = 0.0
-        elif "int" in ann:
-            kwargs[f.name] = 0
-        elif "Tuple" in ann or "tuple" in ann:
+        if "Tuple" in ann or "tuple" in ann:
             kwargs[f.name] = (0.0, 0.0)
         elif "List" in ann or "list" in ann:
             kwargs[f.name] = []
@@ -87,6 +89,10 @@ def _required_init_kwargs(cls, overrides):
             kwargs[f.name] = pd.DataFrame()
         elif "ndarray" in ann or "np.ndarray" in ann:
             kwargs[f.name] = np.array([])
+        elif "float" in ann:
+            kwargs[f.name] = 0.0
+        elif "int" in ann:
+            kwargs[f.name] = 0
         else:
             kwargs[f.name] = None
     kwargs.update(overrides)
@@ -304,6 +310,15 @@ def test_aliases_are_read_only(cls, ovr):
     for name in ("att", "se", "conf_int", "p_value", "t_stat"):
         with pytest.raises(AttributeError):
             setattr(res, name, object())
+    # ContinuousDiDResults also exposes overall_se / overall_conf_int /
+    # overall_p_value / overall_t_stat as read-only aliases over the
+    # ATT-side canonical fields (overall_att_se etc.). There is no
+    # `overall_att` alias because `overall_att` is itself the canonical
+    # point estimate, which the flat `att` reads. These must also reject assignment.
+    if cls.__name__ == "ContinuousDiDResults":
+        for name in ("overall_se", "overall_conf_int", "overall_p_value", "overall_t_stat"):
+            with pytest.raises(AttributeError):
+                setattr(res, name, object())
 
 
 @pytest.mark.parametrize(