Skip to content

Commit 466c05d

Browse files
committed
Fix #401 holistic audit residuals: two stale docstrings around dCDH by_path / paths_of_interest support
Holistic re-audit of merged #401 (dCDH by_path non-binary integer treatment + `paths_of_interest` Python-only selector) + #419 (cleanup PR broadening R-parser caveat). Per-PR CI review on #419 couldn't see the combined post-PR holistic state. Local agentic codex review against the combined diff (3 rounds) surfaced 2 P2 docstring fixes: - `_validate_and_aggregate_to_cells` (`chaisemartin_dhaultfoeuille.py` aggregator helper): the `Raises` block still said the helper raises on "non-binary raw treatment values", but the shipped implementation now accepts continuous `d_gt` cell means and defers integer-only enforcement to `fit()` time (the `by_path` / `paths_of_interest` contract). Updated the Raises section to enumerate the actual failure modes (missing columns, NaN, non-coercible, within-cell-varying) and explain where the integer-only check actually lives. - `path_sup_t_bands` dataclass field-adjacent comment (`chaisemartin_dhaultfoeuille_results.py`): comment still said the field is populated when `by_path` is a positive int AND `n_bootstrap > 0`, but `paths_of_interest` also activates the field. Updated to match the public docstring above (EITHER `by_path` OR `paths_of_interest` AND `n_bootstrap > 0`). No methodology changes, no behavior changes to fit() outputs — both are documentation-only edits on dataclass field comments and a helper Raises block. The methodology contract has been correct on the public docstrings + REGISTRY all along; only the source-adjacent comments lagged. Empirical observation: pilot-401 converged in 3 codex rounds (vs 11 for the #402 pilot). #401's cleanup PR #419 was a narrow R-parser caveat broadening, so the holistic state didn't have many cross-surface drift opportunities — consistent with the holistic-pilot pattern's expected return-on-investment scaling with cleanup-PR scope.
1 parent 917a786 commit 466c05d

2 files changed

Lines changed: 19 additions & 12 deletions

File tree

diff_diff/chaisemartin_dhaultfoeuille.py

Lines changed: 10 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -143,9 +143,16 @@ def _validate_and_aggregate_to_cells(
143143
Raises
144144
------
145145
ValueError
146-
On missing columns, NaN treatment / outcome values, non-numeric
147-
treatment / outcome that cannot be coerced, or non-binary raw
148-
treatment values.
146+
On missing columns; NaN values in the treatment / outcome columns;
147+
non-numeric treatment / outcome that cannot be coerced via
148+
``pd.to_numeric``; or within-cell-varying treatment (any
149+
``(group, time)`` cell where ``d_min != d_max``, since fuzzy DiD
150+
is out of scope and deferred to a separate dCdH 2018 paper).
151+
Integer-coded non-binary treatment (the ``by_path`` /
152+
``paths_of_interest`` requirement) is enforced separately at
153+
``fit()`` time, not here at aggregation time — this helper
154+
accepts continuous ``d_gt`` cell means and lets ``fit()`` decide
155+
whether the integer-only contract applies.
149156
"""
150157
# 1. Required columns
151158
missing = [c for c in (outcome, group, time, treatment) if c not in data.columns]

diff_diff/chaisemartin_dhaultfoeuille_results.py

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -625,16 +625,16 @@ class ChaisemartinDHaultfoeuilleResults:
625625
)
626626
# Per-path joint sup-t simultaneous-band metadata. Keyed by path
627627
# tuple; each entry holds `{"crit_value", "alpha", "n_bootstrap",
628-
# "method", "n_valid_horizons"}`. Populated when `by_path` is a
629-
# positive int AND `n_bootstrap > 0`. The joint band itself is
630-
# written per-horizon as `cband_conf_int` on
631-
# `path_effects[path]["horizons"][l]` (mirrors the OVERALL
632-
# `event_study_effects[l]["cband_conf_int"]` pattern at
628+
# "method", "n_valid_horizons"}`. Populated when EITHER `by_path` is
629+
# a positive int OR `paths_of_interest` is non-empty AND
630+
# `n_bootstrap > 0`. The joint band itself is written per-horizon as
631+
# `cband_conf_int` on `path_effects[path]["horizons"][l]` (mirrors
632+
# the OVERALL `event_study_effects[l]["cband_conf_int"]` pattern at
633633
# `chaisemartin_dhaultfoeuille.py:2865-2875`). Empty-state contract:
634-
# `None` when not requested (no bootstrap or `by_path is None`); `{}`
635-
# when requested but no path passed both gates (>=2 valid horizons
636-
# AND a strict majority — more than 50% — of finite sup-t draws).
637-
# The bands cover joint inference
634+
# `None` when not requested (no bootstrap, or both `by_path` and
635+
# `paths_of_interest` are `None`); `{}` when requested but no path
636+
# passed both gates (>=2 valid horizons AND a strict majority — more
637+
# than 50% — of finite sup-t draws). The bands cover joint inference
638638
# WITHIN a single path across horizons; they do NOT provide
639639
# simultaneous coverage across paths.
640640
path_sup_t_bands: Optional[Dict[Tuple[int, ...], Dict[str, Any]]] = field(

0 commit comments

Comments
 (0)