
Commit 872abc8

igerber and claude committed
Address PR #366 CI review round 17 (1 P1): split "no never-treated" vs "negative dose" branches; HAD only valid on the former
Reviewer correctly noted that the round-15/16 wording listed `HeterogeneousAdoptionDiD` as a routing alternative whenever `ContinuousDiD` fails on the dose-related preflights, but HAD itself requires non-negative dose support and raises on negative post-period dose at `had.py:1450-1459` (paper Section 2). On a panel with `dose_min < 0`, routing to HAD silently steers an agent into the same fit-time error. Verified the rejection at `had.py:1450-1459`.

Reworded every site to split the two failure modes:

- Branch (a): `has_never_treated == False` (no zero-dose controls but all observed doses non-negative). `ContinuousDiD` does not apply (Remark 3.1 not implemented). HAD IS a routing alternative on this branch (HAD's contract requires non-negative dose, satisfied here); linear DiD with a continuous covariate is another.
- Branch (e): `dose_min < 0` (negative treated doses). `ContinuousDiD` does not apply AND HAD is **not** a fallback either, because HAD raises on negative post-period dose (`had.py:1450-1459`). Linear DiD with a signed continuous covariate is the applicable alternative on this branch.

Updated wording across:

- `diff_diff/profile.py` `TreatmentDoseShape` docstring (refactored from item-by-item duplication into a numbered list with a single "Routing alternatives when (1) or (5) fails" section that splits the two branches; trimmed redundancy).
- `diff_diff/guides/llms-autonomous.txt` §2 field reference (split the When-(1)-or-(5)-fails paragraph into the two branches).
- `diff_diff/guides/llms-autonomous.txt` §4.7 trailing paragraph (consolidated to a pointer at §2's split discussion).
- `diff_diff/guides/llms-autonomous.txt` §5.2 reasoning chain counter-example #4 (no never-treated branch: HAD applies) and counter-example #5 (negative-dose branch: HAD does NOT apply, cite `had.py:1450-1459`).
- `CHANGELOG.md` Wave 2 entry.
- `ROADMAP.md` AI-Agent Track building block.
- `tests/test_profile_panel.py` two test docstrings/comments.

Added `test_autonomous_negative_dose_path_does_not_route_to_had` in `tests/test_guides.py` asserting that §5.2 explicitly cites `had.py:1450-1459` on the negative-dose branch (used a single-line fingerprint since the prose phrase "non-negative dose support" is split across newlines in the rendered guide).

Length housekeeping: trimmed counter-example #4 and #5 prose + §4.7 trailing paragraph to point at §2's split discussion; autonomous (65374 chars) < full (66031), `test_full_is_largest` green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
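The two-branch split described in the commit message can be sketched as a small decision function. This is an illustration of the reasoning only, not a recommender shipped by the package: `routing_alternatives` and its string labels are hypothetical names, and the two arguments simply mirror the `has_never_treated` and `dose_min` profile fields named above.

```python
from typing import List


def routing_alternatives(has_never_treated: bool, dose_min: float) -> List[str]:
    """Hypothetical helper (not a diff_diff API): candidate estimators
    for a continuous-dose panel, following the split in the commit message."""
    if dose_min < 0:
        # Negative-dose branch: ContinuousDiD does not apply and HAD
        # raises on negative post-period dose, so neither is listed.
        return ["linear DiD (signed continuous covariate)"]
    if not has_never_treated:
        # No never-treated controls, but all doses non-negative:
        # HAD's non-negative-dose contract is satisfied on this branch.
        return ["HeterogeneousAdoptionDiD", "linear DiD (continuous covariate)"]
    # Both dose-related preflights pass; the remaining preflights
    # (balance, within-unit constancy, duplicates) still need checking.
    return ["ContinuousDiD"]
```

The negative-dose check comes first on purpose: a panel can fail both preflights at once, and in that case HAD must still be excluded.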
1 parent e712742 commit 872abc8

6 files changed

Lines changed: 129 additions & 98 deletions


CHANGELOG.md

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

ROADMAP.md

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ Long-running program, framed as "building toward" rather than with discrete ship
 - Baker et al. (2025) 8-step workflow enforcement in `diff_diff/practitioner.py`.
 - `practitioner_next_steps()` context-aware guidance.
 - Runtime LLM guides via `get_llm_guide(...)` (`llms.txt`, `llms-full.txt`, `llms-practitioner.txt`, `llms-autonomous.txt`), bundled in the wheel.
-- `profile_panel(df, ...)` returns a `PanelProfile` dataclass of structural facts about the panel - factual, not opinionated. Pairs with the `"autonomous"` guide variant (reference-shaped: estimator-support matrix + per-design-feature reasoning) so agents describe the data then consult a bundled reference rather than calling a deterministic recommender. `PanelProfile.outcome_shape` and `PanelProfile.treatment_dose` extensions add descriptive distributional context (count-likeness / bounded-support hints on numeric outcomes; dose support and zero-dose presence on continuous treatments). Most fields are descriptive context. `outcome_shape.is_count_like` informs the WooldridgeDiD-QMLE-vs-linear-OLS judgment but does not gate it. `profile_panel` does not see the separate `first_treat` column that `ContinuousDiD.fit()` consumes; under the canonical `ContinuousDiD` setup (per-unit time-invariant dose `D_i` + separate `first_treat`), several preflight checks become predictive on the dose column: `has_never_treated` (proxies `P(D=0) > 0`), `treatment_varies_within_unit == False` (actual fit-time gate), `is_balanced` (actual fit-time gate), absence of the `duplicate_unit_time_rows` alert (silent last-row-wins overwrite path), and `treatment_dose.dose_min > 0` (predicts strictly-positive-treated-dose). When `has_never_treated` or `dose_min > 0` fails, `ContinuousDiD` as currently implemented does not apply (Remark 3.1 lowest-dose-as-control is not implemented). Routing alternatives include `HeterogeneousAdoptionDiD` and linear DiD with a continuous covariate; re-encoding the treatment column is an agent-side preprocessing choice that is not documented in REGISTRY as a supported fallback. The estimator's force-zero coercion on inconsistent `first_treat == 0 + nonzero dose` inputs is implementation behavior, not a documented method for manufacturing controls. The autonomous guide §5 walks through three end-to-end PanelProfile -> reasoning -> validation worked examples.
+- `profile_panel(df, ...)` returns a `PanelProfile` dataclass of structural facts about the panel - factual, not opinionated. Pairs with the `"autonomous"` guide variant (reference-shaped: estimator-support matrix + per-design-feature reasoning) so agents describe the data then consult a bundled reference rather than calling a deterministic recommender. `PanelProfile.outcome_shape` and `PanelProfile.treatment_dose` extensions add descriptive distributional context (count-likeness / bounded-support hints on numeric outcomes; dose support and zero-dose presence on continuous treatments). Most fields are descriptive context. `outcome_shape.is_count_like` informs the WooldridgeDiD-QMLE-vs-linear-OLS judgment but does not gate it. `profile_panel` does not see the separate `first_treat` column that `ContinuousDiD.fit()` consumes; under the canonical `ContinuousDiD` setup (per-unit time-invariant dose `D_i` + separate `first_treat`), several preflight checks become predictive on the dose column: `has_never_treated` (proxies `P(D=0) > 0`), `treatment_varies_within_unit == False` (actual fit-time gate), `is_balanced` (actual fit-time gate), absence of the `duplicate_unit_time_rows` alert (silent last-row-wins overwrite path), and `treatment_dose.dose_min > 0` (predicts strictly-positive-treated-dose). When `has_never_treated == False` but all doses are non-negative, `ContinuousDiD` does not apply (Remark 3.1 not implemented); `HeterogeneousAdoptionDiD` is a routing alternative on this branch. When `dose_min <= 0` (negative doses), neither `ContinuousDiD` nor `HeterogeneousAdoptionDiD` apply (HAD raises on negative post-period dose); linear DiD with a signed continuous covariate is the applicable alternative. Re-encoding the treatment column is an agent-side preprocessing choice that is not documented in REGISTRY as a supported fallback. The estimator's force-zero coercion on inconsistent `first_treat == 0 + nonzero dose` inputs is implementation behavior, not a documented method for manufacturing controls. The autonomous guide §5 walks through three end-to-end PanelProfile -> reasoning -> validation worked examples.
 - Package docstring leads with an "For AI agents" entry block so `help(diff_diff)` surfaces the agent entry points automatically.
 - Silent-operation warnings so agents and humans see the same signals at the same time.
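The ROADMAP entry names five dose-column preflights. A minimal sketch of how such facts could be derived from long-format rows follows; it is not `profile_panel`'s implementation. The helper `dose_preflight`, the `(unit, period, dose)` row layout, and the choice to treat a unit as never-treated when its dose is zero in every observed period are all assumptions made for illustration. `dose_min` is taken over non-zero values, as the guide text in this commit describes.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[str, int, float]  # (unit, period, dose): assumed layout


def dose_preflight(rows: Iterable[Row]) -> Dict[str, object]:
    """Hypothetical sketch of the five dose-column facts (not diff_diff code)."""
    rows = list(rows)
    doses_by_unit = defaultdict(list)
    periods_by_unit = defaultdict(set)
    for unit, period, dose in rows:
        doses_by_unit[unit].append(dose)
        periods_by_unit[unit].add(period)

    all_periods = {period for _, period, _ in rows}
    cell_counts = Counter((unit, period) for unit, period, _ in rows)
    nonzero = [dose for _, _, dose in rows if dose != 0]

    return {
        # At least one unit whose dose is zero in every observed period.
        "has_never_treated": any(
            all(d == 0 for d in ds) for ds in doses_by_unit.values()
        ),
        # True when any unit's dose changes across its rows.
        "treatment_varies_within_unit": any(
            len(set(ds)) > 1 for ds in doses_by_unit.values()
        ),
        # Every unit is observed in every period.
        "is_balanced": all(ps == all_periods for ps in periods_by_unit.values()),
        # Any (unit, period) cell that appears more than once.
        "duplicate_unit_time_rows": any(n > 1 for n in cell_counts.values()),
        # Smallest non-zero dose; None when every dose is zero.
        "dose_min": min(nonzero) if nonzero else None,
    }
```

On a toy panel with one zero-dose unit, one unit at dose 2.5, and one unit at dose -1.0, the sketch reports `has_never_treated` as true and `dose_min` as -1.0, which corresponds to the negative-dose branch.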

diff_diff/guides/llms-autonomous.txt

Lines changed: 44 additions & 54 deletions
@@ -241,20 +241,30 @@ view. Every field below appears as a top-level key in that dict.
 treated units carry their constant dose across all periods so
 `dose_min` over non-zero values is the smallest treated dose).

-When `has_never_treated == False` or `dose_min <= 0`,
-`ContinuousDiD` as currently implemented does not apply (Remark
-3.1 lowest-dose-as-control is not implemented). Routing
-alternatives that do not require `P(D=0) > 0`:
-`HeterogeneousAdoptionDiD` for graded-adoption designs, or
-linear DiD with the treatment as a continuous covariate.
-Re-encoding the treatment column (shifting, absolute value,
-etc.) is an agent-side preprocessing choice that changes the
+When `has_never_treated == False` (no zero-dose controls but
+all observed doses non-negative), `ContinuousDiD` as currently
+implemented does not apply (Remark 3.1 lowest-dose-as-control
+is not implemented). Routing alternatives that do not require
+`P(D=0) > 0`: `HeterogeneousAdoptionDiD` for graded-adoption
+designs (HAD's own contract requires non-negative dose, which
+this branch satisfies), or linear DiD with the treatment as a
+continuous covariate. When `dose_min <= 0` (negative treated
+doses), the situation is different: `ContinuousDiD` does not
+apply, and `HeterogeneousAdoptionDiD` is **not** a fallback
+either — HAD raises on negative post-period dose
+(`had.py:1450-1459`). The applicable routing alternative on
+the negative-dose branch is linear DiD with the treatment as
+a signed continuous covariate. Re-encoding the treatment
+column to a non-negative scale (shifting, absolute value, etc.)
+is an agent-side preprocessing choice that changes the
 estimand and is not documented in REGISTRY as a supported
-fallback. Do not relabel positive- or negative-dose units as
-`first_treat == 0`: that triggers the force-zero coercion
-path, which is implementation behavior for inconsistent inputs
-(e.g., an accidentally-nonzero row on a never-treated unit),
-not a documented routing option.
+fallback; if the agent does re-encode, both `ContinuousDiD`
+and `HeterogeneousAdoptionDiD` become candidates again on the
+re-encoded scale. Do not relabel positive- or negative-dose
+units as `first_treat == 0`: that triggers the force-zero
+coercion path, which is implementation behavior for
+inconsistent inputs (e.g., an accidentally-nonzero row on a
+never-treated unit), not a documented routing option.

 The agent must still validate the supplied `first_treat` column
 independently: it must contain at least one `first_treat == 0`
@@ -563,18 +573,10 @@ When `treatment_type == "continuous"`:
 overwritten with last-row-wins (a hard preflight veto, not a
 fit-time raise — the agent must deduplicate before fitting); (a)
 and (e) hold under the canonical setup. When (a) or (e) fails,
-`ContinuousDiD` as currently implemented does not apply (Remark
-3.1 lowest-dose-as-control is not implemented). Routing
-alternatives that do not require `P(D=0) > 0` are
-`HeterogeneousAdoptionDiD` (graded adoption) and linear DiD with
-a continuous covariate. Re-encoding the treatment column is an
-agent-side preprocessing choice that changes the estimand and is
-not documented in REGISTRY as a supported fallback. Do not
-relabel positive-dose or negative-dose units as
-`first_treat == 0` to manufacture controls: that triggers
-`ContinuousDiD.fit()`'s force-zero coercion path
-(`UserWarning`), which is implementation behavior for
-inconsistent inputs, not a documented methodological option.
+see §2 for the full routing-alternatives discussion (the two
+branches differ: HAD applies on the no-never-treated branch but
+not on the negative-dose branch, since HAD requires non-negative
+dose support per `had.py:1450-1459`).
 Note that staggered adoption IS supported natively (adoption
 timing is expressed via the `first_treat` column, not via
 within-unit dose variation), and `ContinuousDiD.fit()` applies
@@ -886,39 +888,27 @@ Reasoning chain:
 indicator and fall back to a binary staggered estimator.
 4. Counter-example: had `has_never_treated == False` (every unit
 eventually treated, even if some pre-treatment rows have zero
-dose so `treatment_dose.has_zero_dose == True`), the dose
-column would carry no never-treated unit. With a `first_treat`
-column consistent with the dose column on per-unit
-treated/untreated status, `ContinuousDiD.fit()` would reject
-the panel under both `control_group="never_treated"` and
+dose so `treatment_dose.has_zero_dose == True`),
+`ContinuousDiD.fit()` would reject the panel under both
+`control_group="never_treated"` and
 `control_group="not_yet_treated"` because Remark 3.1
-lowest-dose-as-control is not yet implemented. `ContinuousDiD`
-as currently implemented does not apply on this panel.
-Available routing alternatives that do not require
-`P(D=0) > 0`: linear DiD with the treatment as a continuous
-covariate, or `HeterogeneousAdoptionDiD` for graded-adoption
-designs. Re-encoding the treatment to a scale that contains a
-true never-treated group is an agent-side preprocessing choice
-that changes the estimand; it is not documented in REGISTRY as
-a supported fallback. Do not relabel not-yet-treated units as
-`first_treat == 0` to manufacture controls; the force-zero
-coercion path is implementation behavior for inconsistent
-inputs, not a documented method for manufacturing
-never-treated controls.
+lowest-dose-as-control is not yet implemented. On this branch
+(no never-treated controls but doses still non-negative),
+`HeterogeneousAdoptionDiD` IS a routing alternative for
+graded-adoption designs, and linear DiD with the treatment as
+a continuous covariate is another; see §2 for the full routing
+discussion.
 5. Counter-example: had `treatment_dose.dose_min < 0` (continuous
 panel with some negative-valued treated doses, e.g. a
 centered-around-zero treatment encoding), with a `first_treat`
-column consistent with the dose column (negative-dose units
-labeled `first_treat > 0`), `ContinuousDiD.fit()` would raise
-at line 287-294 ("Dose must be strictly positive for treated
-units (D > 0)"). The principled fixes are to re-encode the
-treatment to a non-negative support (e.g. shift or absolute
-value, with the methodology change documented and the new
-estimand reported on the re-encoded scale) or to route to a
-different estimator. Do not relabel negative-dose units as
-`first_treat == 0` to coerce them away — that is implementation
-behavior for inconsistent inputs, not a documented routing
-option.
+column consistent with the dose column, `ContinuousDiD.fit()`
+would raise at line 287-294 ("Dose must be strictly positive
+for treated units"). `HeterogeneousAdoptionDiD` is **not** a
+routing alternative here either — HAD requires non-negative
+dose support (`had.py:1450-1459`, paper Section 2). The
+applicable alternative is linear DiD with the treatment as a
+signed continuous covariate; see §2 for the full routing
+discussion.
 6. Fit `ContinuousDiD`; the result object exposes the dose-response
 curve (`ATT(d)`) and average causal response (`ACRT(d)`); choose
 the headline estimand based on the business question (overall

diff_diff/profile.py

Lines changed: 29 additions & 27 deletions
@@ -87,19 +87,7 @@ class TreatmentDoseShape:
 lowest-dose-as-control not yet implemented), because the
 canonical setup ties ``first_treat == 0`` to ``D_i == 0``.
 Failure means no never-treated controls exist on the dose
-column. ``ContinuousDiD`` as currently implemented does not
-apply (the paper's lowest-dose-as-control fallback in Remark
-3.1 is not implemented here). Routing alternatives that do
-not require ``P(D=0) > 0``: ``HeterogeneousAdoptionDiD`` for
-graded-adoption designs, or linear DiD with the treatment
-as a continuous covariate. Re-encoding the treatment column
-to a different scale is an agent-side preprocessing choice
-that changes the estimand; it is **not** documented in
-REGISTRY as a supported fallback. Do **not** relabel
-positive-dose units as ``first_treat == 0`` either: that
-triggers ``fit()``'s force-zero coercion path, which is
-implementation behavior for inconsistent inputs and is also
-not a documented routing option.
+column; see routing notes below.
 2. ``PanelProfile.treatment_varies_within_unit == False``
 (per-unit full-path dose constancy on the dose column). This
 IS the actual fit-time gate, matching
@@ -120,20 +108,34 @@ class TreatmentDoseShape:
 Predicts ``ContinuousDiD.fit()``'s strictly-positive-treated-
 dose requirement (raises ``ValueError`` on negative dose for
 ``first_treat > 0`` units, ``continuous_did.py:287-294``).
-Under the canonical setup, treated units carry their dose
-across all periods so ``dose_min`` over non-zero values
-reflects the smallest treated dose. Failure means some
-treated units have negative dose; ``ContinuousDiD`` as
-currently implemented does not apply. Routing alternatives:
-``HeterogeneousAdoptionDiD`` or linear DiD with the
-treatment as a continuous covariate. Re-encoding the
-treatment to a non-negative scale is an agent-side
-preprocessing choice that changes the estimand; not
-documented in REGISTRY as a supported fallback.
-The estimator's force-zero coercion on ``first_treat == 0``
-rows with nonzero ``dose`` is implementation behavior for
-inconsistent inputs (e.g. an accidentally-nonzero row on a
-never-treated unit), not a methodological fallback.
+Failure means some treated units have negative dose; see
+routing notes below.
+
+Routing alternatives when (1) or (5) fails:
+
+- When (1) fails (no never-treated controls but all observed
+doses non-negative): ``ContinuousDiD`` does not apply (Remark
+3.1 lowest-dose-as-control is not implemented).
+``HeterogeneousAdoptionDiD`` IS a candidate for graded-adoption
+designs (HAD's contract requires non-negative dose, satisfied
+here); linear DiD with the treatment as a continuous covariate
+is another.
+- When (5) fails (negative treated doses):
+``HeterogeneousAdoptionDiD`` is **not** a fallback either —
+HAD raises on negative post-period dose (``had.py:1450-1459``,
+paper Section 2). Linear DiD with the treatment as a signed
+continuous covariate is the applicable routing alternative.
+- Re-encoding the treatment column (shifting, absolute value,
+etc.) is an agent-side preprocessing choice that changes the
+estimand and is not documented in REGISTRY as a supported
+fallback; if the agent re-encodes to non-negative support,
+both ``ContinuousDiD`` and ``HeterogeneousAdoptionDiD``
+become candidates again on the re-encoded scale.
+- Do **not** relabel positive- or negative-dose units as
+``first_treat == 0``: that triggers ``ContinuousDiD.fit()``'s
+force-zero coercion path, which is implementation behavior
+for inconsistent inputs (e.g., an accidentally-nonzero row on
+a never-treated unit), not a documented routing option.

 The agent must still validate the supplied ``first_treat``
 column independently: it must contain at least one
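The docstring warns that labeling a unit `first_treat == 0` while it carries a nonzero dose triggers the force-zero coercion path. A small check along the following lines could surface such units before fitting. It is a sketch under stated assumptions: `inconsistent_never_treated` is a hypothetical helper, and the input shape (a mapping from unit id to a `(first_treat, dose)` pair with time-invariant dose) is chosen for illustration rather than taken from the library.

```python
from typing import Dict, List, Tuple


def inconsistent_never_treated(units: Dict[str, Tuple[int, float]]) -> List[str]:
    """Hypothetical pre-fit check (not a diff_diff API).

    Returns the ids of units labeled first_treat == 0 that still carry a
    nonzero dose, i.e. the inputs the docstring calls inconsistent.
    """
    return sorted(
        unit
        for unit, (first_treat, dose) in units.items()
        if first_treat == 0 and dose != 0
    )
```

An empty result means no unit would hit the coercion path for this reason; a non-empty result is a prompt to fix the labels or the doses rather than to rely on the coercion.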

tests/test_guides.py

Lines changed: 34 additions & 0 deletions
@@ -163,6 +163,40 @@ def test_autonomous_count_outcome_uses_asf_outcome_scale_estimand():
     )


+def test_autonomous_negative_dose_path_does_not_route_to_had():
+    """The §5.2 negative-dose counter-example must not present
+    `HeterogeneousAdoptionDiD` as a direct routing alternative
+    when `dose_min < 0`. HAD's contract requires non-negative
+    dose support and raises on negative post-period dose
+    (`had.py:1450-1459`, paper Section 2). Routing to HAD on a
+    negative-dose panel without re-encoding would steer the agent
+    into an unsupported estimator path. Guards against the wording
+    regressing back to a too-broad "HAD as fallback" framing on
+    this branch."""
+    text = get_llm_guide("autonomous")
+    # Locate counter-example #5 (negative-dose path) within §5.2.
+    sec_5_2_start = text.index("### §5.2 Continuous-dose panel")
+    sec_5_3_start = text.index("### §5.3 Count-shaped outcome")
+    sec_5_2 = text[sec_5_2_start:sec_5_3_start]
+    # The negative-dose paragraph must explicitly state HAD is NOT a
+    # routing alternative on this branch. We assert the disqualifying
+    # phrase is present; we do not forbid `HeterogeneousAdoptionDiD`
+    # entirely because the section may legitimately mention it as a
+    # candidate AFTER re-encoding.
+    assert "HAD" in sec_5_2 or "HeterogeneousAdoptionDiD" in sec_5_2, (
+        "§5.2 must mention HAD by name on the negative-dose branch "
+        "so its non-applicability can be explicitly called out."
+    )
+    assert "had.py:1450-1459" in sec_5_2, (
+        "§5.2 must cite `had.py:1450-1459` on the negative-dose "
+        "branch to anchor HAD's non-negative-dose contract (HAD "
+        "raises on negative post-period dose, paper Section 2). "
+        "Without this citation, the agent could route a "
+        "negative-dose panel directly to HAD and hit a fit-time "
+        "error."
+    )
+
+
 def test_autonomous_worked_examples_avoid_recommender_language():
     """Worked examples must mirror the rest of the guide's discipline:
     no prescriptive language in the example reasoning. Multiple paths
0 commit comments

Comments
 (0)