Commit 0328b4a

igerber and claude committed
Address PR #396 R2 review (1 P1)
P1: The "Resolved estimand is not what I expected" inspection snippet computed `d.min()` over `had_data['dose']` (the full panel column), but HAD's `_detect_design()` resolves on `D_{g,F}`, the per-unit dose at the first treated period, not on the full panel. Per `diff_diff/had.py:1834-1888` (event-study path) and `diff_diff/had.py:4089-4091` (`_detect_design(d_arr)` where `d_arr` is `D_{g,F}`), the detector ignores the structural pre-period zeros that HAD requires (`D_{g,t} = 0` for `t < F`).

Consequence: on every valid HAD panel, `had_data['dose'].min()` is always 0 and the snippet would report "Design 1' (WAS)" regardless of the true resolution, which is exactly the false sense of confirmation the troubleshooting page is meant to dispel.

Fix: rewrote the snippet to extract `d_at_F = had_data.loc[had_data['period'] == F].set_index('unit')['dose']` (per-unit post-period dose, mirroring `had.py:1886-1888`) and compute the threshold check on that series. Renamed printed labels from `d.min()` to `D_{g,F}.min()` and `0.01 * median(|d|)` to `0.01 * median(|D_{g,F}|)` so the diagnostic syntactically matches the registry's rule statement.

Updated the surrounding **Cause** prose at `docs/troubleshooting.rst:486-495` to (a) state explicitly that detection runs on `D_{g,F}` not the panel column, (b) note that pre-period zeros on the panel column are structural and uninformative for design choice, and (c) restate the threshold rule and modal-fraction check on `D_{g,F}`.

Verification: `PYTHONPATH=. DIFF_DIFF_BACKEND=python pytest tests/test_doc_snippets.py` reports 111 passed, 4 skipped, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 8ccfadf commit 0328b4a
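
The failure mode described in the commit message can be reproduced on a toy panel. The sketch below is illustrative only: the panel, the value of `F`, and the per-unit doses are made up, and it uses plain pandas/NumPy rather than anything from `diff_diff`. It shows that the panel-column minimum is 0 by construction, while the minimum of `D_{g,F}` is what the threshold comparison actually needs.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: 4 units, periods 1..4, first treated period F = 3.
# Doses are zero before F (the structural requirement D_{g,t} = 0 for t < F).
F = 3
doses_at_F = {0: 0.5, 1: 0.8, 2: 1.2, 3: 2.0}  # made-up per-unit doses
rows = [
    {"unit": g, "period": t, "dose": d if t >= F else 0.0}
    for g, d in doses_at_F.items()
    for t in range(1, 5)
]
had_data = pd.DataFrame(rows)

# Old diagnostic: minimum over the full panel column (includes pre-period zeros).
panel_min = float(had_data["dose"].min())

# Corrected diagnostic: one dose per unit, taken at period F.
d_at_F = had_data.loc[had_data["period"] == F].set_index("unit")["dose"]
d_min = float(d_at_F.min())
d_thr = 0.01 * float(np.median(np.abs(d_at_F)))

print(f"panel min = {panel_min:.6g}")
print(f"D_{{g,F}}.min() = {d_min:.6g}; 0.01 * median(|D_{{g,F}}|) = {d_thr:.6g}")
```

On this panel the old check sees a minimum of 0 and would suggest Design 1', whereas `D_{g,F}.min()` sits well above the 1%-of-median threshold, which under the rule stated in the docs points to a Design 1 path instead.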

1 file changed

Lines changed: 27 additions & 19 deletions

docs/troubleshooting.rst

@@ -483,16 +483,20 @@ HeterogeneousAdoptionDiD (HAD) Issues
 **Problem:** ``HeterogeneousAdoptionDiD`` resolves ``target_parameter`` to
 ``"WAS_d_lower"`` when you expected ``"WAS"`` (or vice versa).

-**Cause:** HAD auto-detects the design path from the dose distribution. The
-``_detect_design`` rule resolves to Design 1' (``continuous_at_zero``,
-targets WAS) when EITHER ``d.min() == 0`` exactly OR ``d.min()`` is a small
-positive value below ``0.01 * median(|d|)`` (the small-share-of-treated
-escape clause). Otherwise (``d.min()`` larger than that threshold) the
-estimator routes to Design 1, with a further check for mass-point structure
-(modal fraction at ``d.min()`` exceeding 2% routes to ``mass_point``;
-otherwise ``continuous_near_d_lower``); both Design 1 paths target
-``WAS_{d_lower}``. So a Design 1 resolution only fires when ``d.min()``
-is meaningfully positive relative to the dose scale.
+**Cause:** HAD auto-detects the design path from the unit-level
+post-treatment dose ``D_{g,F}`` (the dose at the first treated period
+``F``, one value per unit), NOT from the full panel ``dose`` column. The
+panel column carries structural pre-period zeros (HAD requires
+``D_{g,t} = 0`` for ``t < F``), so ``had_data['dose'].min()`` is always
+zero on a valid HAD panel and tells you nothing about the resolved
+design. ``_detect_design`` then resolves on ``D_{g,F}`` and picks Design
+1' (``continuous_at_zero``, targets WAS) when EITHER
+``D_{g,F}.min() == 0`` exactly OR ``D_{g,F}.min()`` is a small positive
+value below ``0.01 * median(|D_{g,F}|)`` (the small-share-of-treated
+escape clause). Otherwise the estimator routes to Design 1, with a
+further check for mass-point structure (modal fraction at ``D_{g,F}.min()``
+exceeding 2% routes to ``mass_point``; otherwise
+``continuous_near_d_lower``); both Design 1 paths target ``WAS_{d_lower}``.

 **Solutions:**

@@ -516,12 +520,16 @@ is meaningfully positive relative to the dose scale.
 rows.append({'unit': g, 'period': t, 'y': y, 'dose': d})
 had_data = pd.DataFrame(rows)

-# Inspect the dose support before fitting
-d = had_data['dose'].to_numpy()
-print(had_data['dose'].describe())
-print(f"d.min() = {d.min():.6g}; "
-      f"0.01 * median(|d|) = {0.01 * np.median(np.abs(d)):.6g}; "
-      f"d.min() < threshold => Design 1' (WAS)")
+# Inspect the support the detector actually uses: per-unit dose at the
+# first treated period F. Pre-period zeros on the panel column are
+# structural and ignored by `_detect_design()`.
+d_at_F = had_data.loc[had_data['period'] == F].set_index('unit')['dose']
+print(d_at_F.describe())
+d_min = float(d_at_F.min())
+d_thr = 0.01 * float(np.median(np.abs(d_at_F)))
+print(f"D_{{g,F}}.min() = {d_min:.6g}; "
+      f"0.01 * median(|D_{{g,F}}|) = {d_thr:.6g}; "
+      f"D_{{g,F}}.min() < threshold => Design 1' (WAS)")

 # Check the resolved estimand after fitting
 est = HeterogeneousAdoptionDiD()
@@ -530,9 +538,9 @@ is meaningfully positive relative to the dose scale.
 aggregate='event_study')
 print(f"Resolved: {results.target_parameter}")

-# If you intend Design 1' but `d.min()` exceeds the threshold, verify
-# the dose-variable encoding (e.g. log-transformed doses where 0 was
-# mapped to a small positive value larger than 1% of the median).
+# If you intend Design 1' but `D_{g,F}.min()` exceeds the threshold,
+# verify the dose-variable encoding (e.g. log-transformed doses where
+# 0 was mapped to a small positive value larger than 1% of the median).

 "Mass-point design selected"
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
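
The routing rule restated in the updated **Cause** prose can be written out as a small standalone function. This is a sketch of the rule as the docs describe it, not the actual `_detect_design` implementation in `diff_diff/had.py`: the function name `resolve_design_sketch`, its parameter names, and the exact-equality test used for the modal fraction are all assumptions made for illustration.

```python
import numpy as np

def resolve_design_sketch(d_at_F, rel_threshold=0.01, modal_cutoff=0.02):
    """Hypothetical restatement of the documented rule on D_{g,F}.

    Returns (design_path, target_parameter). Behaviour for negative doses
    is not covered by the documented rule and is not handled specially here.
    """
    d = np.asarray(d_at_F, dtype=float)
    d_min = d.min()
    threshold = rel_threshold * np.median(np.abs(d))

    # Design 1': minimum is exactly zero, or small-positive below the
    # 1%-of-median threshold (the small-share-of-treated escape clause).
    if d_min == 0 or (0 < d_min < threshold):
        return "continuous_at_zero", "WAS"

    # Design 1: check for a mass point at the minimum dose. Exact equality
    # is an assumption; the real detector may compare with a tolerance.
    modal_fraction = np.mean(d == d_min)
    if modal_fraction > modal_cutoff:
        return "mass_point", "WAS_d_lower"
    return "continuous_near_d_lower", "WAS_d_lower"

# Made-up dose vectors, one value per unit
print(resolve_design_sketch([0.0, 0.5, 1.0, 2.0]))
print(resolve_design_sketch([0.001, 1.0, 1.0, 2.0, 3.0]))
print(resolve_design_sketch([0.5] * 10 + [1.0] * 90))
print(resolve_design_sketch(np.linspace(0.5, 2.0, 100)))
```

Passing the `d_at_F` series from the corrected snippet to a function like this gives a quick cross-check against `results.target_parameter`; a mismatch would suggest that the documented rule and the fitted model disagree, or that the sketch's assumptions do not hold for that data.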
