|
295 | 295 | "source": [ |
296 | 296 | "## 3. Naive vs survey-aware headline fit\n", |
297 | 297 | "\n", |
298 | | - "T20's headline path collapses to two periods (pre-mean vs post-mean per\n", |
299 | | - "state) and fits HAD with `design=\"auto\"` - the heuristic lands on\n", |
300 | | - "`continuous_near_d_lower` (Design 1) on this dose support, with the\n", |
301 | | - "target estimand `WAS_d_lower` (Weighted Average Slope at the lower\n", |
302 | | - "boundary). T22 fits the same configuration twice: once naive (no\n", |
303 | | - "`survey_design` argument), once survey-aware\n", |
304 | | - "(`survey_design=sd`). Both fits use the same local-linear estimator family at d_lower,\n", |
305 | | - "but the moment computations in `_fit_continuous` switch to weighted\n", |
306 | | - "form only when weights are present (`had.py:3747-3760` for the\n", |
307 | | - "denominator; `:3803-3808` for `dy_mean`). The naive fit uses the\n", |
308 | | - "unweighted local-linear `tau_bc`, the unweighted `dy_mean`, and\n", |
309 | | - "the unweighted denominator `E[D - d_lower]`. The survey-aware fit\n", |
310 | | - "uses the WEIGHTED `tau_bc` (via `bias_corrected_local_linear(...,\n", |
311 | | - "weights=weights_arr)`), the weighted `np.average(dy, weights=...)`,\n", |
312 | | - "and the weighted `np.average(d - d_lower, weights=...)`. On\n", |
313 | | - "this DGP the weight CV (~0.30) and the dose-distribution shape do\n", |
314 | | - "not co-vary strongly enough to shift the boundary slope materially,\n", |
315 | | - "so the two ATTs are numerically close on this DGP. The SE\n", |
316 | | - "and CI differ because the survey path additionally folds the PSU\n", |
317 | | - "clustering and FPC into the variance via the Binder TSL composition.\n" |
| 298 | + "T20's headline path collapses to two periods (pre-mean vs post-mean\n", |
| 299 | + "per state) and fits HAD with `design=\"auto\"` — the heuristic lands\n", |
| 300 | + "on `continuous_near_d_lower` (Design 1), with the target estimand\n", |
| 301 | + "`WAS_d_lower`. T22 fits the same configuration twice: once naive\n", |
| 302 | + "(no `survey_design` argument), once survey-aware. Both fits share\n", |
| 303 | + "the same local-linear machinery at d_lower; the survey path\n", |
| 304 | + "additionally consumes the weights in the local-linear `tau_bc`\n", |
| 305 | + "boundary fit, in the weighted ΔY mean, and in the weighted\n", |
| 306 | + "denominator `E_w[D - d_lower]`. On this DGP the weight CV (~0.30)\n", |
| 307 | + "and dose distribution do not co-vary strongly enough to shift the\n", |
| 308 | + "boundary slope materially, so the two ATTs land close. The SE and\n", |
| 309 | + "CI differ because the survey path folds PSU clustering and FPC\n", |
| 310 | + "into the variance via Binder TSL.\n" |
318 | 311 | ] |
319 | 312 | }, |
320 | 313 | { |
|
413 | 406 | "estimand, not a bug. We unpack it in the next section.\n", |
414 | 407 | "\n", |
415 | 408 | "\n", |
416 | | - "**A note on the non-testable identifying assumption.** Design 1\n", |
417 | | - "(`continuous_near_d_lower`) requires **Assumption 6** from de\n", |
418 | | - "Chaisemartin et al. (2026) for point identification of\n", |
419 | | - "`WAS_d_lower`, or **Assumption 5** for sign identification only.\n", |
420 | | - "Both are about local linearity of the dose-response near `d_lower`\n", |
421 | | - "and are **not testable from data** — the linearity diagnostics\n", |
422 | | - "exercised in §6 (Stute, Yatchew, joint pretrends, joint\n", |
423 | | - "homogeneity) are necessary but not sufficient. Justify Assumption\n", |
424 | | - "6 from domain knowledge: is there reason to believe the marginal\n", |
425 | | - "effect of the next $1K of supplemental spend is roughly constant\n", |
426 | | - "in the $5K-$50K range? On this DGP it is, by construction; in a\n", |
427 | | - "real analysis, this is the load-bearing methodology caveat\n", |
428 | | - "alongside the QUG-under-survey deferral §6 calls out. The library\n", |
429 | | - "fires a `UserWarning` flagging this on every Design 1 fit; we\n", |
430 | | - "let it surface in the cell above for the headline fit and\n", |
431 | | - "narrowly filter it on subsequent fits to keep the cell output\n", |
432 | | - "focused." |
| 409 | + "**A non-testable identifying assumption.** Design 1 requires\n", |
| 410 | + "**Assumption 6** for point identification of `WAS_d_lower` (or\n", |
| 411 | + "**Assumption 5** for sign identification only) — both are about\n", |
| 412 | + "local linearity of the dose-response near `d_lower` and are **not\n", |
| 413 | + "testable from data**. The §6 linearity diagnostics (Stute,\n", |
| 414 | + "Yatchew, joint pretrends/homogeneity) are necessary but not\n", |
| 415 | + "sufficient. Assumption 6 itself is justified from domain\n", |
| 416 | + "knowledge (is the marginal effect of the next $1K of supplemental\n", |
| 417 | + "spend roughly constant in the $5K-$50K range?). The library fires\n", |
| 418 | + "a `UserWarning` on every Design 1 fit; the headline cell above\n", |
| 419 | + "lets it surface, subsequent cells filter it as redundant. This is\n", |
| 420 | + "the load-bearing methodology caveat alongside the QUG-under-survey\n", |
| 421 | + "deferral (§6)." |
433 | 422 | ] |
434 | 423 | }, |
435 | 424 | { |
|
439 | 428 | "source": [ |
440 | 429 | "## 4. Why the SE inflation is modest for HAD\n", |
441 | 430 | "\n", |
442 | | - "The HAD `WAS_d_lower` estimand is the **average slope above d_lower**:\n", |
443 | | - "`WAS_{d̲} = (E[ΔY] - lim_{d↓d̲} E[ΔY | D_2 ≤ d]) / E[D_2 - d̲]`\n", |
444 | | - "(REGISTRY § HeterogeneousAdoptionDiD; `had.py:21-31`). The estimator\n", |
445 | | - "uses a **local-linear boundary fit** to estimate the\n", |
446 | | - "`lim_{d↓d_lower} E[ΔY | D_2 ≤ d]` term — the only component of the\n", |
447 | | - "estimand that requires nonparametric identification. The leading-\n", |
448 | | - "order variance is therefore dominated by the influence functions of\n", |
449 | | - "units near `d_lower`, NOT by the full panel. With dose ~ Uniform[5, 50] and\n", |
450 | | - "60 states, only a handful of states sit close to d_lower ~ 5 - and\n", |
451 | | - "those are the units whose IFs dominate `Var(WAS_d_lower)`. The\n", |
452 | | - "PSU-level cluster correlation can amplify the variance only as much\n", |
453 | | - "as those few units are correlated with PSU-mates. With 2 states/PSU\n", |
454 | | - "and only a small share of states near the boundary, the within-PSU\n", |
| 431 | + "**The intuition.** `WAS_d_lower` is the average slope above d_lower,\n", |
| 432 | + "but its leading-order variance reads off a local-linear boundary\n", |
| 433 | + "fit at `d_lower` — and that fit only weights units near the\n", |
| 434 | + "boundary. With dose ~ Uniform[5, 50] and 60 states, only a handful\n", |
| 435 | + "of states sit close to d_lower ~ 5, and those are the units whose\n", |
| 436 | + "influence functions dominate `Var(WAS_d_lower)`. The PSU-level\n", |
| 437 | + "cluster correlation can amplify the variance only as much as those\n", |
| 438 | + "few units are correlated with PSU-mates. With 2 states/PSU and\n", |
| 439 | + "only a small share of states near the boundary, the within-PSU\n", |
455 | 440 | "correlation has a small lever to act on.\n", |
456 | 441 | "\n", |
| 442 | + "**Formal definition.** `WAS_{d̲} = (E[ΔY] - lim_{d↓d̲} E[ΔY | D_2\n", |
| 443 | + "≤ d]) / E[D_2 - d̲]` (REGISTRY § HeterogeneousAdoptionDiD;\n", |
| 444 | + "`had.py:21-31`). The estimator uses a local-linear boundary fit at\n", |
| 445 | + "`d_lower` to estimate the `lim_{d↓d̲} E[ΔY | D_2 ≤ d]` term — the\n", |
| 446 | + "only component requiring nonparametric identification.\n", |
| 447 | + "\n", |
457 | 448 | "Contrast with the event-study path: each event-time horizon is a\n", |
458 | 449 | "**separate** local-linear fit on that horizon's first differences\n", |
459 | 450 | "`ΔY_{g,t} = Y_{g,t} - Y_{g,F-1}` against the common regressor\n", |
|
545 | 536 | "Refit with `aggregate=\"event_study\"` and `cband=True` to get\n", |
546 | 537 | "per-horizon ATT estimates plus a sup-t confidence band that adjusts\n", |
547 | 538 | "for the multiple-horizon comparison. The cband is computed via a\n", |
548 | | - "multiplier bootstrap that aggregates per-PSU IF tensor under the\n", |
549 | | - "survey design (Phase 4.5 B composition; the Phase 4.5 C work\n", |
550 | | - "covered the survey-aware Stute pretests demonstrated in §6).\n" |
| 539 | + "multiplier bootstrap that aggregates the per-PSU IF tensor under\n", |
| 540 | + "the survey design.\n" |
551 | 541 | ] |
552 | 542 | }, |
553 | 543 | { |
|
785 | 775 | "> rollout.\n", |
786 | 776 | "\n", |
787 | 777 | "> **For the methodologist.** The HAD pretest workflow runs two\n", |
788 | | - "> diagnostic passes; the QUG step is deferred under survey/weights\n", |
789 | | - "> per Phase 4.5 C0 (the load-bearing caveat we owe the audience).\n", |
790 | | - ">\n", |
791 | | - "> 1. **Overall (two-period) path:** `Stute` (CvM linearity test on\n", |
792 | | - "> residuals) + `Yatchew-HR` (closed-form weighted-OLS sandwich,\n", |
793 | | - "> `null=\"linearity\"` only - T22 does not exercise the\n", |
794 | | - "> `mean_independence` mode). Both fail-to-reject; verdict reads\n", |
795 | | - "> `\"Stute and Yatchew linearity diagnostics fail-to-reject\n", |
796 | | - "> (linearity-conditional verdict; QUG-under-survey deferred per\n", |
797 | | - "> Phase 4.5 C0)\"`.\n", |
798 | | - ">\n", |
799 | | - "> 2. **Event-study path:** `joint pre-trends` (joint-Stute over the\n", |
800 | | - "> three pre-launch placebo horizons) + `joint homogeneity`\n", |
801 | | - "> (joint-Stute over the four post-launch horizons). Both\n", |
802 | | - "> fail-to-reject; verdict reads `\"joint pre-trends and joint\n", |
803 | | - "> linearity diagnostics fail-to-reject (linearity-conditional\n", |
804 | | - "> verdict; QUG-under-survey deferred per Phase 4.5 C0)\"`.\n", |
805 | | - "> `report.yatchew is None` and `report.stute is None` on this\n", |
806 | | - "> path - those single-horizon tests are overall-only.\n", |
| 778 | + "> diagnostic passes — overall (`Stute` CvM + `Yatchew-HR`\n", |
| 779 | + "> closed-form) on the two-period collapse, and event-study\n", |
| 780 | + "> (`joint pre-trends` + `joint homogeneity`, both joint-Stute)\n", |
| 781 | + "> on the full panel. Both passes fail-to-reject on this DGP. Both\n", |
| 782 | + "> verdicts end in `(linearity-conditional verdict; QUG-under-survey\n", |
| 783 | + "> deferred per Phase 4.5 C0)` — the load-bearing C0 caveat. On the\n", |
| 784 | + "> event-study path `report.yatchew` and `report.stute` are `None`;\n", |
| 785 | + "> those single-horizon tests are overall-only.\n", |
807 | 786 | ">\n", |
808 | | - "> Both paths share the QUG-under-survey deferral suffix. The\n", |
809 | | - "> design-based SE on the headline fit is ~10% larger than the naive\n", |
810 | | - "> SE - smaller than the inflation a CallawaySantAnna or\n", |
| 787 | + "> The design-based SE on the headline fit is ~10% larger than the\n", |
| 788 | + "> naive SE — smaller than the inflation a CallawaySantAnna or\n", |
811 | 789 | "> LinearRegression coefficient would see at this PSU correlation,\n", |
812 | | - "> because HAD uses a local-linear boundary fit at d_lower to\n", |
813 | | - "> estimate the boundary-limit term in the `WAS-d_lower` formula\n", |
814 | | - "> (variance is dominated by the few states near the boundary, not\n", |
815 | | - "> by the full panel; see section 4).\n" |
| 790 | + "> because HAD uses a local-linear boundary fit at `d_lower` to\n", |
| 791 | + "> estimate the boundary-limit term in the `WAS_d_lower` formula\n", |
| 792 | + "> (variance is dominated by the few states near the boundary; see\n", |
| 793 | + "> §4).\n" |
816 | 794 | ] |
817 | 795 | }, |
818 | 796 | { |
|