T21 P3 wording cleanups: align Section 7 + drift docstring with revised tutorial

igerber · claude · igerber · commit 62b756a613ea · 2026-05-10T11:26:23.000-04:00
Fixes two stale shorthand phrasings that were inconsistent with the revised methodology framing:

- Section 7 Extensions: "single Design 1 panel" → "single panel where QUG
  led the workflow to select the continuous_at_zero (Design 1) identification
  path" (matches the corrected Section 2 wording).
- `test_event_study_pretrends_fails_to_reject` docstring quoted "close to
  alpha = 0.05 but conclusive"; the user-facing text now says
  "warrants scrutiny" - update internal docstring to match.

No methodology change, no new pins; all 15 drift tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/docs/tutorials/21_had_pretest_workflow.ipynb b/docs/tutorials/21_had_pretest_workflow.ipynb
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "9e25598f",
+   "id": "2c409551",
    "metadata": {},
    "source": [
     "# Tutorial 21: HAD Pre-test Workflow - Running the Pre-test Diagnostics on the Brand Campaign Panel\n",
@@ -14,7 +14,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "0cc1feee",
+   "id": "1ccaad91",
    "metadata": {},
    "source": [
     "## 1. The Pre-test Battery\n",
@@ -31,7 +31,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "9ac9f15b",
+   "id": "c110fe0e",
    "metadata": {},
    "source": [
     "## 2. The Panel\n",
@@ -42,13 +42,13 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "id": "4ced81a7",
+   "id": "05269d1c",
    "metadata": {
     "execution": {
-     "iopub.execute_input": "2026-05-09T23:46:44.813436Z",
-     "iopub.status.busy": "2026-05-09T23:46:44.813125Z",
-     "iopub.status.idle": "2026-05-09T23:46:45.859473Z",
-     "shell.execute_reply": "2026-05-09T23:46:45.859187Z"
+     "iopub.execute_input": "2026-05-10T00:17:36.394301Z",
+     "iopub.status.busy": "2026-05-10T00:17:36.394076Z",
+     "iopub.status.idle": "2026-05-10T00:17:37.818650Z",
+     "shell.execute_reply": "2026-05-10T00:17:37.818348Z"
     }
    },
    "outputs": [
@@ -116,7 +116,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "53772584",
+   "id": "91811549",
    "metadata": {},
    "source": [
     "## 3. Step 1: The Overall Workflow (Two-Period Path)\n",
@@ -129,13 +129,13 @@
   {
    "cell_type": "code",
    "execution_count": 2,
-   "id": "1d7d1a0e",
+   "id": "cbda5c0c",
    "metadata": {
     "execution": {
-     "iopub.execute_input": "2026-05-09T23:46:45.860769Z",
-     "iopub.status.busy": "2026-05-09T23:46:45.860646Z",
-     "iopub.status.idle": "2026-05-09T23:46:45.902629Z",
-     "shell.execute_reply": "2026-05-09T23:46:45.902302Z"
+     "iopub.execute_input": "2026-05-10T00:17:37.819909Z",
+     "iopub.status.busy": "2026-05-10T00:17:37.819802Z",
+     "iopub.status.idle": "2026-05-10T00:17:37.858844Z",
+     "shell.execute_reply": "2026-05-10T00:17:37.858574Z"
     }
    },
    "outputs": [
@@ -188,7 +188,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "bbc73e9e",
+   "id": "9452bc09",
    "metadata": {},
    "source": [
     "**Reading the overall verdict.** Three things to note.\n",
@@ -203,13 +203,13 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "id": "d009ea15",
+   "id": "7dca161a",
    "metadata": {
     "execution": {
-     "iopub.execute_input": "2026-05-09T23:46:45.904054Z",
-     "iopub.status.busy": "2026-05-09T23:46:45.903927Z",
-     "iopub.status.idle": "2026-05-09T23:46:45.906185Z",
-     "shell.execute_reply": "2026-05-09T23:46:45.905937Z"
+     "iopub.execute_input": "2026-05-10T00:17:37.860034Z",
+     "iopub.status.busy": "2026-05-10T00:17:37.859953Z",
+     "iopub.status.idle": "2026-05-10T00:17:37.861749Z",
+     "shell.execute_reply": "2026-05-10T00:17:37.861541Z"
     }
    },
    "outputs": [
@@ -269,15 +269,15 @@
   },
   {
    "cell_type": "markdown",
-   "id": "d6258552",
+   "id": "bb4d7ef5",
    "metadata": {},
    "source": [
     "A note on the Yatchew row. The `T_hr` statistic is **very large and negative** (~-35,000). That looks alarming but is correct here: under perfectly linear dose-response with very heterogeneous doses (Uniform[\\$0.01K, \\$50K]) and 60 sorted-by-dose units, the differencing variance `sigma2_diff` (which captures the squared gap between adjacent-by-dose units' `dy` values) is much larger than the OLS residual variance `sigma2_lin`. The formula `T_hr = sqrt(G) * (sigma2_lin - sigma2_diff) / sigma2_W` then goes massively negative, p-value rounds to 1.0, and we comfortably fail to reject linearity. (For a different way to look at this same test, see the Yatchew side panel later in the notebook.)\n"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "9b25fada",
+   "id": "0bb3c4e3",
    "metadata": {},
    "source": [
     "## 4. Step 2: Upgrade to the Event-Study Workflow\n",
@@ -296,13 +296,13 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "id": "6dd7f0f3",
+   "id": "e4903c58",
    "metadata": {
     "execution": {
-     "iopub.execute_input": "2026-05-09T23:46:45.907227Z",
-     "iopub.status.busy": "2026-05-09T23:46:45.907141Z",
-     "iopub.status.idle": "2026-05-09T23:46:46.040067Z",
-     "shell.execute_reply": "2026-05-09T23:46:46.039690Z"
+     "iopub.execute_input": "2026-05-10T00:17:37.862773Z",
+     "iopub.status.busy": "2026-05-10T00:17:37.862692Z",
+     "iopub.status.idle": "2026-05-10T00:17:37.988346Z",
+     "shell.execute_reply": "2026-05-10T00:17:37.988066Z"
     }
    },
    "outputs": [
@@ -342,7 +342,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "5f12d7aa",
+   "id": "b820c289",
    "metadata": {},
    "source": [
     "**Reading the event-study verdict.** Now the verdict reads `\"QUG, joint pre-trends, and joint linearity diagnostics fail-to-reject (TWFE admissible under Section 4 assumptions)\"`. The `\"deferred\"` caveat from the overall path is gone because the joint pre-trends and joint homogeneity diagnostics now ran. The structural fields confirm: `pretrends_joint` and `homogeneity_joint` are both populated.\n",
@@ -355,13 +355,13 @@
   {
    "cell_type": "code",
    "execution_count": 5,
-   "id": "cfaa750b",
+   "id": "cd1d2dde",
    "metadata": {
     "execution": {
-     "iopub.execute_input": "2026-05-09T23:46:46.041790Z",
-     "iopub.status.busy": "2026-05-09T23:46:46.041665Z",
-     "iopub.status.idle": "2026-05-09T23:46:46.043716Z",
-     "shell.execute_reply": "2026-05-09T23:46:46.043421Z"
+     "iopub.execute_input": "2026-05-10T00:17:37.989443Z",
+     "iopub.status.busy": "2026-05-10T00:17:37.989364Z",
+     "iopub.status.idle": "2026-05-10T00:17:37.991250Z",
+     "shell.execute_reply": "2026-05-10T00:17:37.990991Z"
     }
    },
    "outputs": [
@@ -434,7 +434,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "b95cbac1",
+   "id": "072e39d8",
    "metadata": {},
    "source": [
     "The pre-trends p-value (~0.07) sits close to the conventional alpha = 0.05 threshold. The test does not reject at alpha = 0.05, but the near-threshold p-value warrants scrutiny - the diagnostic is not failing in a clearly-far-from-rejection regime. In a real analysis this would warrant a closer look at the per-horizon CvM contributions (visible in `per_horizon_stats`) and possibly a Pierce-Schott-style linear-trend detrending via `trends_lin=True` (an extension we do not demonstrate here; see `did_had_pretest_workflow`'s docstring).\n",
@@ -446,7 +446,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "c0d6ddbb",
+   "id": "bba51a15",
    "metadata": {},
    "source": [
     "## 5. Side Panel: Yatchew-HR Null Modes\n",
@@ -462,13 +462,13 @@
   {
    "cell_type": "code",
    "execution_count": 6,
-   "id": "c231b096",
+   "id": "d0d4807d",
    "metadata": {
     "execution": {
-     "iopub.execute_input": "2026-05-09T23:46:46.045080Z",
-     "iopub.status.busy": "2026-05-09T23:46:46.044960Z",
-     "iopub.status.idle": "2026-05-09T23:46:46.050811Z",
-     "shell.execute_reply": "2026-05-09T23:46:46.050511Z"
+     "iopub.execute_input": "2026-05-10T00:17:37.992213Z",
+     "iopub.status.busy": "2026-05-10T00:17:37.992138Z",
+     "iopub.status.idle": "2026-05-10T00:17:37.996905Z",
+     "shell.execute_reply": "2026-05-10T00:17:37.996663Z"
     }
    },
    "outputs": [
@@ -530,7 +530,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "f0a34622",
+   "id": "45254f92",
    "metadata": {},
    "source": [
     "**Reading the side-panel comparison.**\n",
@@ -545,7 +545,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "fa7bcd99",
+   "id": "6bdc1f7d",
    "metadata": {},
    "source": [
     "## 6. Communicating the Diagnostics to Leadership\n",
@@ -565,12 +565,12 @@
   },
   {
    "cell_type": "markdown",
-   "id": "0126ef99",
+   "id": "d866da6c",
    "metadata": {},
    "source": [
     "## 7. Extensions\n",
     "\n",
-    "This tutorial covered the composite pre-test workflow on a single Design 1 panel. A few directions we did not exercise here:\n",
+    "This tutorial covered the composite pre-test workflow on a single panel where QUG led the workflow to select the `continuous_at_zero` (Design 1) identification path. A few directions we did not exercise here:\n",
     "\n",
     "- **Survey-weighted / population-weighted inference** - HAD's pre-test workflow accepts `survey_design=` (or the deprecated `survey=` / `weights=` aliases) for design-based inference. The QUG step is permanently deferred under survey weighting (extreme-value theory under complex sampling is not a settled toolkit); the linearity family runs with PSU-level Mammen multiplier bootstrap (Stute and joint variants) and weighted OLS + weighted variance components (Yatchew). A follow-up tutorial covers this path end-to-end.\n",
     "- **`trends_lin=True` (Pierce-Schott Eq 17 / 18 detrending)** - mirrors R `DIDHAD::did_had(..., trends_lin=TRUE)`. Forwards into both joint pre-trends and joint homogeneity wrappers; consumes the placebo at `base_period - 1` and skips Step 2 if no earlier placebo survives the drop. Useful when you suspect linear time trends correlated with dose but want to keep the joint-Stute machinery.\n",
@@ -587,7 +587,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "cad9c1d7",
+   "id": "0105ae26",
    "metadata": {},
    "source": [
     "## 8. Summary Checklist\n",
diff --git a/tests/test_t21_had_pretest_workflow_drift.py b/tests/test_t21_had_pretest_workflow_drift.py
@@ -270,7 +270,8 @@ def test_event_study_homogeneity_horizons_correct(event_study_report):
 
 def test_event_study_pretrends_fails_to_reject(event_study_report):
     """Section 4 narrative quotes the pre-trends p-value as 'close to
-    alpha = 0.05 but conclusive' (~0.07 from numbers.json). Use binary
+    alpha = 0.05 ... warrants scrutiny' (~0.07 from numbers.json) -
+    non-rejection is not a clean pass at this margin. Use binary
     fail-to-reject + a wide abs tolerance band - bootstrap p-values
     near alpha are the most sensitive to RNG path differences."""
     pj = event_study_report.pretrends_joint