Fix DGP homogeneity misstatement: disable dynamic effects in tutorial

igerber · claude · igerber · commit 85b76f4fe676 · 2026-03-15T13:24:27.000-04:00
The tutorial describes a homogeneous DGP with "True effect = 2.0" but
generate_staggered_data() defaults to dynamic_effects=True, making
effects grow with event time. Add dynamic_effects=False to both calls
so the DGP matches the prose.
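
The mismatch can be illustrated with a minimal sketch. It does not call
generate_staggered_data(); the helper `simulate_effect` and its linear
growth rule (`growth=0.1`) are assumptions made up for illustration and
may not match the library's actual dynamic-effects formula. The point is
only that a growing effect path has no single "true effect" of 2.0.

```python
import numpy as np


def simulate_effect(event_time, treatment_effect=2.0,
                    dynamic_effects=True, growth=0.1):
    """Hypothetical effect path by event time (NOT the library's code).

    Pre-treatment periods (event_time < 0) always get zero effect.
    With dynamic_effects=True the effect grows linearly after treatment
    (assumed rule); with False it stays at treatment_effect.
    """
    event_time = np.asarray(event_time, dtype=float)
    if dynamic_effects:
        effect = treatment_effect * (1.0 + growth * event_time)
    else:
        effect = np.full(event_time.shape, treatment_effect)
    return np.where(event_time >= 0, effect, 0.0)


event_times = np.arange(-2, 4)
dynamic = simulate_effect(event_times, dynamic_effects=True)
constant = simulate_effect(event_times, dynamic_effects=False)

post = event_times >= 0
print("dynamic post-treatment mean: ", dynamic[post].mean())
print("constant post-treatment mean:", constant[post].mean())
```

Under the assumed growth rule, only the constant path averages to the
stated 2.0 over post-treatment periods, which is why the prose needs
dynamic_effects=False.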

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/docs/tutorials/15_efficient_did.ipynb b/docs/tutorials/15_efficient_did.ipynb
@@ -63,7 +63,21 @@
    "cell_type": "markdown",
    "id": "4d734cd9",
    "metadata": {},
-   "source": "## What Makes EDiD Different?\n\nConsider a staggered adoption design with cohorts treated at periods 3, 5, and 7, plus a never-treated group. To estimate ATT(g=5, t=6), **Callaway-Sant'Anna** uses a single 2x2 comparison:\n\n> *Compare the outcome change from period 4 to 6 for cohort 5 versus the never-treated group.*\n\nBut under **PT-All** (parallel trends across all pre-treatment periods), there are *additional* valid comparisons. Cohort 7 is also untreated at period 6, so it can serve as a comparison group too. And periods 2 and 3 can serve as additional valid baselines beyond CS's default period 4. (Period 1 is excluded --- it is the fixed $Y_1$ reference used in every comparison's differencing, so using it as a baseline adds no information.)\n\nEach of these comparisons provides an unbiased estimate of ATT(g=5, t=6), but with different variances. **EDiD finds the optimal linear combination** --- the one that minimizes variance --- by computing the inverse covariance matrix of these \"generated outcomes\" (the paper calls this $\\Omega^*$).\n\nThe result: **matching post-treatment ATT(g,t) with CS under PT-Post**, but **tighter standard errors under PT-All** because EDiD exploits the overidentification.\n\n> **Key equation (for the curious):** The efficient weight vector is $w^* = \\frac{\\mathbf{1}' \\Omega^{*-1}}{\\mathbf{1}' \\Omega^{*-1} \\mathbf{1}}$, where $\\Omega^*$ is the covariance matrix of the generated outcomes across all valid (comparison group, baseline) pairs. This is the classic GLS optimal weighting. See REGISTRY.md or the paper for full derivations."
+   "source": [
+    "## What Makes EDiD Different?\n",
+    "\n",
+    "Consider a staggered adoption design with cohorts treated at periods 3, 5, and 7, plus a never-treated group. To estimate ATT(g=5, t=6), **Callaway-Sant'Anna** uses a single 2x2 comparison:\n",
+    "\n",
+    "> *Compare the outcome change from period 4 to 6 for cohort 5 versus the never-treated group.*\n",
+    "\n",
+    "But under **PT-All** (parallel trends across all pre-treatment periods), there are *additional* valid comparisons. Cohort 7 is also untreated at period 6, so it can serve as a comparison group too. And periods 2 and 3 can serve as additional valid baselines beyond CS's default period 4. (Period 1 is excluded --- it is the fixed $Y_1$ reference used in every comparison's differencing, so using it as a baseline adds no information.)\n",
+    "\n",
+    "Each of these comparisons provides an unbiased estimate of ATT(g=5, t=6), but with different variances. **EDiD finds the optimal linear combination** --- the one that minimizes variance --- by computing the inverse covariance matrix of these \"generated outcomes\" (the paper calls this $\\Omega^*$).\n",
+    "\n",
+    "The result: **matching post-treatment ATT(g,t) with CS under PT-Post**, but **tighter standard errors under PT-All** because EDiD exploits the overidentification.\n",
+    "\n",
+    "> **Key equation (for the curious):** The efficient weight vector is $w^* = \\frac{\\mathbf{1}' \\Omega^{*-1}}{\\mathbf{1}' \\Omega^{*-1} \\mathbf{1}}$, where $\\Omega^*$ is the covariance matrix of the generated outcomes across all valid (comparison group, baseline) pairs. This is the classic GLS optimal weighting. See REGISTRY.md or the paper for full derivations."
+   ]
   },
   {
    "cell_type": "markdown",
@@ -82,7 +96,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "data = generate_staggered_data(n_units=300, n_periods=10, treatment_effect=2.0, seed=42)\n",
+    "data = generate_staggered_data(n_units=300, n_periods=10, treatment_effect=2.0,\n",
+    "                               dynamic_effects=False, seed=42)\n",
     "\n",
     "print(f\"Shape: {data.shape}\")\n",
     "print(f\"Cohorts: {sorted(data['first_treat'].unique())}\")\n",
@@ -141,15 +156,50 @@
    "cell_type": "markdown",
    "id": "e1ad14f5",
    "metadata": {},
-   "source": "## PT-All vs PT-Post\n\nEDiD supports two parallel trends assumptions:\n\n- **PT-All** (`pt_assumption=\"all\"`): Parallel trends holds across *all* pre-treatment periods. The model is overidentified --- more valid comparisons exist than needed --- and EDiD exploits this for tighter SEs.\n- **PT-Post** (`pt_assumption=\"post\"`): Parallel trends holds only from `g-1` onward (the weaker, standard assumption). EDiD uses a single baseline (`g-1`) per cohort, matching `CallawaySantAnna(control_group='never_treated')` for post-treatment ATT(g,t). Pre-treatment diagnostics may differ from CS's default `base_period='varying'`.\n\nPT-All is the default because it delivers efficiency gains when the assumption holds. Use PT-Post if you're concerned about violations in early pre-treatment periods."
+   "source": [
+    "## PT-All vs PT-Post\n",
+    "\n",
+    "EDiD supports two parallel trends assumptions:\n",
+    "\n",
+    "- **PT-All** (`pt_assumption=\"all\"`): Parallel trends holds across *all* pre-treatment periods. The model is overidentified --- more valid comparisons exist than needed --- and EDiD exploits this for tighter SEs.\n",
+    "- **PT-Post** (`pt_assumption=\"post\"`): Parallel trends holds only from `g-1` onward (the weaker, standard assumption). EDiD uses a single baseline (`g-1`) per cohort, matching `CallawaySantAnna(control_group='never_treated')` for post-treatment ATT(g,t). Pre-treatment diagnostics may differ from CS's default `base_period='varying'`.\n",
+    "\n",
+    "PT-All is the default because it delivers efficiency gains when the assumption holds. Use PT-Post if you're concerned about violations in early pre-treatment periods."
+   ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "id": "35f70199",
    "metadata": {},
    "outputs": [],
-   "source": "# Fit under both assumptions\nresults_all = EfficientDiD(pt_assumption=\"all\").fit(\n    data, outcome='outcome', unit='unit', time='period',\n    first_treat='first_treat', aggregate='all')\n\nresults_post = EfficientDiD(pt_assumption=\"post\").fit(\n    data, outcome='outcome', unit='unit', time='period',\n    first_treat='first_treat', aggregate='all')\n\n# Compare with Callaway-Sant'Anna\nresults_cs = CallawaySantAnna().fit(\n    data, outcome='outcome', unit='unit', time='period',\n    first_treat='first_treat')\n\nprint(\"PT-All vs PT-Post vs Callaway-Sant'Anna\")\nprint(\"=\" * 65)\nprint(f\"{'Estimator':<25} {'ATT':>10} {'SE':>10} {'CI Width':>12}\")\nprint(\"-\" * 65)\nfor name, r in [(\"EDiD (PT-All)\", results_all),\n                (\"EDiD (PT-Post)\", results_post),\n                (\"CallawaySantAnna\", results_cs)]:\n    ci_width = r.overall_conf_int[1] - r.overall_conf_int[0]\n    print(f\"{name:<25} {r.overall_att:>10.4f} {r.overall_se:>10.4f} {ci_width:>12.4f}\")\nprint()\nprint(\"PT-Post and CS produce identical post-treatment ATTs.\")"
+   "source": [
+    "# Fit under both assumptions\n",
+    "results_all = EfficientDiD(pt_assumption=\"all\").fit(\n",
+    "    data, outcome='outcome', unit='unit', time='period',\n",
+    "    first_treat='first_treat', aggregate='all')\n",
+    "\n",
+    "results_post = EfficientDiD(pt_assumption=\"post\").fit(\n",
+    "    data, outcome='outcome', unit='unit', time='period',\n",
+    "    first_treat='first_treat', aggregate='all')\n",
+    "\n",
+    "# Compare with Callaway-Sant'Anna\n",
+    "results_cs = CallawaySantAnna().fit(\n",
+    "    data, outcome='outcome', unit='unit', time='period',\n",
+    "    first_treat='first_treat')\n",
+    "\n",
+    "print(\"PT-All vs PT-Post vs Callaway-Sant'Anna\")\n",
+    "print(\"=\" * 65)\n",
+    "print(f\"{'Estimator':<25} {'ATT':>10} {'SE':>10} {'CI Width':>12}\")\n",
+    "print(\"-\" * 65)\n",
+    "for name, r in [(\"EDiD (PT-All)\", results_all),\n",
+    "                (\"EDiD (PT-Post)\", results_post),\n",
+    "                (\"CallawaySantAnna\", results_cs)]:\n",
+    "    ci_width = r.overall_conf_int[1] - r.overall_conf_int[0]\n",
+    "    print(f\"{name:<25} {r.overall_att:>10.4f} {r.overall_se:>10.4f} {ci_width:>12.4f}\")\n",
+    "print()\n",
+    "print(\"PT-Post and CS produce identical post-treatment ATTs.\")"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -174,7 +224,8 @@
     "\n",
     "for seed in range(n_seeds):\n",
     "    sim_data = generate_staggered_data(n_units=200, n_periods=8,\n",
-    "                                       treatment_effect=2.0, seed=seed)\n",
+    "                                       treatment_effect=2.0,\n",
+    "                                       dynamic_effects=False, seed=seed)\n",
     "    r_edid = EfficientDiD(pt_assumption=\"all\").fit(\n",
     "        sim_data, outcome='outcome', unit='unit', time='period',\n",
     "        first_treat='first_treat')\n",
@@ -274,7 +325,16 @@
    "cell_type": "markdown",
    "id": "a89342da",
    "metadata": {},
-   "source": "## Bootstrap Inference\n\nEDiD supports multiplier bootstrap for inference. The bootstrap perturbs the influence function values with random weights to obtain bootstrap distributions of all parameters.\n\nThree weight distributions are available:\n- **Rademacher** (default): $\\pm 1$ with equal probability --- standard choice, works well in most settings\n- **Mammen**: Two-point distribution that matches third moments\n- **Webb**: Six-point distribution with wider support"
+   "source": [
+    "## Bootstrap Inference\n",
+    "\n",
+    "EDiD supports multiplier bootstrap for inference. The bootstrap perturbs the influence function values with random weights to obtain bootstrap distributions of all parameters.\n",
+    "\n",
+    "Three weight distributions are available:\n",
+    "- **Rademacher** (default): $\\pm 1$ with equal probability --- standard choice, works well in most settings\n",
+    "- **Mammen**: Two-point distribution that matches third moments\n",
+    "- **Webb**: Six-point distribution with wider support"
+   ]
   },
   {
    "cell_type": "code",
@@ -485,7 +545,37 @@
    "cell_type": "markdown",
    "id": "ef99ee47",
    "metadata": {},
-   "source": "## Summary\n\n**Key takeaways:**\n\n1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n3. Under **PT-Post**, EDiD matches CS for post-treatment ATT(g,t); pre-treatment diagnostics use a fixed baseline and may differ from CS's default varying baseline\n4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n5. **Event study** and **group** aggregations work just like CS\n6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n7. **Condition numbers** flag potentially unstable weight matrices\n8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n\n**Parameter reference:**\n\n| Parameter | Default | Description |\n|-----------|---------|-------------|\n| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, matches CS post-treatment ATT) |\n| `alpha` | `0.05` | Significance level |\n| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n| `seed` | `None` | Random seed for reproducibility |\n| `anticipation` | `0` | Anticipation periods |\n\n**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators.\n\n*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"
+   "source": [
+    "## Summary\n",
+    "\n",
+    "**Key takeaways:**\n",
+    "\n",
+    "1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n",
+    "2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n",
+    "3. Under **PT-Post**, EDiD matches CS for post-treatment ATT(g,t); pre-treatment diagnostics use a fixed baseline and may differ from CS's default varying baseline\n",
+    "4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n",
+    "5. **Event study** and **group** aggregations work just like CS\n",
+    "6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n",
+    "7. **Condition numbers** flag potentially unstable weight matrices\n",
+    "8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n",
+    "9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n",
+    "10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n",
+    "\n",
+    "**Parameter reference:**\n",
+    "\n",
+    "| Parameter | Default | Description |\n",
+    "|-----------|---------|-------------|\n",
+    "| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, matches CS post-treatment ATT) |\n",
+    "| `alpha` | `0.05` | Significance level |\n",
+    "| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n",
+    "| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n",
+    "| `seed` | `None` | Random seed for reproducibility |\n",
+    "| `anticipation` | `0` | Anticipation periods |\n",
+    "\n",
+    "**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators.\n",
+    "\n",
+    "*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"
+   ]
   }
  ],
  "metadata": {
@@ -495,4 +585,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}