- P1: Scope PT-Post/CS equivalence to post-treatment ATT(g,t) across all
doc files (rst, README, notebook) — pre-treatment diagnostics may differ
- P2: Remove inappropriate cluster guidance from Webb bootstrap description
- P3: Fix citation initials (Chen X., Xie H.) and add full paper title
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs/tutorials/15_efficient_did.ipynb
5 additions & 80 deletions
@@ -155,51 +155,15 @@
155
155
"cell_type": "markdown",
156
156
"id": "e1ad14f5",
157
157
"metadata": {},
158
-
"source": [
159
-
"## PT-All vs PT-Post\n",
160
-
"\n",
161
-
"EDiD supports two parallel trends assumptions:\n",
162
-
"\n",
163
-
"- **PT-All** (`pt_assumption=\"all\"`): Parallel trends holds across *all* pre-treatment periods. The model is overidentified --- more valid comparisons exist than needed --- and EDiD exploits this for tighter SEs.\n",
164
-
"- **PT-Post** (`pt_assumption=\"post\"`): Parallel trends holds only from `g-1` onward (the weaker, standard assumption). EDiD reduces to a single-baseline estimator, equivalent to Callaway-Sant'Anna.\n",
165
-
"\n",
166
-
"PT-All is the default because it delivers efficiency gains when the assumption holds. Use PT-Post if you're concerned about violations in early pre-treatment periods."
167
-
]
158
+
"source": "## PT-All vs PT-Post\n\nEDiD supports two parallel trends assumptions:\n\n- **PT-All** (`pt_assumption=\"all\"`): Parallel trends holds across *all* pre-treatment periods. The model is overidentified --- more valid comparisons exist than needed --- and EDiD exploits this for tighter SEs.\n- **PT-Post** (`pt_assumption=\"post\"`): Parallel trends holds only from `g-1` onward (the weaker, standard assumption). EDiD uses a single baseline (`g-1`) per cohort, matching `CallawaySantAnna(control_group='never_treated')` for post-treatment ATT(g,t). Pre-treatment diagnostics may differ from CS's default `base_period='varying'`.\n\nPT-All is the default because it delivers efficiency gains when the assumption holds. Use PT-Post if you're concerned about violations in early pre-treatment periods."
"print(\"Notice: PT-Post and CS produce identical ATTs. PT-All has the same\")\n",
201
-
"print(\"ATT but tighter SEs --- this is the efficiency gain.\")"
202
-
]
166
+
"source": "# Fit under both assumptions\nresults_all = EfficientDiD(pt_assumption=\"all\").fit(\n data, outcome='outcome', unit='unit', time='period',\n first_treat='first_treat', aggregate='all')\n\nresults_post = EfficientDiD(pt_assumption=\"post\").fit(\n data, outcome='outcome', unit='unit', time='period',\n first_treat='first_treat', aggregate='all')\n\n# Compare with Callaway-Sant'Anna\nresults_cs = CallawaySantAnna().fit(\n data, outcome='outcome', unit='unit', time='period',\n first_treat='first_treat')\n\nprint(\"PT-All vs PT-Post vs Callaway-Sant'Anna\")\nprint(\"=\" * 65)\nprint(f\"{'Estimator':<25} {'ATT':>10} {'SE':>10} {'CI Width':>12}\")\nprint(\"-\" * 65)\nfor name, r in [(\"EDiD (PT-All)\", results_all),\n (\"EDiD (PT-Post)\", results_post),\n (\"CallawaySantAnna\", results_cs)]:\n ci_width = r.overall_conf_int[1] - r.overall_conf_int[0]\n print(f\"{name:<25} {r.overall_att:>10.4f} {r.overall_se:>10.4f} {ci_width:>12.4f}\")\nprint()\nprint(\"PT-Post and CS produce identical post-treatment ATTs.\")"
203
167
},
204
168
{
205
169
"cell_type": "markdown",
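The revised PT-Post wording in this hunk says EDiD uses a single baseline (`g-1`) per cohort and matches Callaway-Sant'Anna with a never-treated control group for post-treatment ATT(g,t). A minimal sketch of that single-baseline comparison, without covariates, is below. It is an illustration only, not the library's implementation: the function name `att_gt_single_baseline`, the tuple layout of `rows`, and the convention that `first_treat == 0` marks never-treated units are all assumptions made for this example.

```python
from statistics import mean


def att_gt_single_baseline(rows, g, t, never_treated=0):
    """ATT(g,t) as a difference in mean outcome changes from baseline g-1.

    rows: list of (unit, period, first_treat, outcome) tuples (hypothetical layout).
    never_treated: assumed code that marks never-treated units.
    """
    outcome = {(u, p): y for u, p, _, y in rows}
    cohort = {u: ft for u, _, ft, _ in rows}

    def mean_change(units):
        # Long difference Y_t - Y_{g-1}, averaged over the given units.
        return mean(outcome[u, t] - outcome[u, g - 1] for u in units)

    treated = [u for u, ft in cohort.items() if ft == g]
    controls = [u for u, ft in cohort.items() if ft == never_treated]
    return mean_change(treated) - mean_change(controls)


# Toy panel: units 1-2 are first treated in period 3, units 3-4 are never treated.
# Outcomes follow a unit effect plus a common trend, with a true effect of 2.0.
toy = [
    (u, p, g, 0.5 * u + 1.0 * p + (2.0 if g and p >= g else 0.0))
    for u, g in [(1, 3), (2, 3), (3, 0), (4, 0)]
    for p in range(1, 5)
]
att_33 = att_gt_single_baseline(toy, g=3, t=3)
att_34 = att_gt_single_baseline(toy, g=3, t=4)
print(att_33, att_34)
```

Because the toy data satisfy parallel trends exactly, both post-treatment cells recover the built-in effect. The sketch only covers post-treatment cells (`t >= g`), which is the scope of the equivalence stated in the diff; how pre-treatment cells are computed depends on the baseline convention and is not shown here.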
@@ -324,16 +288,7 @@
324
288
"cell_type": "markdown",
325
289
"id": "a89342da",
326
290
"metadata": {},
327
-
"source": [
328
-
"## Bootstrap Inference\n",
329
-
"\n",
330
-
"EDiD supports multiplier bootstrap for inference. The bootstrap perturbs the influence function values with random weights to obtain bootstrap distributions of all parameters.\n",
331
-
"\n",
332
-
"Three weight distributions are available:\n",
333
-
"- **Rademacher** (default): $\\pm 1$ with equal probability --- standard choice, works well in most settings\n",
334
-
"- **Mammen**: Two-point distribution that matches third moments\n",
335
-
"- **Webb**: Six-point distribution, recommended when clusters are very few (<10)"
336
-
]
291
+
"source": "## Bootstrap Inference\n\nEDiD supports multiplier bootstrap for inference. The bootstrap perturbs the influence function values with random weights to obtain bootstrap distributions of all parameters.\n\nThree weight distributions are available:\n- **Rademacher** (default): $\\pm 1$ with equal probability --- standard choice, works well in most settings\n- **Mammen**: Two-point distribution that matches third moments\n- **Webb**: Six-point distribution with wider support"
337
292
},
338
293
{
339
294
"cell_type": "code",
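The Bootstrap Inference cell in this hunk names three multiplier weight distributions. The sketch below writes them out using their standard textbook definitions and shows how a multiplier bootstrap perturbs influence function values to get a standard error. It is a dependency-free illustration under stated assumptions, not the library's code: `WEIGHTS`, `moment`, and `multiplier_bootstrap_se` are hypothetical names, and the influence values are assumed to be already centered.

```python
import math
import random
from statistics import pstdev

SQRT5 = math.sqrt(5.0)

# (support points, probabilities); each distribution has mean 0 and variance 1.
WEIGHTS = {
    "rademacher": ([-1.0, 1.0], [0.5, 0.5]),
    "mammen": (
        [-(SQRT5 - 1) / 2, (SQRT5 + 1) / 2],
        [(SQRT5 + 1) / (2 * SQRT5), (SQRT5 - 1) / (2 * SQRT5)],
    ),
    "webb": (
        [s * math.sqrt(v) for s in (-1, 1) for v in (1.5, 1.0, 0.5)],
        [1 / 6] * 6,
    ),
}


def moment(kind, k):
    """k-th raw moment of the named weight distribution."""
    points, probs = WEIGHTS[kind]
    return sum(p * x**k for x, p in zip(points, probs))


def multiplier_bootstrap_se(influence, kind="rademacher", n_bootstrap=999, seed=0):
    """Bootstrap SE of a mean-type estimator from centered influence values."""
    rng = random.Random(seed)
    points, probs = WEIGHTS[kind]
    n = len(influence)
    draws = []
    for _ in range(n_bootstrap):
        # Draw one random weight per observation and perturb the influence values.
        w = rng.choices(points, weights=probs, k=n)
        draws.append(sum(wi * psi for wi, psi in zip(w, influence)) / n)
    return pstdev(draws)


# Deterministic stand-in influence values, used only for the demonstration.
psi = [((i * 7) % 11 - 5) / 5 for i in range(50)]
analytic_se = math.sqrt(sum(x * x for x in psi)) / len(psi)
boot_se = {kind: multiplier_bootstrap_se(psi, kind, n_bootstrap=2000) for kind in WEIGHTS}
print(analytic_se, boot_se)
```

All three distributions share mean 0 and variance 1, so each bootstrap SE should land close to the analytical value. They differ in higher moments and in the number of support points: Mammen weights have a third moment of 1, and Webb weights spread mass over six points instead of two.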
@@ -544,37 +499,7 @@
544
499
"cell_type": "markdown",
545
500
"id": "ef99ee47",
546
501
"metadata": {},
547
-
"source": [
548
-
"## Summary\n",
549
-
"\n",
550
-
"**Key takeaways:**\n",
551
-
"\n",
552
-
"1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n",
553
-
"2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n",
554
-
"3. Under **PT-Post**, EDiD reduces to **exactly Callaway-Sant'Anna**\n",
555
-
"4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n",
556
-
"5. **Event study** and **group** aggregations work just like CS\n",
557
-
"6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n",
558
-
"7. **Condition numbers** flag potentially unstable weight matrices\n",
559
-
"8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n",
560
-
"9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n",
561
-
"10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n",
"| `seed` | `None` | Random seed for reproducibility |\n",
572
-
"| `anticipation` | `0` | Anticipation periods |\n",
573
-
"\n",
574
-
"**Reference:** Chen, J., Sant'Anna, P. H. C., & Xie, Y. (2025). Efficient Difference-in-Differences. *Working Paper*.\n",
575
-
"\n",
576
-
"*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"
577
-
]
502
+
"source": "## Summary\n\n**Key takeaways:**\n\n1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n3. Under **PT-Post**, EDiD matches CS for post-treatment ATT(g,t); pre-treatment diagnostics use a fixed baseline and may differ from CS's default varying baseline\n4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n5. **Event study** and **group** aggregations work just like CS\n6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n7. **Condition numbers** flag potentially unstable weight matrices\n8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n\n**Parameter reference:**\n\n| Parameter | Default | Description |\n|-----------|---------|-------------|\n| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, matches CS post-treatment ATT) |\n| `alpha` | `0.05` | Significance level |\n| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n| `seed` | `None` | Random seed for reproducibility |\n| `anticipation` | `0` | Anticipation periods |\n\n**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators. *Working Paper*.\n\n*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"