Skip to content

Commit e13c358

Browse files
igerberclaude
andcommitted
Address PR #199 re-review: scope equivalence phrases, fix citations
- Scope "same point estimates as CS" → "matching post-treatment ATT(g,t) with CS" - Scope "CS-equivalent" → "post-treatment CS match" in code comment - Remove "*Working Paper*" suffix from citations to match REGISTRY.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 41d6c84 commit e13c358

3 files changed

Lines changed: 4 additions & 18 deletions

File tree

docs/api/efficient_did.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ This module implements the efficiency-bound-attaining estimator that:
2727
- You need a formal efficiency benchmark for comparing estimators
2828

2929
**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient
30-
Difference-in-Differences and Event Study Estimators. *Working Paper*.
30+
Difference-in-Differences and Event Study Estimators.
3131

3232
.. module:: diff_diff.efficient_did
3333

docs/choosing_estimator.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -238,7 +238,7 @@ Use :class:`~diff_diff.EfficientDiD` when:
238238
239239
from diff_diff import EfficientDiD
240240
241-
edid = EfficientDiD(pt_assumption="all") # or "post" for CS-equivalent
241+
edid = EfficientDiD(pt_assumption="all") # or "post" for post-treatment CS match
242242
results = edid.fit(data, outcome='y', unit='unit_id',
243243
time='period', first_treat='first_treat',
244244
aggregate='all')

docs/tutorials/15_efficient_did.ipynb

Lines changed: 2 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -63,21 +63,7 @@
6363
"cell_type": "markdown",
6464
"id": "4d734cd9",
6565
"metadata": {},
66-
"source": [
67-
"## What Makes EDiD Different?\n",
68-
"\n",
69-
"Consider a staggered adoption design with cohorts treated at periods 3, 5, and 7, plus a never-treated group. To estimate ATT(g=5, t=6), **Callaway-Sant'Anna** uses a single 2x2 comparison:\n",
70-
"\n",
71-
"> *Compare the outcome change from period 4 to 6 for cohort 5 versus the never-treated group.*\n",
72-
"\n",
73-
"But under **PT-All** (parallel trends across all pre-treatment periods), there are *additional* valid comparisons. Cohort 7 is also untreated at period 6, so it can serve as a comparison group too. And periods 1, 2, 3 can all serve as valid baselines, not just period 4.\n",
74-
"\n",
75-
"Each of these comparisons provides an unbiased estimate of ATT(g=5, t=6), but with different variances. **EDiD finds the optimal linear combination** --- the one that minimizes variance --- by computing the inverse covariance matrix of these \"generated outcomes\" (the paper calls this $\\Omega^*$).\n",
76-
"\n",
77-
"The result: **same point estimates as CS under PT-Post**, but **tighter standard errors under PT-All** because EDiD exploits the overidentification.\n",
78-
"\n",
79-
"> **Key equation (for the curious):** The efficient weight vector is $w^* = \\frac{\\mathbf{1}' \\Omega^{*-1}}{\\mathbf{1}' \\Omega^{*-1} \\mathbf{1}}$, where $\\Omega^*$ is the covariance matrix of the generated outcomes across all valid (comparison group, baseline) pairs. This is the classic GLS optimal weighting. See REGISTRY.md or the paper for full derivations."
80-
]
66+
"source": "## What Makes EDiD Different?\n\nConsider a staggered adoption design with cohorts treated at periods 3, 5, and 7, plus a never-treated group. To estimate ATT(g=5, t=6), **Callaway-Sant'Anna** uses a single 2x2 comparison:\n\n> *Compare the outcome change from period 4 to 6 for cohort 5 versus the never-treated group.*\n\nBut under **PT-All** (parallel trends across all pre-treatment periods), there are *additional* valid comparisons. Cohort 7 is also untreated at period 6, so it can serve as a comparison group too. And periods 1, 2, 3 can all serve as valid baselines, not just period 4.\n\nEach of these comparisons provides an unbiased estimate of ATT(g=5, t=6), but with different variances. **EDiD finds the optimal linear combination** --- the one that minimizes variance --- by computing the inverse covariance matrix of these \"generated outcomes\" (the paper calls this $\\Omega^*$).\n\nThe result: **matching post-treatment ATT(g,t) with CS under PT-Post**, but **tighter standard errors under PT-All** because EDiD exploits the overidentification.\n\n> **Key equation (for the curious):** The efficient weight vector is $w^* = \\frac{\\mathbf{1}' \\Omega^{*-1}}{\\mathbf{1}' \\Omega^{*-1} \\mathbf{1}}$, where $\\Omega^*$ is the covariance matrix of the generated outcomes across all valid (comparison group, baseline) pairs. This is the classic GLS optimal weighting. See REGISTRY.md or the paper for full derivations."
8167
},
8268
{
8369
"cell_type": "markdown",
@@ -499,7 +485,7 @@
499485
"cell_type": "markdown",
500486
"id": "ef99ee47",
501487
"metadata": {},
502-
"source": "## Summary\n\n**Key takeaways:**\n\n1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n3. Under **PT-Post**, EDiD matches CS for post-treatment ATT(g,t); pre-treatment diagnostics use a fixed baseline and may differ from CS's default varying baseline\n4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n5. **Event study** and **group** aggregations work just like CS\n6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n7. **Condition numbers** flag potentially unstable weight matrices\n8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n\n**Parameter reference:**\n\n| Parameter | Default | Description |\n|-----------|---------|-------------|\n| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, matches CS post-treatment ATT) |\n| `alpha` | `0.05` | Significance level |\n| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n| `seed` | `None` | Random seed for reproducibility |\n| `anticipation` | `0` | Anticipation periods |\n\n**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators. *Working Paper*.\n\n*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"
488+
"source": "## Summary\n\n**Key takeaways:**\n\n1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n3. Under **PT-Post**, EDiD matches CS for post-treatment ATT(g,t); pre-treatment diagnostics use a fixed baseline and may differ from CS's default varying baseline\n4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n5. **Event study** and **group** aggregations work just like CS\n6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n7. **Condition numbers** flag potentially unstable weight matrices\n8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n\n**Parameter reference:**\n\n| Parameter | Default | Description |\n|-----------|---------|-------------|\n| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, matches CS post-treatment ATT) |\n| `alpha` | `0.05` | Significance level |\n| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n| `seed` | `None` | Random seed for reproducibility |\n| `anticipation` | `0` | Anticipation periods |\n\n**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators.\n\n*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"
503489
}
504490
],
505491
"metadata": {

0 commit comments

Comments
 (0)