Commit 41d6c84

igerber and claude committed
Address PR #199 review feedback: scope PT-Post claims, fix citation
- P1: Scope PT-Post/CS equivalence to post-treatment ATT(g,t) across all doc files (rst, README, notebook); pre-treatment diagnostics may differ
- P2: Remove inappropriate cluster guidance from Webb bootstrap description
- P3: Fix citation initials (Chen X., Xie H.) and add full paper title

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 942373b commit 41d6c84

4 files changed

Lines changed: 13 additions & 88 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -1090,7 +1090,7 @@ results = edid.fit(data, outcome='outcome', unit='unit',
 aggregate='all')
 results.print_summary()
 
-# PT-Post mode (reduces to Callaway-Sant'Anna)
+# PT-Post mode (matches CS for post-treatment effects)
 edid_post = EfficientDiD(pt_assumption="post")
 results_post = edid_post.fit(data, outcome='outcome', unit='unit',
 time='period', first_treat='first_treat')
@@ -1100,7 +1100,7 @@ results_post = edid_post.fit(data, outcome='outcome', unit='unit',
 
 ```python
 EfficientDiD(
-pt_assumption='all', # 'all' (overidentified) or 'post' (= CS)
+pt_assumption='all', # 'all' (overidentified) or 'post' (matches CS post-treatment ATT)
 alpha=0.05, # Significance level
 n_bootstrap=0, # Bootstrap iterations (0 = analytical only)
 bootstrap_weights='rademacher', # 'rademacher', 'mammen', or 'webb'

docs/api/efficient_did.rst

Lines changed: 5 additions & 5 deletions
@@ -10,7 +10,7 @@ This module implements the efficiency-bound-attaining estimator that:
 2. **Optimally weights** across comparison groups and baselines via the
 inverse covariance matrix Ω*
 3. **Supports two PT assumptions**: PT-All (overidentified, tighter SEs) and
-PT-Post (just-identified, equivalent to Callaway-Sant'Anna)
+PT-Post (just-identified, matches CS for post-treatment effects)
 4. **Uses EIF-based inference** for analytical standard errors and multiplier
 bootstrap
 
@@ -26,8 +26,8 @@ This module implements the efficiency-bound-attaining estimator that:
 - You want tighter confidence intervals than Callaway-Sant'Anna
 - You need a formal efficiency benchmark for comparing estimators
 
-**Reference:** Chen, J., Sant'Anna, P. H. C., & Xie, Y. (2025). Efficient
-Difference-in-Differences. *Working Paper*.
+**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient
+Difference-in-Differences and Event Study Estimators. *Working Paper*.
 
 .. module:: diff_diff.efficient_did
 
@@ -94,7 +94,7 @@ Basic usage::
 aggregate='all')
 results.print_summary()
 
-PT-Post mode (equivalent to Callaway-Sant'Anna)::
+PT-Post mode (matches CS for post-treatment ATT)::
 
 edid_post = EfficientDiD(pt_assumption="post")
 results_post = edid_post.fit(data, outcome='outcome', unit='unit',
@@ -145,6 +145,6 @@ Comparison with Other Staggered Estimators
 - Multiplier bootstrap
 - Multiplier bootstrap
 * - PT-Post equivalence
-- Reduces to CS
+- Matches CS post-treatment ATT(g,t)
 - Baseline
 - Different framework
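
The scoping of the "PT-Post equivalence" row to post-treatment ATT(g,t) can be illustrated with a toy calculation. This is only a sketch, not the diff_diff implementation: it assumes a single treated cohort (g = 4), a never-treated comparison group, made-up group means, and the common convention that a "varying" base period uses t-1 for pre-treatment periods while a fixed baseline uses g-1 throughout. The helper names (`did_2x2`, `att_fixed_base`, `att_varying_base`) are hypothetical.

```python
# Toy group means by period (hypothetical numbers, not from the library or paper).
treated = {1: 1.0, 2: 1.4, 3: 1.9, 4: 4.0, 5: 5.0}  # cohort first treated at g = 4
control = {1: 0.5, 2: 0.8, 3: 1.2, 4: 1.9, 5: 2.3}  # never-treated group
G = 4


def did_2x2(t, base):
    """2x2 DiD: change from `base` to `t` for treated minus the same change for control."""
    return (treated[t] - treated[base]) - (control[t] - control[base])


def att_fixed_base(t):
    """Single fixed baseline: always compare against period g-1."""
    return did_2x2(t, G - 1)


def att_varying_base(t):
    """Assumed 'varying' convention: base g-1 post-treatment, base t-1 pre-treatment."""
    base = G - 1 if t >= G else t - 1
    return did_2x2(t, base)


# Post-treatment periods use the same baseline (g-1) under both conventions.
post_fixed = {t: att_fixed_base(t) for t in (4, 5)}
post_varying = {t: att_varying_base(t) for t in (4, 5)}

# Pre-treatment period t = 2: base is g-1 = 3 (fixed) versus t-1 = 1 (varying).
pre_fixed_t2 = att_fixed_base(2)
pre_varying_t2 = att_varying_base(2)

print("post-treatment identical:", post_fixed == post_varying)
print("pre-treatment (t=2) fixed vs varying:", pre_fixed_t2, pre_varying_t2)
```

In this toy setup the post-treatment estimates coincide by construction, while the pre-treatment placebo at t = 2 depends on which baseline is used, which is the distinction the revised wording draws.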

docs/choosing_estimator.rst

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ Quick Reference
 - ATT with unit/time weights
 * - ``EfficientDiD``
 - Staggered adoption with optimal efficiency
-- PT-All (overidentified) or PT-Post (= CS)
+- PT-All (overidentified) or PT-Post
 - Group-time ATT(g,t), aggregations
 * - ``ContinuousDiD``
 - Continuous dose / treatment intensity

docs/tutorials/15_efficient_did.ipynb

Lines changed: 5 additions & 80 deletions
@@ -155,51 +155,15 @@
 "cell_type": "markdown",
 "id": "e1ad14f5",
 "metadata": {},
-"source": [
-"## PT-All vs PT-Post\n",
-"\n",
-"EDiD supports two parallel trends assumptions:\n",
-"\n",
-"- **PT-All** (`pt_assumption=\"all\"`): Parallel trends holds across *all* pre-treatment periods. The model is overidentified --- more valid comparisons exist than needed --- and EDiD exploits this for tighter SEs.\n",
-"- **PT-Post** (`pt_assumption=\"post\"`): Parallel trends holds only from `g-1` onward (the weaker, standard assumption). EDiD reduces to a single-baseline estimator, equivalent to Callaway-Sant'Anna.\n",
-"\n",
-"PT-All is the default because it delivers efficiency gains when the assumption holds. Use PT-Post if you're concerned about violations in early pre-treatment periods."
-]
+"source": "## PT-All vs PT-Post\n\nEDiD supports two parallel trends assumptions:\n\n- **PT-All** (`pt_assumption=\"all\"`): Parallel trends holds across *all* pre-treatment periods. The model is overidentified --- more valid comparisons exist than needed --- and EDiD exploits this for tighter SEs.\n- **PT-Post** (`pt_assumption=\"post\"`): Parallel trends holds only from `g-1` onward (the weaker, standard assumption). EDiD uses a single baseline (`g-1`) per cohort, matching `CallawaySantAnna(control_group='never_treated')` for post-treatment ATT(g,t). Pre-treatment diagnostics may differ from CS's default `base_period='varying'`.\n\nPT-All is the default because it delivers efficiency gains when the assumption holds. Use PT-Post if you're concerned about violations in early pre-treatment periods."
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "id": "35f70199",
 "metadata": {},
 "outputs": [],
-"source": [
-"# Fit under both assumptions\n",
-"results_all = EfficientDiD(pt_assumption=\"all\").fit(\n",
-" data, outcome='outcome', unit='unit', time='period',\n",
-" first_treat='first_treat', aggregate='all')\n",
-"\n",
-"results_post = EfficientDiD(pt_assumption=\"post\").fit(\n",
-" data, outcome='outcome', unit='unit', time='period',\n",
-" first_treat='first_treat', aggregate='all')\n",
-"\n",
-"# Compare with Callaway-Sant'Anna\n",
-"results_cs = CallawaySantAnna().fit(\n",
-" data, outcome='outcome', unit='unit', time='period',\n",
-" first_treat='first_treat')\n",
-"\n",
-"print(\"PT-All vs PT-Post vs Callaway-Sant'Anna\")\n",
-"print(\"=\" * 65)\n",
-"print(f\"{'Estimator':<25} {'ATT':>10} {'SE':>10} {'CI Width':>12}\")\n",
-"print(\"-\" * 65)\n",
-"for name, r in [(\"EDiD (PT-All)\", results_all),\n",
-" (\"EDiD (PT-Post)\", results_post),\n",
-" (\"CallawaySantAnna\", results_cs)]:\n",
-" ci_width = r.overall_conf_int[1] - r.overall_conf_int[0]\n",
-" print(f\"{name:<25} {r.overall_att:>10.4f} {r.overall_se:>10.4f} {ci_width:>12.4f}\")\n",
-"print()\n",
-"print(\"Notice: PT-Post and CS produce identical ATTs. PT-All has the same\")\n",
-"print(\"ATT but tighter SEs --- this is the efficiency gain.\")"
-]
+"source": "# Fit under both assumptions\nresults_all = EfficientDiD(pt_assumption=\"all\").fit(\n data, outcome='outcome', unit='unit', time='period',\n first_treat='first_treat', aggregate='all')\n\nresults_post = EfficientDiD(pt_assumption=\"post\").fit(\n data, outcome='outcome', unit='unit', time='period',\n first_treat='first_treat', aggregate='all')\n\n# Compare with Callaway-Sant'Anna\nresults_cs = CallawaySantAnna().fit(\n data, outcome='outcome', unit='unit', time='period',\n first_treat='first_treat')\n\nprint(\"PT-All vs PT-Post vs Callaway-Sant'Anna\")\nprint(\"=\" * 65)\nprint(f\"{'Estimator':<25} {'ATT':>10} {'SE':>10} {'CI Width':>12}\")\nprint(\"-\" * 65)\nfor name, r in [(\"EDiD (PT-All)\", results_all),\n (\"EDiD (PT-Post)\", results_post),\n (\"CallawaySantAnna\", results_cs)]:\n ci_width = r.overall_conf_int[1] - r.overall_conf_int[0]\n print(f\"{name:<25} {r.overall_att:>10.4f} {r.overall_se:>10.4f} {ci_width:>12.4f}\")\nprint()\nprint(\"PT-Post and CS produce identical post-treatment ATTs.\")"
 },
 {
 "cell_type": "markdown",
@@ -324,16 +288,7 @@
 "cell_type": "markdown",
 "id": "a89342da",
 "metadata": {},
-"source": [
-"## Bootstrap Inference\n",
-"\n",
-"EDiD supports multiplier bootstrap for inference. The bootstrap perturbs the influence function values with random weights to obtain bootstrap distributions of all parameters.\n",
-"\n",
-"Three weight distributions are available:\n",
-"- **Rademacher** (default): $\\pm 1$ with equal probability --- standard choice, works well in most settings\n",
-"- **Mammen**: Two-point distribution that matches third moments\n",
-"- **Webb**: Six-point distribution, recommended when clusters are very few (<10)"
-]
+"source": "## Bootstrap Inference\n\nEDiD supports multiplier bootstrap for inference. The bootstrap perturbs the influence function values with random weights to obtain bootstrap distributions of all parameters.\n\nThree weight distributions are available:\n- **Rademacher** (default): $\\pm 1$ with equal probability --- standard choice, works well in most settings\n- **Mammen**: Two-point distribution that matches third moments\n- **Webb**: Six-point distribution with wider support"
 },
 {
 "cell_type": "code",
@@ -544,37 +499,7 @@
 "cell_type": "markdown",
 "id": "ef99ee47",
 "metadata": {},
-"source": [
-"## Summary\n",
-"\n",
-"**Key takeaways:**\n",
-"\n",
-"1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n",
-"2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n",
-"3. Under **PT-Post**, EDiD reduces to **exactly Callaway-Sant'Anna**\n",
-"4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n",
-"5. **Event study** and **group** aggregations work just like CS\n",
-"6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n",
-"7. **Condition numbers** flag potentially unstable weight matrices\n",
-"8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n",
-"9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n",
-"10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n",
-"\n",
-"**Parameter reference:**\n",
-"\n",
-"| Parameter | Default | Description |\n",
-"|-----------|---------|-------------|\n",
-"| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, = CS) |\n",
-"| `alpha` | `0.05` | Significance level |\n",
-"| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n",
-"| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n",
-"| `seed` | `None` | Random seed for reproducibility |\n",
-"| `anticipation` | `0` | Anticipation periods |\n",
-"\n",
-"**Reference:** Chen, J., Sant'Anna, P. H. C., & Xie, Y. (2025). Efficient Difference-in-Differences. *Working Paper*.\n",
-"\n",
-"*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"
-]
+"source": "## Summary\n\n**Key takeaways:**\n\n1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n3. Under **PT-Post**, EDiD matches CS for post-treatment ATT(g,t); pre-treatment diagnostics use a fixed baseline and may differ from CS's default varying baseline\n4. The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n5. **Event study** and **group** aggregations work just like CS\n6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n7. **Condition numbers** flag potentially unstable weight matrices\n8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n\n**Parameter reference:**\n\n| Parameter | Default | Description |\n|-----------|---------|-------------|\n| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, matches CS post-treatment ATT) |\n| `alpha` | `0.05` | Significance level |\n| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n| `seed` | `None` | Random seed for reproducibility |\n| `anticipation` | `0` | Anticipation periods |\n\n**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators. *Working Paper*.\n\n*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*"
 }
 ],
 "metadata": {
@@ -584,4 +509,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 5
-}
+}
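
As background for the three `bootstrap_weights` options whose descriptions this commit touches, the textbook definitions of these multiplier distributions can be checked directly. The sketch below is not taken from diff_diff and may differ from how the library draws weights; it assumes the usual definitions (Rademacher: ±1; Mammen: two points built from √5; Webb: six points ±√(3/2), ±1, ±√(1/2), each with probability 1/6) and computes exact moments from the support points. The names `WEIGHTS` and `moment` are hypothetical.

```python
import math

SQRT5 = math.sqrt(5.0)

# (value, probability) pairs for each multiplier distribution (assumed textbook forms).
WEIGHTS = {
    "rademacher": [(-1.0, 0.5), (1.0, 0.5)],
    "mammen": [
        (-(SQRT5 - 1) / 2, (SQRT5 + 1) / (2 * SQRT5)),
        ((SQRT5 + 1) / 2, (SQRT5 - 1) / (2 * SQRT5)),
    ],
    "webb": [
        (sign * math.sqrt(v), 1 / 6)
        for sign in (-1, 1)
        for v in (1.5, 1.0, 0.5)
    ],
}


def moment(name, k):
    """Exact k-th raw moment of the named distribution."""
    return sum(p * x**k for x, p in WEIGHTS[name])


for name in WEIGHTS:
    # All three should have mean 0 and second moment 1, up to floating-point error.
    print(name, moment(name, 1), moment(name, 2), moment(name, 3))
```

Under these definitions all three distributions have mean zero and unit variance; Mammen additionally has third moment one, and Webb spreads its mass over six support points rather than two.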
