"This tutorial demonstrates the `TwoStageDiD` estimator, which implements the two-stage difference-in-differences method from Gardner (2022), \"Two-stage differences in differences\", with inference from Butts & Gardner (2022), \"did2s: Two-Stage Difference-in-Differences\".\n",
"\n",
"**When to use TwoStageDiD:**\n",
"- Staggered adoption settings where you want **GMM sandwich variance** that accounts for first-stage estimation uncertainty\n",
"- When you want **per-observation treatment effects** (`treatment_effects` DataFrame) for granular analysis\n",
"- As a **robustness check** alongside ImputationDiD: the point estimates are identical, so similar inference under both variance estimators indicates that conclusions do not depend on the choice of variance estimator"
"The two-stage estimator follows a simple algorithm:\n",
"1. Estimate unit and time fixed effects using only **untreated observations** (never-treated units plus not-yet-treated periods)\n",
"2. Residualize **all** outcomes using those estimated FEs\n",
"3. Regress residualized outcomes on treatment indicators to obtain the ATT\n",
"\n",
"This avoids TWFE bias because the fixed-effects model is estimated only on clean (untreated) data, preventing treated outcomes from contaminating the counterfactual."
]
},
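The three steps above can be sketched by hand with NumPy and pandas. This is a minimal illustration on simulated data, not the `TwoStageDiD` implementation: the column names, the simulated design, and the use of dense dummy matrices with `np.linalg.lstsq` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulated staggered panel (every name and number here is illustrative).
n_units, n_periods, true_att = 60, 8, 2.0
first_treat = rng.choice([3, 5, np.inf], size=n_units)  # inf = never treated

df = pd.DataFrame({
    "unit": np.repeat(np.arange(n_units), n_periods),
    "period": np.tile(np.arange(n_periods), n_units),
})
df["treated"] = (df["period"] >= first_treat[df["unit"].to_numpy()]).astype(float)
unit_fe = rng.normal(size=n_units)[df["unit"].to_numpy()]
time_fe = 0.5 * df["period"].to_numpy()
noise = rng.normal(scale=0.1, size=len(df))
df["y"] = unit_fe + time_fe + true_att * df["treated"] + noise

# Full sets of unit and period dummies; lstsq tolerates the resulting collinearity.
X = pd.get_dummies(df[["unit", "period"]].astype("category"), dtype=float).to_numpy()
y = df["y"].to_numpy()
d = df["treated"].to_numpy()

# Step 1: estimate the fixed effects on untreated observations only.
untreated = d == 0
coef, *_ = np.linalg.lstsq(X[untreated], y[untreated], rcond=None)

# Step 2: residualize ALL outcomes with the step-1 fixed effects.
resid = y - X @ coef

# Step 3: regress residuals on the treatment indicator (no intercept).
att_hat = (d @ resid) / (d @ d)
print(f"True ATT: {true_att:.2f}, two-stage estimate: {att_hat:.3f}")
```

With a single binary treatment indicator and no intercept, the step-3 coefficient equals the mean residual among treated observations. The sketch produces a point estimate only; it does not reproduce the GMM sandwich variance.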
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Generate staggered adoption data with known treatment effect\n",
"Event study aggregation estimates treatment effects at each relative time horizon, enabling visualization of dynamic effects and informal pre-trend assessment."
"plot_event_study(results_es, title='Two-Stage DiD Event Study')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# View event study effects as a table\n",
"results_es.to_dataframe(level='event_study')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": "## Per-Observation Treatment Effects\n\nBoth `TwoStageDiD` and `ImputationDiD` provide a `treatment_effects` DataFrame containing one row per treated observation with:\n- `tau_hat`: the residualized outcome (actual outcome minus estimated counterfactual)\n- The unit and time columns (using the original column names from the input data, e.g., `unit` and `period`)\n- `rel_time`: relative time since treatment\n- `weight`: aggregation weight, equal to `1/n_valid` for observations with finite `tau_hat` and `0` for NaN rows (e.g., rank-deficient cases)\n\nThis enables granular analysis: examining which units or periods drive the aggregate effect, detecting outliers, or constructing custom aggregation schemes."
},
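As a rough illustration of the custom aggregation mentioned above, the sketch below builds a small hand-made table with the columns described in this section and aggregates it three ways. The data are invented for the example, and the table is a stand-in rather than output from a fitted `TwoStageDiD` model.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a `treatment_effects` table (values are made up).
te = pd.DataFrame({
    "unit": [1, 1, 2, 2, 3],
    "period": [3, 4, 3, 4, 4],
    "tau_hat": [2.0, 3.0, 1.0, np.nan, 4.0],
    "rel_time": [0, 1, 0, 1, 0],
})

# Weights as described above: 1/n_valid for finite tau_hat, 0 for NaN rows.
valid = te["tau_hat"].notna()
te["weight"] = np.where(valid, 1.0 / valid.sum(), 0.0)

# Overall ATT as the weighted sum of per-observation effects.
overall_att = (te["weight"] * te["tau_hat"].fillna(0.0)).sum()

# Custom aggregations: average effect by relative time and by unit.
by_rel_time = te.loc[valid].groupby("rel_time")["tau_hat"].mean()
by_unit = te.loc[valid].groupby("unit")["tau_hat"].mean()

print(f"Overall ATT: {overall_att:.3f}")
print(by_rel_time)
print(by_unit)
```

The per-unit averages are one simple way to see which units drive the aggregate effect; any reweighting scheme would follow the same pattern of grouping and averaging `tau_hat`.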
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Per-observation treatment effects (available from the basic fit)\n",
"te = results.treatment_effects\n",
"print(f\"Shape: {te.shape}\")\n",
"print(f\"Columns: {list(te.columns)}\")\n",
"print()\n",
"te.head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": "## Comparison with Other Estimators\n\nTwoStageDiD and ImputationDiD produce **identical point estimates** because both estimate fixed effects on untreated observations and use them to residualize outcomes. The key difference is the variance estimator: TwoStageDiD uses the GMM sandwich from Butts & Gardner (2022), while ImputationDiD uses the conservative variance from Borusyak et al. (2024, Theorem 3).\n\nCallawaySantAnna uses a fundamentally different estimation approach: it computes group-time ATT(g,t) effects via outcome regression, IPW, or doubly robust methods and then aggregates them, so point estimates may differ, especially under heterogeneous effects. It uses analytical influence-function standard errors by default, with optional multiplier bootstrap when `n_bootstrap > 0`.\n\n*Note: Tutorial 11 compared ImputationDiD against CallawaySantAnna and SunAbraham. Here we focus on the TwoStageDiD vs ImputationDiD point-estimate identity, with CallawaySantAnna as a widely used reference point. For SunAbraham comparisons, see Tutorial 11.*"
"If treatment effects begin before the official treatment date (e.g., firms change behavior in anticipation of a policy), use the `anticipation` parameter to shift the treatment onset back."
"The key methodological distinction between TwoStageDiD and ImputationDiD is the variance estimator:\n",
"\n",
"- **ImputationDiD's conservative variance** (Theorem 3) is valid under heterogeneous treatment effects but may produce wider confidence intervals than necessary\n",
"- **TwoStageDiD's GMM sandwich** accounts for first-stage estimation uncertainty via an influence function correction term\n",
"- In practice they usually agree closely; large divergence signals potential specification concerns\n",
"- Bootstrap inference is also available via `n_bootstrap=199`"
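One generic way to see why first-stage uncertainty matters is a cluster bootstrap that resamples whole units and reruns both stages on each draw. The sketch below is a hand-rolled pairs cluster bootstrap on simulated data, offered as an assumption-laden illustration: it is not the library's bootstrap (which may use a different scheme) and not the GMM sandwich. The helper `two_stage_att` and all column names are hypothetical.

```python
import numpy as np
import pandas as pd

def two_stage_att(df):
    """Hypothetical helper: FEs from untreated rows, then mean treated residual."""
    X = pd.get_dummies(df[["unit", "period"]].astype("category"), dtype=float).to_numpy()
    y = df["y"].to_numpy()
    d = df["treated"].to_numpy()
    coef, *_ = np.linalg.lstsq(X[d == 0], y[d == 0], rcond=None)
    resid = y - X @ coef
    return (d @ resid) / (d @ d)

rng = np.random.default_rng(1)

# Simulated staggered panel with a true ATT of 1.5 (illustrative values).
n_units, n_periods = 40, 6
first_treat = rng.choice([2, 4, np.inf], size=n_units)
df = pd.DataFrame({
    "unit": np.repeat(np.arange(n_units), n_periods),
    "period": np.tile(np.arange(n_periods), n_units),
})
df["treated"] = (df["period"] >= first_treat[df["unit"].to_numpy()]).astype(float)
df["y"] = (
    rng.normal(size=n_units)[df["unit"].to_numpy()]
    + 0.3 * df["period"]
    + 1.5 * df["treated"]
    + rng.normal(scale=0.5, size=len(df))
)

att_hat = two_stage_att(df)

# Resample units with replacement; relabel duplicates so each draw keeps
# its own unit fixed effect, then rerun BOTH stages on the resampled panel.
groups = {u: g for u, g in df.groupby("unit")}
draws = []
for _ in range(199):
    sampled = rng.choice(n_units, size=n_units, replace=True)
    boot = pd.concat(
        [groups[u].assign(unit=new_id) for new_id, u in enumerate(sampled)],
        ignore_index=True,
    )
    draws.append(two_stage_att(boot))

boot_se = np.std(draws, ddof=1)
ci = (att_hat - 1.96 * boot_se, att_hat + 1.96 * boot_se)
print(f"ATT: {att_hat:.3f}, bootstrap SE: {boot_se:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the fixed effects are re-estimated inside every draw, the spread of the bootstrap estimates reflects first-stage estimation error as well as second-stage noise, which is the same concern the sandwich correction addresses analytically.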