Commit e26c093

igerber and claude committed
Fix review feedback: custom DGP spec, DDD prose, registry scope
- P1: Fix custom DGP example to use time-invariant `ever_treated` column instead of post-treatment exposure indicator (rank-deficient design)
- P2: Correct DDD prose to document actual rounding rule `max(2, n_units // 8)` with min effective N = 16, distinguish from registry search floor of 64
- P3: Change Section 8 intro from "any built-in estimator" to "all 12 supported estimators" with forward refs to support table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent e47302e commit e26c093

1 file changed

Lines changed: 4 additions & 4 deletions

docs/tutorials/06_power_analysis.ipynb

@@ -453,7 +453,7 @@
 {
 "cell_type": "markdown",
 "id": "cjpvh2ze7lh",
-"source": "## 8. Power Analysis for Any Estimator\n\nThe simulation-based approach works with **any** built-in estimator — not just basic DiD. An internal registry automatically selects the appropriate data-generating process (DGP) and fit signature for each estimator. Just swap in the estimator object and everything else is handled.\n\n### Staggered Adoption Estimators",
+"source": "## 8. Power Analysis for Any Estimator\n\nThe simulation-based approach works with **all 12 supported estimators** — not just basic DiD. An internal registry automatically selects the appropriate data-generating process (DGP) and fit signature for each registered estimator. Just swap in the estimator object and everything else is handled. See the support table below for the full list, and Section 11 for using custom DGPs with unsupported estimators.\n\n### Staggered Adoption Estimators",
 "metadata": {}
 },
 {
@@ -481,7 +481,7 @@
 {
 "cell_type": "markdown",
 "id": "6qpu05hi18s",
-"source": "### Triple Difference\n\n`TripleDifference` uses a fixed 2×2×2 factorial design (group × partition × time). Sample sizes are **snapped to multiples of 8** (one unit per cell minimum). The `effective_n_units` field in results tracks any rounding.",
+"source": "### Triple Difference\n\n`TripleDifference` uses a fixed 2×2×2 factorial design (group × partition × time). Sample sizes are **rounded via `n_per_cell = max(2, n_units // 8)`**, so the minimum effective N is 16 (2 units per cell × 8 cells). The `effective_n_units` field in results tracks any rounding. Note that `simulate_sample_size()` uses a higher search floor of 64 from the registry.",
 "metadata": {}
 },
 {
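The rounding rule documented in the revised Triple Difference prose can be checked with a few lines of arithmetic. This is a minimal sketch, not the library's code: `effective_ddd_units` is a hypothetical helper that applies `n_per_cell = max(2, n_units // 8)` and multiplies by the 8 cells of the 2×2×2 design, as stated in the hunk above.

```python
N_CELLS = 8  # 2 x 2 x 2 factorial design: group x partition x time


def effective_ddd_units(n_units: int) -> int:
    """Hypothetical helper: effective sample size after per-cell rounding."""
    # At least 2 units per cell; otherwise floor-divide the request by 8.
    n_per_cell = max(2, n_units // N_CELLS)
    return n_per_cell * N_CELLS


# Requests below 16 are raised to 16; larger requests round down
# to the nearest multiple of 8.
for requested in (10, 16, 70, 100):
    print(requested, "->", effective_ddd_units(requested))
```

Under this reading, the documented minimum effective N of 16 is a property of the rounding rule, while the search floor of 64 mentioned for `simulate_sample_size()` is a separate registry setting.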
@@ -495,7 +495,7 @@
 {
 "cell_type": "markdown",
 "id": "6kb8ovmue4m",
-"source": "### Supported Estimators\n\nThe following 12 estimators are supported by the simulation power analysis registry. Each is automatically paired with the correct data-generating process:\n\n| DGP Family | Estimators | Min N |\n|---|---|---|\n| **Basic DiD** (`generate_did_data`) | DifferenceInDifferences, TwoWayFixedEffects, MultiPeriodDiD | 20 |\n| **Staggered** (`generate_staggered_data`) | CallawaySantAnna, SunAbraham, ImputationDiD, TwoStageDiD, StackedDiD, EfficientDiD | 40 |\n| **Factor Model** (`generate_factor_data`) | TROP, SyntheticDiD | 30 |\n| **Triple Difference** (`generate_ddd_data`) | TripleDifference | 64 |\n\n> **Note:** `ContinuousDiD` is not in the registry because continuous/dose-response treatments require a different DGP structure. `BaconDecomposition` and `HonestDiD` are diagnostic/sensitivity tools rather than treatment effect estimators. For unsupported estimators, you can pass a custom `data_generator` and `result_extractor` (see Section 11).",
+"source": "### Supported Estimators\n\nThe following 12 estimators are supported by the simulation power analysis registry. Each is automatically paired with the correct data-generating process:\n\n| DGP Family | Estimators | Min N |\n|---|---|---|\n| **Basic DiD** (`generate_did_data`) | DifferenceInDifferences, TwoWayFixedEffects, MultiPeriodDiD | 20 |\n| **Staggered** (`generate_staggered_data`) | CallawaySantAnna, SunAbraham, ImputationDiD, TwoStageDiD, StackedDiD, EfficientDiD | 40 |\n| **Factor Model** (`generate_factor_data`) | TROP, SyntheticDiD | 30 |\n| **Triple Difference** (`generate_ddd_data`) | TripleDifference | 16* |\n\n\\* DDD effective N rounds to `max(2, n_units // 8) * 8` with minimum 16. `simulate_sample_size()` uses a higher search floor of 64.\n\n> **Note:** `ContinuousDiD` is not in the registry because continuous/dose-response treatments require a different DGP structure. `BaconDecomposition` and `HonestDiD` are diagnostic/sensitivity tools rather than treatment effect estimators. For unsupported estimators, you can pass a custom `data_generator` and `result_extractor` (see Section 11).",
 "metadata": {}
 },
 {
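The pairing described in the support table can be pictured as a simple lookup. The sketch below is purely illustrative and assumes nothing about how the real registry is implemented: `DGP_FAMILIES` and `lookup_dgp` are hypothetical names, and the entries are copied from the revised table in the hunk above.

```python
# Hypothetical lookup built only from the support table in this diff;
# the library's internal registry may be structured quite differently.
DGP_FAMILIES = {
    "generate_did_data": (
        ["DifferenceInDifferences", "TwoWayFixedEffects", "MultiPeriodDiD"], 20),
    "generate_staggered_data": (
        ["CallawaySantAnna", "SunAbraham", "ImputationDiD",
         "TwoStageDiD", "StackedDiD", "EfficientDiD"], 40),
    "generate_factor_data": (["TROP", "SyntheticDiD"], 30),
    "generate_ddd_data": (["TripleDifference"], 16),
}


def lookup_dgp(estimator_name: str):
    """Return (generator name, min N) for a listed estimator, else None."""
    for generator, (estimators, min_n) in DGP_FAMILIES.items():
        if estimator_name in estimators:
            return generator, min_n
    return None  # not listed: a custom data_generator would be needed


n_supported = sum(len(names) for names, _ in DGP_FAMILIES.values())
print(n_supported)                  # number of estimators in the table
print(lookup_dgp("SunAbraham"))
print(lookup_dgp("ContinuousDiD"))  # not in the table
```

Counting the entries this way confirms that the table lists 12 estimators, which matches the wording change made for P3.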
@@ -631,7 +631,7 @@
 {
 "cell_type": "code",
 "id": "v06p7ubbj9p",
-"source": "def my_dgp(n_units, n_periods, treatment_effect, treatment_fraction,\n           treatment_period, noise_sd, seed=None):\n    \"\"\"Custom DGP with heterogeneous unit effects.\"\"\"\n    rng = np.random.default_rng(seed)\n    n_treat = int(n_units * treatment_fraction)\n\n    rows = []\n    for i in range(n_units):\n        unit_fe = rng.normal(0, 3) # heterogeneous unit effect\n        treated_unit = i < n_treat\n        for t in range(n_periods):\n            post = int(t >= treatment_period)\n            effect = treatment_effect * post if treated_unit else 0.0\n            y = unit_fe + 2.0 * t + effect + rng.normal(0, noise_sd)\n            rows.append({\n                \"unit\": i, \"period\": t, \"outcome\": y,\n                \"treated\": int(treated_unit and post), \"post\": post,\n            })\n    return pd.DataFrame(rows)\n\n# Use the custom DGP with simulate_power\ncustom_results = simulate_power(\n    estimator=DifferenceInDifferences(),\n    n_units=80,\n    n_periods=4,\n    treatment_effect=4.0,\n    sigma=3.0,\n    n_simulations=100,\n    seed=42,\n    progress=False,\n    data_generator=my_dgp,\n    estimator_kwargs={\"outcome\": \"outcome\", \"treatment\": \"treated\", \"time\": \"post\"},\n)\n\nprint(custom_results.summary())",
+"source": "def my_dgp(n_units, n_periods, treatment_effect, treatment_fraction,\n           treatment_period, noise_sd, seed=None):\n    \"\"\"Custom DGP with heterogeneous unit effects.\"\"\"\n    rng = np.random.default_rng(seed)\n    n_treat = int(n_units * treatment_fraction)\n\n    rows = []\n    for i in range(n_units):\n        unit_fe = rng.normal(0, 3) # heterogeneous unit effect\n        treated_unit = i < n_treat\n        for t in range(n_periods):\n            post = int(t >= treatment_period)\n            effect = treatment_effect * post if treated_unit else 0.0\n            y = unit_fe + 2.0 * t + effect + rng.normal(0, noise_sd)\n            rows.append({\n                \"unit\": i, \"period\": t, \"outcome\": y,\n                \"ever_treated\": int(treated_unit), \"post\": post,\n            })\n    return pd.DataFrame(rows)\n\n# Use the custom DGP with simulate_power\ncustom_results = simulate_power(\n    estimator=DifferenceInDifferences(),\n    n_units=80,\n    n_periods=4,\n    treatment_effect=4.0,\n    sigma=3.0,\n    n_simulations=100,\n    seed=42,\n    progress=False,\n    data_generator=my_dgp,\n    estimator_kwargs={\"outcome\": \"outcome\", \"treatment\": \"ever_treated\", \"time\": \"post\"},\n)\n\nprint(custom_results.summary())",
 "metadata": {},
 "execution_count": null,
 "outputs": []

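The "rank-deficient design" named in P1 can be reproduced with a small exact-arithmetic check. The sketch below assumes, without confirmation from the source, that the estimator regresses the outcome on an intercept, the treatment column, the time column, and their product. Under that assumption, a treatment column equal to `ever_treated * post` makes the interaction column identical to the treatment column. `matrix_rank` and `design_matrix` are hypothetical helpers written for this illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank via Gauss-Jordan elimination over exact fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(treatment_is_exposure):
    """Columns: intercept, treatment, post, treatment * post (assumed spec)."""
    rows = []
    for ever_treated in (0, 1):
        for post in (0, 1):
            # Old example: treatment = ever_treated * post (exposure indicator).
            # Fixed example: treatment = ever_treated (time-invariant).
            d = ever_treated * post if treatment_is_exposure else ever_treated
            rows.append([1, d, post, d * post])
    return rows


rank_exposure = matrix_rank(design_matrix(True))   # pre-fix column choice
rank_ever = matrix_rank(design_matrix(False))      # post-fix column choice
print(rank_exposure, rank_ever)
```

With four columns, full rank is 4. The exposure-indicator version falls short because its second and fourth columns coincide, which is consistent with the motivation given in the commit message for switching to `ever_treated`.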