Commit 84fffb5

igerber and claude committed
Accept numpy integers in cohort_periods and add tutorial cross-references
- Accept np.integer alongside int in cohort_periods validation, matching the existing generate_staggered_data() behavior (P2)
- Add regression test for numpy integer cohort periods
- Add tutorial 16 cross-links in README.md, quickstart.rst, and choosing_estimator.rst for discoverability (P3)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 93f8103 commit 84fffb5

5 files changed

Lines changed: 15 additions & 1 deletion

README.md

Lines changed: 1 addition & 0 deletions
@@ -140,6 +140,7 @@ We provide Jupyter notebook tutorials in `docs/tutorials/`:
 | `12_two_stage_did.ipynb` | Two-Stage DiD (Gardner 2022), GMM sandwich variance, per-observation effects |
 | `13_stacked_did.ipynb` | Stacked DiD (Wing et al. 2024), Q-weights, sub-experiment inspection, trimming, clean control definitions |
 | `15_efficient_did.ipynb` | Efficient DiD (Chen et al. 2025), optimal weighting, PT-All vs PT-Post, efficiency gains, bootstrap inference |
+| `16_survey_did.ipynb` | Survey-aware DiD with complex sampling designs (strata, PSU, FPC, weights), replicate weights, subpopulation analysis, DEFF diagnostics |
 
 ## Data Preparation
 

diff_diff/prep_dgp.py

Lines changed: 1 addition & 1 deletion
@@ -1223,7 +1223,7 @@ def generate_survey_did_data(
     if not cohort_periods:
         raise ValueError("cohort_periods must be a non-empty list of integers")
     for cp in cohort_periods:
-        if not isinstance(cp, int) or isinstance(cp, bool):
+        if isinstance(cp, bool) or not isinstance(cp, (int, np.integer)):
             raise ValueError(
                 f"cohort_periods must contain integers, got {cp!r}"
             )
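
Reviewer note: the changed condition can be exercised in isolation. The sketch below copies the check into a hypothetical helper, `validate_cohort_periods`, which is not part of diff-diff and only mirrors the lines in this hunk. It assumes nothing about the rest of `generate_survey_did_data()`. It illustrates why the old `isinstance(cp, int)` test rejected NumPy scalars and why `bool` still has to be screened out first.

```python
import numpy as np


def validate_cohort_periods(cohort_periods):
    """Hypothetical stand-alone copy of the validation shown in the hunk."""
    if not cohort_periods:
        raise ValueError("cohort_periods must be a non-empty list of integers")
    for cp in cohort_periods:
        # bool is a subclass of int, so it is rejected before the integer check.
        if isinstance(cp, bool) or not isinstance(cp, (int, np.integer)):
            raise ValueError(f"cohort_periods must contain integers, got {cp!r}")


# np.int64 does not subclass Python's int, so the old check raised for it.
assert not isinstance(np.int64(3), int)
assert isinstance(np.int64(3), np.integer)

# Mixed Python and NumPy integers now pass without raising.
validate_cohort_periods([3, np.int64(5)])
```

Floats such as `2.5` and booleans such as `True` still raise `ValueError` under this check, which matches the existing `test_cohort_period_non_integer` expectation.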

docs/choosing_estimator.rst

Lines changed: 4 additions & 0 deletions
@@ -569,3 +569,7 @@ If you're unsure which estimator to use:
 
 4. **Compare estimators** - If results differ substantially across estimators,
    investigate why (often reveals violations of assumptions)
+
+5. **Using survey data?** - Pass a ``SurveyDesign`` to ``fit()`` for design-based
+   variance estimation. See the `survey tutorial <https://github.com/igerber/diff-diff/blob/main/docs/tutorials/16_survey_did.ipynb>`_
+   for a full walkthrough with strata, PSU, FPC, replicate weights, and subpopulation analysis.

docs/quickstart.rst

Lines changed: 1 addition & 0 deletions
@@ -205,3 +205,4 @@ Next Steps
 - :doc:`choosing_estimator` - Learn which estimator to use for your design
 - :doc:`r_comparison` - See how diff-diff compares to R packages
 - :doc:`api/index` - Explore the full API reference
+- `Survey-aware DiD tutorial <https://github.com/igerber/diff-diff/blob/main/docs/tutorials/16_survey_did.ipynb>`_ - Using DiD with complex survey designs (strata, PSU, FPC, weights)

tests/test_prep.py

Lines changed: 8 additions & 0 deletions
@@ -1372,3 +1372,11 @@ def test_cohort_period_non_integer(self):
 
         with pytest.raises(ValueError, match="must contain integers"):
             generate_survey_did_data(cohort_periods=[2.5], seed=42)
+
+    def test_numpy_integer_cohort_periods(self):
+        """Test that numpy integer cohort periods are accepted."""
+        from diff_diff.prep import generate_survey_did_data
+
+        periods = np.array([3, 5], dtype=np.int64)
+        data = generate_survey_did_data(cohort_periods=list(periods), seed=42)
+        assert len(data) == 200 * 8
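
Reviewer note: the regression test depends on `list(periods)` keeping NumPy scalar elements, which is the case that used to fail validation. A minimal sketch of that distinction, using only NumPy and no diff-diff calls, might look like this:

```python
import numpy as np

periods = np.array([3, 5], dtype=np.int64)

as_list = list(periods)       # elements remain NumPy scalars (np.int64)
as_tolist = periods.tolist()  # elements are converted to Python ints

# list() preserves the NumPy scalar type, so a plain isinstance(cp, int) fails.
assert all(isinstance(p, np.integer) for p in as_list)
assert not any(isinstance(p, int) for p in as_list)

# .tolist() yields built-in ints, which passed validation even before this commit.
assert all(type(p) is int for p in as_tolist)
```

Had the test used `periods.tolist()`, it would likely have passed against the old check and not guarded the new behavior.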
