Round 4: doc/contract cleanups (joiners_leavers DataFrame, stale docstrings)
P2: split joiners_leavers DataFrame into n_cells + n_obs columns
- to_dataframe(level="joiners_leavers") previously had a single n_obs
column with mixed semantics by row (DID_M used switcher cell count;
DID_+/DID_- used raw observation counts). It now has two columns with
consistent units across all rows: n_cells (count of switching (g, t)
cells) and n_obs (sum of n_gt over the same cells). The DID_M row uses
the union of joiner + leaver cells. Updated
test_to_dataframe_joiners_leavers to pin the new contract.
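
A minimal sketch of the n_cells / n_obs contract, in plain Python rather
than the library's own code. The cell tuples, the role labels and the
summarize helper are hypothetical and exist only for illustration; only
the column names and the union rule for the DID_M row come from the
description above.

```python
# Hypothetical switching cells: (group, period, role, n_gt).
cells = [
    ("g1", 2, "joiner", 5),
    ("g2", 3, "joiner", 7),
    ("g3", 2, "leaver", 4),
]


def summarize(rows):
    """Return (n_cells, n_obs) for a list of switching (g, t) cells."""
    n_cells = len(rows)                            # count of switching cells
    n_obs = sum(n_gt for _, _, _, n_gt in rows)    # sum of n_gt over those cells
    return n_cells, n_obs


joiners = [c for c in cells if c[2] == "joiner"]
leavers = [c for c in cells if c[2] == "leaver"]

# One row per estimand; both columns carry the same units on every row.
table = {
    "DID_M": summarize(joiners + leavers),  # union of joiner and leaver cells
    "DID_+": summarize(joiners),
    "DID_-": summarize(leavers),
}

for name, (n_cells, n_obs) in table.items():
    print(f"{name}: n_cells={n_cells}, n_obs={n_obs}")
```

Because joiner and leaver cells are disjoint in this toy data, the DID_M
row equals the sum of the other two rows in both columns.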
P3: stale docstrings on results object
- DCDHBootstrapResults class docstring now states explicitly that
placebo bootstrap fields ALWAYS remain None in Phase 1 (the previous
wording said they were "populated when available"). Per-field
docstrings for placebo_se / placebo_ci / placebo_p_value now point
back to the class-level note.
- n_groups_dropped_never_switching docstring now reflects the Round 2
full-IF fix: never-switching groups participate in the variance via
stable-control roles and the field is reported for backwards
compatibility only; no actual exclusion happens.
- n_groups_dropped_singleton_baseline docstring clarifies the
variance-only filter scope (cell DataFrame retains them as
period-based stable controls).
P3: misleading R-script + prep_dgp comments
- benchmarks/R/generate_dcdh_dynr_test_values.R: clarified that the
Python and R generators mirror each other STRUCTURALLY (same pattern
logic, same FE/effect/noise model), not at the RNG level. R's
set.seed and NumPy's default_rng use different RNGs. Parity tests
load the R script's golden-value JSON so both sides operate on
byte-identical input regardless of how it was originally generated.
- prep_dgp.py generate_reversible_did_data: clarified that the default
single_switch pattern is A5-safe by construction (every group has at
most one transition). Other patterns (random/cycles/marketing) ARE
allowed to violate A5 and exist primarily as stress tests for the
drop_larger_lower=True filter. The cohort-recentered variance
formula is derived under A5, which is why drop_larger_lower defaults
to True.
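
A minimal sketch of the "at most one transition per group" property
described above, assuming each group's treatment path is a list of 0/1
values ordered by period. count_transitions and is_a5_safe are
hypothetical helpers, not functions from prep_dgp.py, and this is not
the drop_larger_lower filter itself.

```python
def count_transitions(path):
    """Count the period-to-period changes in one group's treatment path."""
    return sum(1 for prev, curr in zip(path, path[1:]) if prev != curr)


def is_a5_safe(paths):
    """True if every group switches treatment at most once."""
    return all(count_transitions(p) <= 1 for p in paths.values())


# Hypothetical panels keyed by group id.
single_switch = {
    "a": [0, 0, 1, 1],  # one transition (joiner)
    "b": [1, 1, 1, 0],  # one transition (leaver)
    "c": [0, 0, 0, 0],  # never switches
}
cycles = {
    "a": [0, 1, 0, 1],  # three transitions, violates the property
}

print(is_a5_safe(single_switch))
print(is_a5_safe(cycles))
```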
Tests: 103 dCDH passing (no new tests; the existing
test_to_dataframe_joiners_leavers was strengthened to assert the new
n_cells / n_obs contract).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>