Commit 2e9447e

Merge pull request #318 from igerber/business-report
Add BusinessReport and DiagnosticReport (experimental preview)
2 parents ed94e43 + dcfb4fe commit 2e9447e

31 files changed

Lines changed: 13301 additions & 51 deletions

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+- **`BusinessReport` and `DiagnosticReport` (experimental preview)** - practitioner-ready output layer. `BusinessReport(results, ...)` produces plain-English narrative summaries (`.summary()`, `.full_report()`, `.export_markdown()`, `.to_dict()`) from any of the 16 fitted result types. `DiagnosticReport(results, ...)` orchestrates the existing diagnostic battery (parallel trends, pre-trends power, HonestDiD sensitivity, Goodman-Bacon, heterogeneity, design-effect, EPV) plus estimator-native diagnostics for SyntheticDiD (`pre_treatment_fit`, weight concentration, in-time placebo, zeta sensitivity) and TROP (factor-model fit metrics). Both classes expose an AI-legible `to_dict()` schema (single source of truth; prose renders from the dict). BR auto-constructs DR by default so summaries mention pre-trends, robustness, and design-effect findings in one call. See `docs/methodology/REPORTING.md` for methodology deviations including the no-traffic-light-gates decision, pre-trends verdict thresholds (0.05 / 0.30), and power-aware phrasing driven by `compute_pretrends_power`. **Both schemas are marked experimental in this release** - wording, verdict thresholds, and schema shape will change; do not anchor downstream tooling on them yet.
+
 ### Changed
 - Add Zenodo DOI badge to README; upgrade the BibTeX citation block with the concept DOI (`10.5281/zenodo.19646175`) and list author as Isaac Gerber (matching `CITATION.cff`). Add `doi:` and `identifiers:` entries (concept + versioned) to `CITATION.cff`. DOI was minted by Zenodo when v3.1.3 was released.
 - **`ChaisemartinDHaultfoeuille` heterogeneity + within-group-varying PSU/strata now supported under Binder TSL** - `fit(heterogeneity=..., survey_design=...)` no longer raises `NotImplementedError` when the resolved design's PSU or strata vary across the cells of a group. On the **Binder TSL** branch (`compute_survey_if_variance`), the heterogeneity WLS coefficient IF is expanded to observation level via the cell-period allocator `ψ_i = ψ_g * (w_i / W_{g, out_idx})` on the post-period cell (the DID_l post-period single-cell convention shipped in v3.1.x). Under PSU=group the PSU-level Binder TSL variance is byte-identical to the previous release (PSU-level aggregate telescopes to `ψ_g`); under within-group-varying PSU, mass lands in the post-period PSU of the transition. The **Rao-Wu replicate-weight** branch (`compute_replicate_if_variance`) retains the legacy group-level allocator `ψ_i = ψ_g * (w_i / W_g)`: replicate variance computes `θ_r = sum_i ratio_ir * ψ_i` at observation level and is therefore not PSU-telescoping, so the cell-period allocator would silently change the replicate SE whenever a replicate column's ratios vary within group (e.g., per-row replicate matrices). Replicate + heterogeneity fits therefore produce byte-identical SE to the previous release, and the newly-unblocked `heterogeneity=` + within-group-varying PSU combination is unreachable under replicate designs by construction (`SurveyDesign` rejects `replicate_weights` combined with explicit `strata/psu/fpc`).
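
The two allocators named in the changelog entry above can be compared on a toy group. The sketch below is illustrative only: the helper names (`group_allocator`, `cell_period_allocator`, `psu_totals`, `replicate_theta`) and all numbers are hypothetical and are not the library's internals. It only mirrors the formulas quoted in the entry, `ψ_i = ψ_g * (w_i / W_g)` and `ψ_i = ψ_g * (w_i / W_{g, out_idx})`, and assumes the cell-period allocator places zero mass outside the post-period cell.

```python
from collections import defaultdict


def group_allocator(psi_g, weights):
    """Legacy allocator: psi_i = psi_g * w_i / W_g over every row of the group."""
    total = sum(weights)
    return [psi_g * w / total for w in weights]


def cell_period_allocator(psi_g, weights, periods, out_period):
    """Cell-period allocator: psi_i = psi_g * w_i / W_{g,out} on the post-period
    cell, and (assumed here) zero on all other rows of the group."""
    cell_total = sum(w for w, t in zip(weights, periods) if t == out_period)
    return [
        psi_g * w / cell_total if t == out_period else 0.0
        for w, t in zip(weights, periods)
    ]


def psu_totals(psi_i, psu):
    """Sum observation-level influence values within each PSU label."""
    totals = defaultdict(float)
    for value, label in zip(psi_i, psu):
        totals[label] += value
    return dict(totals)


def replicate_theta(psi_i, ratios):
    """theta_r = sum_i ratio_ir * psi_i for one replicate column."""
    return sum(r * p for r, p in zip(ratios, psi_i))


# One group, four rows, two periods (toy values, not real data).
psi_g = 2.0
weights = [1.0, 3.0, 2.0, 2.0]
periods = [0, 0, 1, 1]

legacy = group_allocator(psi_g, weights)
cell = cell_period_allocator(psi_g, weights, periods, out_period=1)

# PSU = group: both allocators give the same PSU-level total (psi_g).
same_psu = ["g", "g", "g", "g"]
print(psu_totals(legacy, same_psu), psu_totals(cell, same_psu))

# PSU varies within the group: the cell-period mass lands in the post-period PSU.
varying_psu = ["a", "a", "b", "b"]
print(psu_totals(legacy, varying_psu), psu_totals(cell, varying_psu))

# Replicate ratios that vary within the group: the two allocators disagree.
ratios = [0.5, 1.5, 1.0, 2.0]
print(replicate_theta(legacy, ratios), replicate_theta(cell, ratios))
```

Under these assumptions the first comparison shows why the Binder TSL result is unchanged when PSU equals group, and the last one shows why switching allocators on the replicate branch would change `θ_r` when ratios differ across rows of a group.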

README.md

Lines changed: 32 additions & 0 deletions
@@ -93,6 +93,38 @@ Measuring campaign lift? Evaluating a product launch? diff-diff handles the caus
 - **[Brand awareness survey tutorial](docs/tutorials/17_brand_awareness_survey.ipynb)** - Full example with complex survey design, brand funnel analysis, and staggered rollouts
 - **Have BRFSS/ACS/CPS individual records?** Use [`aggregate_survey()`](docs/api/prep.rst) to roll respondent-level microdata into a geographic-period panel with inverse-variance precision weights. The returned second-stage design uses analytic weights (`aweight`), so it works directly with `DifferenceInDifferences`, `TwoWayFixedEffects`, `MultiPeriodDiD`, `SunAbraham`, `ContinuousDiD`, and `EfficientDiD` (estimators marked **Full** in the [survey support matrix](docs/choosing_estimator.rst))
 
+### Experimental preview: `BusinessReport` and `DiagnosticReport`
+
+diff-diff ships two preview classes, `BusinessReport` and `DiagnosticReport`, that produce plain-English output and a structured `to_dict()` schema from any fitted result. **Both are experimental in this release** - wording, verdict thresholds, and schema shape will change as the library learns from real practitioner usage. Do not anchor downstream tooling on the schema yet; the experimental flag is noted in the CHANGELOG.
+
+```python
+from diff_diff import CallawaySantAnna, BusinessReport
+
+cs = CallawaySantAnna(base_period="universal").fit(
+    df, outcome="revenue", unit="store", time="month",
+    first_treat="first_treat", aggregate="event_study",
+)
+report = BusinessReport(
+    cs,
+    outcome_label="Revenue per store",
+    outcome_unit="$",
+    business_question="Did the loyalty program lift revenue?",
+    treatment_label="the loyalty program",
+    # Optional: pass the panel + column names so the auto-constructed
+    # DiagnosticReport can run data-dependent checks (2x2 pre-trends,
+    # Goodman-Bacon decomposition, EfficientDiD Hausman pretest).
+    # Without these the auto path still runs but skips those checks.
+    data=df,
+    outcome="revenue",
+    unit="store",
+    time="month",
+    first_treat="first_treat",
+)
+print(report.summary())
+```
+
+`BusinessReport` auto-constructs a `DiagnosticReport` so the summary mentions pre-trends, sensitivity, and design-effect findings in one call. Methodology (phrasing rules, verdict thresholds, schema stability) is documented in [docs/methodology/REPORTING.md](docs/methodology/REPORTING.md). Feedback on wording, applicability, and missing diagnostics is welcome; this is the part of the library most likely to evolve in the next few releases.
+
 Already know DiD? The [academic quickstart](docs/quickstart.rst) and [estimator guide](docs/choosing_estimator.rst) cover the full technical details.
 
 ## Features
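
The changelog entry in this commit describes `to_dict()` as the single source of truth, with prose rendered from the dict. A minimal sketch of that design idea follows. It does not use the library: the payload keys (`schema_version`, `estimate`, `ci_low`, and so on), the `render_summary` helper, and every number are hypothetical stand-ins, since the real schema is documented in `docs/methodology/REPORTING.md` and is flagged as subject to change.

```python
import json

# Hypothetical payload; the real to_dict() schema may look nothing like this.
payload = {
    "schema_version": "hypothetical-0",
    "outcome_label": "Revenue per store",
    "outcome_unit": "$",
    "treatment_label": "the loyalty program",
    "estimate": {"point": 120.0, "ci_low": 40.0, "ci_high": 200.0},
}


def render_summary(report: dict) -> str:
    """Render one sentence using only values stored in the dict."""
    est = report["estimate"]
    unit = report["outcome_unit"]
    return (
        f"{report['outcome_label']} changed by {unit}{est['point']:.0f} under "
        f"{report['treatment_label']} "
        f"(interval {unit}{est['ci_low']:.0f} to {unit}{est['ci_high']:.0f})."
    )


# Because the prose depends only on the dict, a JSON round trip
# (for example, handing the payload to another tool) renders the same text.
round_tripped = json.loads(json.dumps(payload))
print(render_summary(payload))
print(render_summary(round_tripped) == render_summary(payload))
```

The point of the pattern is that any consumer holding the dict can regenerate or audit the narrative without access to the fitted model; whether the library's renderer works this way internally is not shown in this diff.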

ROADMAP.md

Lines changed: 2 additions & 2 deletions
@@ -57,6 +57,7 @@ See [Survey Design Support](docs/choosing_estimator.rst#survey-design-support) f
 
 Major landings since the prior roadmap revision. See [CHANGELOG.md](CHANGELOG.md) for the full history.
 
+- **`BusinessReport` and `DiagnosticReport`** - practitioner-ready output layer. Plain-English stakeholder summaries + unified diagnostic runner with a stable AI-legible `to_dict()` schema. `BusinessReport` auto-constructs `DiagnosticReport` by default so summaries mention pre-trends, robustness, and design-effect findings in one call. Estimator-native validation surfaces are routed through: SyntheticDiD uses `pre_treatment_fit` / `in_time_placebo` / `sensitivity_to_zeta_omega`; EfficientDiD uses its native `hausman_pretest`; TROP exposes factor-model fit metrics. See `docs/methodology/REPORTING.md` for methodology deviations including no-traffic-light gates, pre-trends verdict thresholds, and power-aware phrasing.
 - **ChaisemartinDHaultfoeuille (dCDH)** - full feature set: `DID_M` contemporaneous-switch, multi-horizon `DID_l` event study, analytical SE, multiplier bootstrap, TWFE decomposition diagnostic, dynamic placebos, normalized estimator, cost-benefit aggregate, sup-t bands, covariate adjustment (`DID^X`), group-specific linear trends (`DID^{fd}`), state-set-specific trends, heterogeneity testing, non-binary treatment, HonestDiD integration, and survey support (TSL + pweight).
 - **SyntheticDiD jackknife variance** (`variance_method='jackknife'`) with survey-weighted jackknife.
 - **SyntheticDiD validation diagnostics**.
@@ -78,8 +79,7 @@ Queued work, ordered by expected leverage. Each item is its own PR. Ordering is
 
 ### Practitioner-ready output
 
-- **`BusinessReport` class.** Plain-English summaries of any estimator's results with markdown export. Optional rich formatting via a `[reporting]` extra; core remains numpy/pandas/scipy only. Turns raw coefficients into stakeholder-ready artifacts.
-- **`DiagnosticReport` with context-aware `practitioner_next_steps()`.** Unified diagnostic runner that bundles parallel-trends, placebo, HonestDiD, Bacon decomposition, DEFF, EPV, and power diagnostics into one plain-English report. `practitioner_next_steps()` substitutes actual column names from fitted results instead of generic placeholders.
+- **Context-aware `practitioner_next_steps()`.** Substitutes actual column names from fitted results instead of generic placeholders, so next-step guidance is executable rather than illustrative. (Standalone follow-up to the `BusinessReport` / `DiagnosticReport` landing below; tracked under the AI-Agent Track too.)
 
 ### Practitioner tutorials
 
diff_diff/__init__.py

Lines changed: 16 additions & 0 deletions
@@ -213,6 +213,16 @@
     plot_synth_weights,
 )
 from diff_diff.practitioner import practitioner_next_steps
+from diff_diff.business_report import (
+    BUSINESS_REPORT_SCHEMA_VERSION,
+    BusinessContext,
+    BusinessReport,
+)
+from diff_diff.diagnostic_report import (
+    DIAGNOSTIC_REPORT_SCHEMA_VERSION,
+    DiagnosticReport,
+    DiagnosticReportResults,
+)
 from diff_diff._guides_api import get_llm_guide
 from diff_diff.datasets import (
     clear_cache,
@@ -427,6 +437,12 @@
     "clear_cache",
     # Practitioner guidance
     "practitioner_next_steps",
+    "BusinessReport",
+    "BusinessContext",
+    "BUSINESS_REPORT_SCHEMA_VERSION",
+    "DiagnosticReport",
+    "DiagnosticReportResults",
+    "DIAGNOSTIC_REPORT_SCHEMA_VERSION",
     # LLM guide accessor
     "get_llm_guide",
 ]
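
A quick way to confirm that an installed build includes the six names this hunk adds to `__all__` is a small smoke check. The sketch below is an assumption-laden convenience, not part of the commit: `missing_exports` and `NEW_EXPORTS` are hypothetical names, and the check only tests attribute presence, not behavior. It degrades gracefully when `diff_diff` is not installed.

```python
import importlib

# Names added to diff_diff.__all__ in the hunk above.
NEW_EXPORTS = [
    "BusinessReport",
    "BusinessContext",
    "BUSINESS_REPORT_SCHEMA_VERSION",
    "DiagnosticReport",
    "DiagnosticReportResults",
    "DIAGNOSTIC_REPORT_SCHEMA_VERSION",
]


def missing_exports(module_name, names):
    """Return the names the module lacks, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return [name for name in names if not hasattr(module, name)]


result = missing_exports("diff_diff", NEW_EXPORTS)
if result is None:
    print("diff_diff is not installed; skipping the export check")
elif result:
    print("installed diff_diff lacks:", ", ".join(result))
else:
    print("all new report exports are present")
```

Because both report schemas are marked experimental, a presence check like this is about as much as downstream code should rely on for now.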
