Commit fdaf94d

igerber and claude committed
Address PR #347 R7: bump schema versions to 2.0 + EfficientDiD library vs ES_avg note
Two P1 findings from R7, both addressed.

P1 #1 (schema version bump): the new ``headline.status`` / ``headline_metric.status`` value ``"no_scalar_by_design"``, added in R4 for the dCDH ``trends_linear=True, L_max>=2`` configuration, is a breaking change per the REPORTING.md stability policy (new status-enum values are breaking: agents doing an exhaustive match will break on unknown enums). Bumped ``BUSINESS_REPORT_SCHEMA_VERSION`` and ``DIAGNOSTIC_REPORT_SCHEMA_VERSION`` from ``"1.0"`` to ``"2.0"``, updated the in-tree schema-version tests (one explicit ``== "1.0"`` assertion and six ``"schema_version": "1.0"`` stub dicts in the BR / DR test files), added a REPORTING.md "Schema version 2.0" note, and documented the bump in the CHANGELOG Unreleased entry. The schemas remain marked experimental, so the formal deprecation policy does not yet apply.

P1 #2 (EfficientDiD library vs paper estimand): both EfficientDiD branches now explicitly state that BR/DR's headline ``overall_att`` is the library's cohort-size-weighted average over post-treatment ``(g, t)`` cells, NOT the paper's ``ES_avg`` uniform event-time average. The regime (PT-All / PT-Post) describes identification; the aggregation choice is a separate library-level policy that REGISTRY.md Sec. EfficientDiD documents. Added ``cohort-size-weighted`` + ``ES_avg`` / ``post-treatment`` assertions to ``test_efficient_did_pt_all`` and ``test_efficient_did_pt_post`` so the wording is pinned.

354 BR/DR + guide + target-parameter tests pass. Black and ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 5c3d0ba commit fdaf94d
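
The breaking-change argument in P1 #1 is easiest to see from the consumer side. The following is a minimal sketch and not part of this commit: `SUPPORTED_MAJOR`, `KNOWN_STATUSES`, and `read_headline` are hypothetical names, and the status set lists only the values visible in this diff (`"ran"`, `"skipped"`, `"no_scalar_by_design"`), which may not be the library's full enum. It also assumes the headline block always carries a `status` key.

```python
from typing import Any, Dict, Optional

# Hypothetical consumer-side constants; neither is defined by diff_diff.
SUPPORTED_MAJOR = 2
# Only the status values visible in this commit; the real enum may be larger.
KNOWN_STATUSES = {"ran", "skipped", "no_scalar_by_design"}


def read_headline(schema: Dict[str, Any]) -> Optional[float]:
    """Return the headline value, or None when no scalar is available."""
    # Guard on the major version so a breaking bump fails loudly.
    major = int(str(schema["schema_version"]).split(".")[0])
    if major != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported schema major version: {major}")

    headline = schema.get("headline_metric", {})
    status = headline.get("status")
    if status not in KNOWN_STATUSES:
        # Exhaustive matching: an unknown enum value is an error, which is
        # why adding a value is treated as a breaking change.
        raise ValueError(f"unknown headline status: {status!r}")
    if status != "ran":
        # Covers "skipped" and "no_scalar_by_design": no scalar to report.
        return None
    return headline.get("value")
```

A consumer pinned to major version 1 would reject a `"2.0"` report at the version guard instead of failing later on the unfamiliar `"no_scalar_by_design"` status.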

8 files changed

Lines changed: 49 additions & 12 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]

 ### Added
-- **`target_parameter` block in BR/DR schemas (experimental)** — BusinessReport and DiagnosticReport now emit a top-level `target_parameter` block naming what the headline scalar actually represents for each of the 16 result classes. Closes BR/DR foundation gap #6 (target-parameter clarity). Fields: `name`, `definition`, `aggregation` (machine-readable dispatch tag), `headline_attribute` (raw result attribute), `reference` (citation pointer). BR's summary emits the short `name` right after the headline; DR's overall-interpretation paragraph does the same; both full reports carry a "## Target Parameter" section with the full definition. Per-estimator dispatch is sourced from REGISTRY.md and lives in the new `diff_diff/_reporting_helpers.py::describe_target_parameter`. A few branches read fit-time config (`EfficientDiDResults.pt_assumption`, `StackedDiDResults.clean_control`, `ChaisemartinDHaultfoeuilleResults.L_max` / `covariate_residuals` / `linear_trends_effects`); others emit a fixed tag (the fit-time `aggregate` kwarg on CS / Imputation / TwoStage / Wooldridge does not change the `overall_att` scalar — disambiguating horizon / group tables is tracked under gap #9). See `docs/methodology/REPORTING.md` "Target parameter" section.
+- **`target_parameter` block in BR/DR schemas (experimental; schema version bumped to 2.0)** — `BUSINESS_REPORT_SCHEMA_VERSION` and `DIAGNOSTIC_REPORT_SCHEMA_VERSION` bumped from `"1.0"` to `"2.0"` because the new `"no_scalar_by_design"` value on the `headline.status` / `headline_metric.status` enum (dCDH `trends_linear=True, L_max>=2` configuration) is a breaking change per the REPORTING.md stability policy. BusinessReport and DiagnosticReport now emit a top-level `target_parameter` block naming what the headline scalar actually represents for each of the 16 result classes. Closes BR/DR foundation gap #6 (target-parameter clarity). Fields: `name`, `definition`, `aggregation` (machine-readable dispatch tag), `headline_attribute` (raw result attribute), `reference` (citation pointer). BR's summary emits the short `name` right after the headline; DR's overall-interpretation paragraph does the same; both full reports carry a "## Target Parameter" section with the full definition. Per-estimator dispatch is sourced from REGISTRY.md and lives in the new `diff_diff/_reporting_helpers.py::describe_target_parameter`. A few branches read fit-time config (`EfficientDiDResults.pt_assumption`, `StackedDiDResults.clean_control`, `ChaisemartinDHaultfoeuilleResults.L_max` / `covariate_residuals` / `linear_trends_effects`); others emit a fixed tag (the fit-time `aggregate` kwarg on CS / Imputation / TwoStage / Wooldridge does not change the `overall_att` scalar — disambiguating horizon / group tables is tracked under gap #9). See `docs/methodology/REPORTING.md` "Target parameter" section.

 ## [3.2.0] - 2026-04-19

diff_diff/_reporting_helpers.py

Lines changed: 18 additions & 2 deletions
@@ -275,7 +275,23 @@ def describe_target_parameter(results: Any) -> Dict[str, Any]:
         }

     if name == "EfficientDiDResults":
+        # PR #347 R7 P1: the BR/DR headline ``overall_att`` is the
+        # library's cohort-size-weighted average over post-treatment
+        # ``(g, t)`` cells (see ``efficient_did.py`` around line 1274
+        # and REGISTRY.md Sec. EfficientDiD). This is distinct from
+        # the paper's ``ES_avg`` uniform event-time average.
+        # Disambiguating this in the stakeholder-facing definition
+        # keeps the user from mistaking one for the other — the
+        # regime (PT-All vs PT-Post) describes identification, not
+        # the aggregation choice for the headline scalar.
         pt_assumption = getattr(results, "pt_assumption", "all")
+        library_aggregation_note = (
+            " The BR/DR headline ``overall_att`` is the library's "
+            "cohort-size-weighted average of ATT(g, t) over post-"
+            "treatment cells, NOT the paper's ``ES_avg`` uniform event-"
+            "time average (see REGISTRY.md Sec. EfficientDiD for the "
+            "distinction)."
+        )
         if pt_assumption == "post":
             return {
                 "name": "overall ATT under PT-Post (single-baseline)",
@@ -284,7 +300,7 @@ def describe_target_parameter(results: Any) -> Dict[str, Any]:
                     "regime (parallel trends hold only in post-treatment "
                     "periods). The baseline is period ``g - 1`` only; the "
                     "estimator is just-identified and reduces to standard "
-                    "single-baseline DiD (Corollary 3.2)."
+                    "single-baseline DiD (Corollary 3.2)." + library_aggregation_note
                 ),
                 "aggregation": "pt_post_single_baseline",
                 "headline_attribute": "overall_att",
@@ -297,7 +313,7 @@ def describe_target_parameter(results: Any) -> Dict[str, Any]:
                 "(parallel trends hold for all groups and all periods). The "
                 "estimator is over-identified (Lemma 2.1) and applies "
                 "optimal-combination weights to achieve the semiparametric "
-                "efficiency bound on the no-covariate path."
+                "efficiency bound on the no-covariate path." + library_aggregation_note
             ),
             "aggregation": "pt_all_combined",
             "headline_attribute": "overall_att",
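
The distinction the new note draws can be made concrete with toy numbers. The sketch below is illustrative only and does not call the library: the cohort sizes, the ATT(g, t) values, and both function names are invented for the example, and weighting cohorts by size within each event time is an assumption about how a per-event-time effect would be formed. The library's actual weights are those documented in REGISTRY.md Sec. EfficientDiD.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Toy inputs (hypothetical): cohort sizes and post-treatment ATT(g, t) cells.
cohort_size: Dict[int, int] = {2: 100, 3: 300}
att_gt: Dict[Tuple[int, int], float] = {
    (2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0,  # cohort first treated at g=2
    (3, 3): 1.0, (3, 4): 2.0,               # cohort first treated at g=3
}


def cohort_size_weighted(att, sizes) -> float:
    """Average over post-treatment (g, t) cells, each weighted by cohort size."""
    total = sum(sizes[g] * v for (g, _t), v in att.items())
    weight = sum(sizes[g] for (g, _t) in att)
    return total / weight


def uniform_event_time_average(att, sizes) -> float:
    """Size-weighted effect per event time e = t - g, then a plain mean over e."""
    num, den = defaultdict(float), defaultdict(float)
    for (g, t), v in att.items():
        num[t - g] += sizes[g] * v
        den[t - g] += sizes[g]
    per_event_time = [num[e] / den[e] for e in sorted(num)]
    return sum(per_event_time) / len(per_event_time)


overall = cohort_size_weighted(att_gt, cohort_size)
es_avg = uniform_event_time_average(att_gt, cohort_size)
print(f"cohort-size-weighted: {overall:.4f}, uniform event-time: {es_avg:.4f}")
# prints: cohort-size-weighted: 1.6667, uniform event-time: 2.0000
```

The two scalars differ because the larger, later-treated cohort contributes fewer post-treatment cells: cell-level weighting pulls the average toward short horizons, while the uniform event-time average gives each horizon equal weight.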

diff_diff/business_report.py

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@
 from diff_diff._reporting_helpers import describe_target_parameter
 from diff_diff.diagnostic_report import DiagnosticReport, DiagnosticReportResults

-BUSINESS_REPORT_SCHEMA_VERSION = "1.0"
+BUSINESS_REPORT_SCHEMA_VERSION = "2.0"

 __all__ = [
     "BusinessReport",

diff_diff/diagnostic_report.py

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@

 from diff_diff._reporting_helpers import describe_target_parameter  # noqa: E402 (top-level import)

-DIAGNOSTIC_REPORT_SCHEMA_VERSION = "1.0"
+DIAGNOSTIC_REPORT_SCHEMA_VERSION = "2.0"

 __all__ = [
     "DiagnosticReport",

docs/methodology/REPORTING.md

Lines changed: 11 additions & 0 deletions
@@ -358,6 +358,17 @@ a library setting.
 anchor tooling on them prematurely; a formal deprecation policy will
 land within two subsequent PRs.

+- **Note:** Schema version 2.0 (both BR and DR). The BR/DR gap #6
+target-parameter PR adds the `headline.status` /
+`headline_metric.status` value `"no_scalar_by_design"` (used for
+the dCDH `trends_linear=True, L_max>=2` configuration where
+`overall_att` is intentionally NaN). Per the stability policy
+above, new enum values are breaking changes, so
+`BUSINESS_REPORT_SCHEMA_VERSION` and
+`DIAGNOSTIC_REPORT_SCHEMA_VERSION` bumped from `"1.0"` to
+`"2.0"`. The schemas remain marked experimental, so the formal
+deprecation policy does not yet apply.
+
 ## Reference implementation(s)

 The phrasing rules follow the guidance in:

tests/test_business_report.py

Lines changed: 5 additions & 5 deletions
@@ -959,7 +959,7 @@ class DiDResults:
         from diff_diff.diagnostic_report import DiagnosticReportResults

         fake_schema = {
-            "schema_version": "1.0",
+            "schema_version": "2.0",
             "estimator": "DiDResults",
             "headline_metric": {"name": "att", "value": 1.0},
             "parallel_trends": {
@@ -1058,7 +1058,7 @@ class DiDResults:
         pt_block["joint_p_value"] = 0.40

         fake_schema = {
-            "schema_version": "1.0",
+            "schema_version": "2.0",
             "estimator": "DiDResults",
             "headline_metric": {"name": "att", "value": 1.0},
             "parallel_trends": pt_block,
@@ -2321,7 +2321,7 @@ def test_br_schema_tier_is_downgraded(self):
         from diff_diff.diagnostic_report import DiagnosticReportResults

         schema = {
-            "schema_version": "1.0",
+            "schema_version": "2.0",
             "estimator": "CallawaySantAnnaResults",
             "headline_metric": {"name": "overall_att", "value": 1.0},
             "parallel_trends": {
@@ -4028,7 +4028,7 @@ def _fragile_dr_schema(self, breakdown_m: float, grid=None):
             for row in grid
         ]
         schema = {
-            "schema_version": "1.0",
+            "schema_version": "2.0",
             "estimator": {"class_name": "CallawaySantAnnaResults", "display_name": "CS"},
             "headline_metric": {},
             "parallel_trends": {"status": "skipped", "reason": "stub"},
@@ -4215,7 +4215,7 @@ def _bacon_schema_with_high_forbidden_weight():
     from diff_diff.diagnostic_report import DiagnosticReportResults

     schema = {
-        "schema_version": "1.0",
+        "schema_version": "2.0",
         "estimator": {"class_name": "Stub", "display_name": "Stub"},
         "headline_metric": {},
         "parallel_trends": {"status": "skipped", "reason": "stub"},

tests/test_diagnostic_report.py

Lines changed: 2 additions & 2 deletions
@@ -173,7 +173,7 @@ def test_schema_version_constant(self, multi_period_fit):
         fit, _ = multi_period_fit
         schema = DiagnosticReport(fit).to_dict()
         assert schema["schema_version"] == DIAGNOSTIC_REPORT_SCHEMA_VERSION
-        assert DIAGNOSTIC_REPORT_SCHEMA_VERSION == "1.0"
+        assert DIAGNOSTIC_REPORT_SCHEMA_VERSION == "2.0"

     def test_all_statuses_use_closed_enum(self, cs_fit):
         fit, sdf = cs_fit
@@ -1931,7 +1931,7 @@ def _render(self, sens_block):
         from diff_diff.diagnostic_report import _render_overall_interpretation

         schema = {
-            "schema_version": "1.0",
+            "schema_version": "2.0",
             "estimator": {"class_name": "CallawaySantAnnaResults", "display_name": "CS"},
             "headline_metric": {
                 "status": "ran",

tests/test_target_parameter.py

Lines changed: 10 additions & 0 deletions
@@ -128,11 +128,21 @@ def test_efficient_did_pt_all(self):
         tp = describe_target_parameter(_minimal_result("EfficientDiDResults", pt_assumption="all"))
         assert tp["aggregation"] == "pt_all_combined"
         assert "PT-All" in tp["name"]
+        # PR #347 R7 P1 regression: the definition must disambiguate
+        # the library's cohort-size-weighted ``overall_att`` from the
+        # paper's uniform-event-time ``ES_avg``.
+        defn = tp["definition"]
+        assert "cohort-size-weighted" in defn
+        assert "ES_avg" in defn
+        assert "post-treatment" in defn.lower()

     def test_efficient_did_pt_post(self):
         tp = describe_target_parameter(_minimal_result("EfficientDiDResults", pt_assumption="post"))
         assert tp["aggregation"] == "pt_post_single_baseline"
         assert "PT-Post" in tp["name"]
+        defn = tp["definition"]
+        assert "cohort-size-weighted" in defn
+        assert "ES_avg" in defn

     def test_continuous_did(self):
         tp = describe_target_parameter(_minimal_result("ContinuousDiDResults"))
