Commit da24a59

igerber and claude committed
Address PR #402 R6 review (1 P3, mass-point + survey vcov caveat)
P3 mass-point + survey vcov requirement: per had.py:3495-3507 the mass-point design rejects the default classical vcov family on the survey_design= path with NotImplementedError (the survey path composes Binder-TSL on the HC1-scale influence function, which targets V_HC1 rather than the classical sandwich). The Step-6 sup-t / cband snippet in _handle_had_event_study and the HAD section in llms-full.txt presented weighted event-study fits as a generic survey_design= path without surfacing this constraint, so the example as written would fail at fit time on a mass-point panel.

Both surfaces now make the requirement explicit:
- The Step-6 snippet uses HeterogeneousAdoptionDiD(vcov_type='hc1', ...) with an inline comment explaining that hc1 is required on mass-point + survey and is a no-op on the continuous designs (which use the CCT-2014 robust SE regardless), making it a safe default for the survey-aware example.
- A new "Mass-point + survey constraint" paragraph in the HAD section of llms-full.txt documents the same requirement and routing.

Tests added (2 new, 92 total):
- test_had_event_study_sup_t_snippet_uses_hc1_for_mass_point_survey_compatibility: asserts the sup-t / cband snippet either uses vcov_type='hc1' / robust=True or surfaces the mass-point + survey vcov requirement inline so agents adapting the snippet on a mass-point panel know to add it.
- test_llms_full_had_section_documents_mass_point_survey_vcov_requirement: asserts the HAD section documents the mass-point + survey vcov requirement (vcov_type mention paired with mass-point context).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 0ecb635 commit da24a59
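
The routing described in the commit message can be pictured with a small stand-alone model. This is a hypothetical sketch, not the code at had.py:3495-3507: the function name `check_vcov_for_survey`, its parameters, the `"classical"` default label, and the returned route labels are all assumptions made for illustration. Only the behavior comes from the commit message: mass-point + survey with the default vcov raises `NotImplementedError`, `vcov_type='hc1'` or `robust=True` is accepted, and the continuous designs ignore `vcov_type`.

```python
from typing import Optional


def check_vcov_for_survey(
    design: str,
    vcov_type: str = "classical",  # assumed label for the default family
    robust: bool = False,
    survey_design: Optional[object] = None,
) -> str:
    """Return the variance route a fit would take in this toy model.

    Hypothetical stand-in for the check described in the commit message;
    it is not the library's implementation.
    """
    if design != "mass_point":
        # Continuous designs ignore vcov_type: CCT-2014 robust SE only.
        return "cct2014_robust"

    hc1_requested = robust or vcov_type == "hc1"
    if survey_design is not None and not hc1_requested:
        # Binder-TSL on the survey path consumes the HC1-scale influence
        # function, so the classical sandwich is rejected here.
        raise NotImplementedError(
            "mass-point + survey_design requires vcov_type='hc1' or robust=True"
        )
    if survey_design is not None:
        return "binder_tsl_hc1"
    return "hc1" if hc1_requested else "classical"
```

Under this model, passing `vcov_type='hc1'` unconditionally never raises, which is why the commit treats it as a safe default for the survey-aware example.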

4 files changed

Lines changed: 76 additions & 1 deletion

diff_diff/guides/llms-full.txt

Lines changed: 2 additions & 0 deletions
@@ -657,6 +657,8 @@ es = est.fit(data, outcome_col='y', unit_col='unit',
 
 **Staggered panels.** On multi-cohort panels with `aggregate="event_study"`, `fit()` auto-filters to the last treatment cohort plus never-treated units (paper Appendix B.2) and emits a `UserWarning` naming kept/dropped counts. The estimand is then a **last-cohort-only WAS**, not a multi-cohort average. For full multi-cohort staggered support, see `ChaisemartinDHaultfoeuille`.
 
+**Mass-point + survey constraint.** When fitting `design="mass_point"` with `survey_design=` (or the deprecated `survey=` alias), `vcov_type="hc1"` (or `robust=True`) is required: the survey path composes the standard error via Binder-TSL on the HC1-scale influence function, so the default classical sandwich path raises `NotImplementedError`. Passing `vcov_type="hc1"` is a safe default on weighted survey examples since `vcov_type` is unused on the continuous designs (CCT-2014 robust SE is the only formula there).
+
 ### StackedDiD
 
 Stacked DiD estimator (Wing, Freedman & Hollingsworth 2024). Addresses TWFE bias with corrective Q-weights.

diff_diff/practitioner.py

Lines changed: 10 additions & 1 deletion
@@ -1078,7 +1078,16 @@ def _handle_had_event_study(results: Any):
 "from diff_diff import HeterogeneousAdoptionDiD, SurveyDesign\n"
 "# Construct your survey design (adapt to your data):\n"
 "sd = SurveyDesign(weights='weight_col')\n"
-"est = HeterogeneousAdoptionDiD(n_bootstrap=999, seed=42)\n"
+"# vcov_type='hc1' is REQUIRED on the mass-point design under\n"
+"# survey_design= (the default classical sandwich raises\n"
+"# NotImplementedError on the survey path because the\n"
+"# Binder-TSL composition consumes the HC1-scale IF -\n"
+"# see had.py:3495-3507). On the continuous designs the\n"
+"# vcov_type kwarg is unused (CCT-2014 robust SE is the\n"
+"# only formula), so passing vcov_type='hc1' is a no-op\n"
+"# there and a safe default for the survey-aware example.\n"
+"est = HeterogeneousAdoptionDiD(\n"
+" n_bootstrap=999, seed=42, vcov_type='hc1')\n"
 "es = est.fit(\n"
 " data, outcome_col='y', unit_col='unit',\n"
 " time_col='t', dose_col='d',\n"

tests/test_guides.py

Lines changed: 24 additions & 0 deletions
@@ -452,6 +452,30 @@ def test_llms_full_had_results_class_field_lists_match_real_dataclass(self):
                 f"is missing the public dataclass field {field.name!r}."
             )
 
+    def test_llms_full_had_section_documents_mass_point_survey_vcov_requirement(self):
+        # Per had.py:3495-3507 the mass-point design rejects the default
+        # classical vcov family on the survey_design= path
+        # (NotImplementedError). The HAD section must surface this
+        # requirement so an agent reading llms-full.txt and writing a
+        # weighted mass-point fit knows to pass vcov_type='hc1'
+        # explicitly. Without this caveat the documented fit() example
+        # can fail at fit time on a mass-point panel.
+        text = get_llm_guide("full")
+        had_start = text.index("### HeterogeneousAdoptionDiD")
+        had_end = text.index("### StackedDiD", had_start)
+        had_text = text[had_start:had_end]
+        # Must mention the mass-point + survey vcov requirement.
+        # Accept either explicit "vcov_type" mention near "mass" wording
+        # or the explicit "hc1" / "robust=True" pairing with mass-point.
+        lower = had_text.lower()
+        assert "vcov_type" in lower and ("mass-point" in lower or "mass_point" in lower), (
+            "HAD section must document the mass-point + survey vcov "
+            "requirement: passing vcov_type='hc1' (or robust=True) is "
+            "required on design='mass_point' under survey_design= "
+            "(per had.py:3495-3507). Without this caveat the documented "
+            "weighted fit example can raise NotImplementedError."
+        )
+
     def test_llms_full_had_variance_formula_describes_all_designs(self):
         # Per diff_diff/had.py:3585-3629, weighted mass-point fits populate
         # variance_formula in {"pweight_2sls", "survey_binder_tsl_2sls"} and
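
The section-scoped assertion in the new test_guides.py test can be tried on a toy string. This is a minimal sketch under stated assumptions: `TOY_GUIDE`, `had_section`, and `documents_requirement` are hypothetical names introduced here, and the toy text stands in for the output of `get_llm_guide("full")`, which the real test reads.

```python
# Hypothetical stand-in for the guide text returned by get_llm_guide("full").
TOY_GUIDE = """\
### HeterogeneousAdoptionDiD

**Mass-point + survey constraint.** Pass `vcov_type="hc1"` on `design="mass_point"`.

### StackedDiD

Stacked DiD estimator.
"""


def had_section(text: str) -> str:
    """Slice from the HAD heading up to the next estimator heading."""
    start = text.index("### HeterogeneousAdoptionDiD")
    # Search for the end marker only from `start` onward, so an earlier
    # occurrence of the StackedDiD heading cannot truncate the slice.
    end = text.index("### StackedDiD", start)
    return text[start:end]


def documents_requirement(section: str) -> bool:
    """Mirror the test's check: vcov_type paired with mass-point wording."""
    lower = section.lower()
    return "vcov_type" in lower and ("mass-point" in lower or "mass_point" in lower)
```

Because `str.index` raises `ValueError` when a heading is missing, a renamed section would fail loudly in this sketch rather than silently passing on an empty slice.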

tests/test_practitioner.py

Lines changed: 40 additions & 0 deletions
@@ -690,6 +690,46 @@ def test_handle_continuous_step_4_snippet_is_valid_python(self, mock_continuous_
             if code.strip():
                 ast.parse(code)  # raises SyntaxError on failure
 
+    def test_had_event_study_sup_t_snippet_uses_hc1_for_mass_point_survey_compatibility(
+        self, mock_had_event_study_results
+    ):
+        # Per had.py:3495-3507 the mass-point design rejects the
+        # default classical vcov family on the survey_design= path
+        # (NotImplementedError). The Step-6 sup-t snippet shows a
+        # generic weighted event-study fit; if it uses the default
+        # vcov_type a copy/paste on a mass-point panel raises at
+        # fit time. Snippet must either use vcov_type='hc1' /
+        # robust=True OR explicitly note the requirement so agents
+        # can adapt.
+        output = practitioner_next_steps(mock_had_event_study_results, verbose=False)
+        step_6_steps = [s for s in output["next_steps"] if s["baker_step"] == 6]
+        assert len(step_6_steps) >= 1
+        # Find the sup-t / cband step (sensitivity step).
+        sup_t = next(
+            (s for s in step_6_steps if "cband" in s.get("code", "")),
+            None,
+        )
+        assert sup_t is not None, "sup-t / cband step not found at baker_step=6"
+        snippet = sup_t.get("code", "")
+        # Either the snippet itself uses vcov_type='hc1' / robust=True
+        # OR it documents the requirement inline (so agents adapting
+        # the snippet on a mass-point panel know to add it).
+        ok = (
+            "vcov_type='hc1'" in snippet
+            or 'vcov_type="hc1"' in snippet
+            or "robust=True" in snippet
+            or ("mass-point" in snippet and "vcov_type" in snippet)
+            or ("mass_point" in snippet and "vcov_type" in snippet)
+        )
+        assert ok, (
+            "Sup-t / cband snippet must either use vcov_type='hc1' / "
+            "robust=True or surface the mass-point + survey vcov "
+            "requirement inline. Per had.py:3495-3507 the default "
+            "classical sandwich raises NotImplementedError on the "
+            "mass-point + survey path; the example as written would "
+            "fail at fit time on a mass-point panel."
+        )
+
     def test_had_results_dataclass_docstrings_match_weighted_mass_point_contract(self):
         # PR #402 R3 fixed the llms-full.txt field descriptions to
         # acknowledge that weighted mass-point fits populate
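
The acceptance rule in the new test_practitioner.py test is a plain substring check, which can be exercised without the library. A minimal sketch, assuming a hypothetical helper name `surfaces_hc1_requirement` and a shortened `SNIPPET` modelled on the Step-6 string above; the real test pulls the snippet from `practitioner_next_steps(...)` instead.

```python
import ast

# Shortened stand-in for the generated Step-6 snippet (illustrative only).
SNIPPET = (
    "est = HeterogeneousAdoptionDiD(\n"
    "    n_bootstrap=999, seed=42, vcov_type='hc1')\n"
)


def surfaces_hc1_requirement(snippet: str) -> bool:
    """Mirror the test's `ok` expression on an arbitrary snippet string."""
    return (
        "vcov_type='hc1'" in snippet
        or 'vcov_type="hc1"' in snippet
        or "robust=True" in snippet
        # An inline note about the requirement is also accepted.
        or ("mass-point" in snippet and "vcov_type" in snippet)
        or ("mass_point" in snippet and "vcov_type" in snippet)
    )


# Syntax check only: ast.parse does not resolve names, so the snippet
# parses even though HeterogeneousAdoptionDiD is not imported here.
ast.parse(SNIPPET)
```

Since the rule matches text rather than behavior, a snippet that only mentions the requirement in a comment also passes; that is the "surface the requirement inline" branch the commit message describes.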
