Commit 903d3ab

Merge pull request #372 from igerber/docs-refresh
Docs refresh: trim README to a 188-line landing page; redirect contributor conventions
2 parents 435dfc2 + 0638426 commit 903d3ab

21 files changed

Lines changed: 846 additions & 3171 deletions

.claude/commands/dev-checklists.md

Lines changed: 5 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -152,8 +152,11 @@ Final checklist before approving a PR:
152152
153153
3. **Documentation Sync**:
154154
- [ ] Docstrings updated for all changed signatures
155-
- [ ] README updated if user-facing behavior changes
156-
- [ ] REGISTRY.md updated if methodology edge cases change
155+
- [ ] `diff_diff/guides/llms.txt` updated if a new estimator/feature appears in the public API (this is the AI-agent contract; it cascades to RTD)
156+
- [ ] `docs/api/*.rst` updated for new modules / signatures
157+
- [ ] `docs/references.rst` updated if a new scholarly source is cited
158+
- [ ] `README.md` updated ONLY if (a) new estimator catalog one-liner, (b) hero/badges/tagline change, or (c) top-level capability paragraph (Diagnostics & Sensitivity, Survey Support). Do NOT add usage examples, parameter tables, or per-estimator sections.
159+
- [ ] `REGISTRY.md` updated if methodology edge cases change
157160
158161
## Quick Reference: Common Patterns to Check
159162

.claude/commands/docs-check.md

Lines changed: 84 additions & 37 deletions
@@ -12,35 +12,54 @@ Verify that documentation is complete and includes appropriate scholarly referen
1212
The user may provide an optional argument: `$ARGUMENTS`
1313

1414
- If empty or "all": Run all checks (including map validation)
15-
- If "readme": Check README.md sections only
16-
- If "refs" or "references": Check scholarly references only
15+
- If "readme": Check the README catalog one-liner only
16+
- If "refs" or "references": Check scholarly references in `docs/references.rst` only
1717
- If "api": Check API documentation (RST files) only
1818
- If "tutorials": Check tutorial coverage only
19-
- If "map": Validate docs/doc-deps.yaml integrity only
19+
- If "map": Validate `docs/doc-deps.yaml` integrity only
20+
21+
## Documentation surface map (post 2026-04 docs refresh)
22+
23+
The README is a **landing page**, not the documentation. Each estimator/feature has documentation across multiple authoritative surfaces:
24+
25+
- **`diff_diff/guides/llms.txt`** - AI-agent contract (one-line catalog entry per estimator with paper citation + RTD link). Source of truth that mirrors into RTD via `html_extra_path` and into the wheel via `get_llm_guide()`.
26+
- **`docs/api/*.rst`** - Sphinx API reference (autoclass).
27+
- **`docs/references.rst`** - Bibliography (one entry per scholarly source, organized by sub-section).
28+
- **`docs/tutorials/*.ipynb`** - Hands-on examples.
29+
- **`README.md`** - **One-line catalog entry only** under `## Estimators` (or `## Diagnostics & Sensitivity` for diagnostic-class features). No usage examples, no parameter tables, no per-estimator section.
2030

2131
## Estimators and Required Documentation
2232

2333
The following estimators/features MUST have documentation:
2434

25-
### Core Estimators (require README section + API docs + references)
26-
27-
| Estimator | README Section | API RST | Reference Category |
28-
|-----------|----------------|---------|-------------------|
29-
| DifferenceInDifferences | "Basic Difference-in-Differences" | estimators.rst | "Difference-in-Differences" |
30-
| TwoWayFixedEffects | "Two-Way Fixed Effects" | estimators.rst | "Two-Way Fixed Effects" |
31-
| MultiPeriodDiD | "Multi-Period" | estimators.rst | "Multi-Period and Staggered" |
32-
| SyntheticDiD | "Synthetic DiD" or "Synthetic Difference" | estimators.rst | "Synthetic Difference-in-Differences" |
33-
| CallawaySantAnna | "Callaway" or "Staggered" | staggered.rst | "Multi-Period and Staggered" |
34-
| SunAbraham | "Sun" and "Abraham" | staggered.rst | "Multi-Period and Staggered" |
35-
| TripleDifference | "Triple Diff" or "DDD" | triple_diff.rst | "Triple Difference" |
36-
| TROP | "TROP" or "Triply Robust" | trop.rst | "Triply Robust Panel" |
37-
| HonestDiD | "Honest DiD" or "sensitivity" | honest_did.rst | "Honest DiD" |
38-
| BaconDecomposition | "Bacon" or "decomposition" | estimators.rst | "Multi-Period and Staggered" |
39-
40-
### Supporting Features (require README mention + API docs)
41-
42-
| Feature | README Mention | API RST |
43-
|---------|----------------|---------|
35+
### Core Estimators (require llms.txt entry + README catalog line + API docs + references)
36+
37+
| Estimator | llms.txt entry | README catalog | API RST | Reference Category |
38+
|-----------|---------------|----------------|---------|-------------------|
39+
| DifferenceInDifferences | "DifferenceInDifferences" | "DifferenceInDifferences" | estimators.rst | "Difference-in-Differences" |
40+
| TwoWayFixedEffects | "TwoWayFixedEffects" | "TwoWayFixedEffects" | estimators.rst | "Two-Way Fixed Effects" |
41+
| MultiPeriodDiD | "MultiPeriodDiD" | "MultiPeriodDiD" | estimators.rst | "Multi-Period and Staggered" |
42+
| SyntheticDiD | "SyntheticDiD" | "SyntheticDiD" | estimators.rst | "Synthetic Difference-in-Differences" |
43+
| CallawaySantAnna | "CallawaySantAnna" | "CallawaySantAnna" | staggered.rst | "Multi-Period and Staggered" |
44+
| SunAbraham | "SunAbraham" | "SunAbraham" | staggered.rst | "Multi-Period and Staggered" |
45+
| ImputationDiD | "ImputationDiD" | "ImputationDiD" | imputation.rst | "Multi-Period and Staggered" |
46+
| TwoStageDiD | "TwoStageDiD" | "TwoStageDiD" | two_stage.rst | "Multi-Period and Staggered" |
47+
| ChaisemartinDHaultfoeuille | "ChaisemartinDHaultfoeuille" | "ChaisemartinDHaultfoeuille" | chaisemartin_dhaultfoeuille.rst | "Multi-Period and Staggered" |
48+
| EfficientDiD | "EfficientDiD" | "EfficientDiD" | efficient_did.rst | "Multi-Period and Staggered" |
49+
| StackedDiD | "StackedDiD" | "StackedDiD" | stacked_did.rst | "Multi-Period and Staggered" |
50+
| ContinuousDiD | "ContinuousDiD" | "ContinuousDiD" | continuous_did.rst | "Multi-Period and Staggered" |
51+
| HeterogeneousAdoptionDiD | "HeterogeneousAdoptionDiD" | "HeterogeneousAdoptionDiD" | had.rst | "Heterogeneous Adoption (No-Untreated Designs)" |
52+
| TripleDifference | "TripleDifference" | "TripleDifference" | triple_diff.rst | "Triple Difference" |
53+
| StaggeredTripleDifference | "StaggeredTripleDifference" | "StaggeredTripleDifference" | staggered.rst | "Triple Difference" |
54+
| WooldridgeDiD | "WooldridgeDiD" | "WooldridgeDiD" | wooldridge_etwfe.rst | "Multi-Period and Staggered" |
55+
| TROP | "TROP" | "TROP" | trop.rst | "Triply Robust Panel" |
56+
| HonestDiD | n/a (in `## Diagnostics`) | n/a (in `## Diagnostics`) | honest_did.rst | "Honest DiD" |
57+
| BaconDecomposition | "BaconDecomposition" | "BaconDecomposition" | bacon.rst | "Multi-Period and Staggered" |
58+
59+
### Supporting Features (require llms.txt mention + API docs; README mention only if landing-page-relevant)
60+
61+
| Feature | llms.txt Mention | API RST |
62+
|---------|------------------|---------|
4463
| Wild bootstrap | "wild" and "bootstrap" | utils.rst |
4564
| Cluster-robust SE | "cluster" | utils.rst |
4665
| Parallel trends | "parallel trends" | utils.rst |
@@ -50,7 +69,7 @@ The following estimators/features MUST have documentation:
5069

5170
## Required Scholarly References
5271

53-
Each estimator category MUST have at least one scholarly reference in README.md:
72+
Each estimator category MUST have at least one scholarly reference in `docs/references.rst`:
5473

5574
### Reference Requirements
5675

@@ -95,33 +114,61 @@ Goodman-Bacon Decomposition:
95114

96115
Determine which checks to run based on `$ARGUMENTS`.
97116

98-
### 2. README Section Check
117+
### 2. llms.txt + README Catalog Check
118+
119+
For each estimator/diagnostic in the table above:
99120

100-
For each estimator in the table above:
101-
1. Read README.md
102-
2. Search for the required section/mention (case-insensitive)
103-
3. Report missing sections
121+
1. Read `diff_diff/guides/llms.txt` and verify the name appears under the right section:
122+
- **Estimators** (e.g. CallawaySantAnna, SunAbraham, TROP, BaconDecomposition): under `## Estimators`
123+
- **Diagnostics-class** (HonestDiD, and any future diagnostic-only entries): under `## Diagnostics and Sensitivity Analysis`
124+
2. Read `README.md` and verify the name appears in the matching flat catalog:
125+
- **Estimators**: in the `## Estimators` section
126+
- **Diagnostics-class** (HonestDiD): in the `## Diagnostics & Sensitivity` section
127+
3. Report missing entries
104128

105129
```bash
106-
# Example: Check if "Callaway" appears in README
107-
grep -i "callaway" README.md
130+
# Extract the README ## Estimators section. Use a flag-based awk because the
131+
# range form `awk '/^## Estimators/,/^## /'` self-terminates on the opening H2.
132+
extract_section() {
133+
awk -v target="$1" '
134+
$0 == "## " target { flag=1; next }
135+
flag && /^## / { flag=0 }
136+
flag { print }
137+
' README.md
138+
}
139+
140+
# Example: an estimator (lives in ## Estimators)
141+
extract_section "Estimators" | grep -c 'CallawaySantAnna'
142+
143+
# Example: a diagnostic (lives in ## Diagnostics & Sensitivity)
144+
extract_section "Diagnostics & Sensitivity" | grep -c 'Honest DiD'
145+
146+
# Always verify both surfaces
147+
grep -c 'CallawaySantAnna' diff_diff/guides/llms.txt
108148
```
109149

150+
Do NOT search for per-estimator README sections - they were intentionally removed in the 2026-04 docs refresh. The README's `## Estimators` and `## Diagnostics & Sensitivity` headings are the only valid catalog surfaces.
151+
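As an illustration only (not part of this commit), the two-surface check described above could be sketched in Python. The helper names `extract_section` and `missing_from_catalog`, and the sample strings standing in for `README.md` and `diff_diff/guides/llms.txt`, are assumptions made for this sketch:

```python
def extract_section(markdown: str, heading: str) -> str:
    """Return the body under the H2 `## <heading>`, up to the next H2."""
    body, inside = [], False
    for line in markdown.splitlines():
        if line == f"## {heading}":
            inside = True          # skip the heading line itself
            continue
        if inside and line.startswith("## "):
            break                  # the next H2 closes the section
        if inside:
            body.append(line)
    return "\n".join(body)


def missing_from_catalog(readme: str, llms: str, names, heading="Estimators"):
    """List names absent from either the README section or llms.txt."""
    section = extract_section(readme, heading)
    return [n for n in names if n not in section or n not in llms]


# Hypothetical stand-ins for README.md and diff_diff/guides/llms.txt.
sample_readme = (
    "# diff-diff\n"
    "## Estimators\n"
    "- CallawaySantAnna\n"
    "## Diagnostics & Sensitivity\n"
    "- Honest DiD\n"
)
sample_llms = "## Estimators\n- CallawaySantAnna\n- SunAbraham\n"

print(missing_from_catalog(sample_readme, sample_llms,
                           ["CallawaySantAnna", "SunAbraham"]))
```

Like the flag-based awk form, this skips the opening heading before looking for the next `## ` line, so the section does not terminate on its own H2.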
110152
### 3. Scholarly References Check
111153

112154
For each reference category:
113-
1. Search README.md References section for required citations
155+
1. Search `docs/references.rst` for required citations (NOT README.md - the bibliography moved out of README in the 2026-04 docs refresh)
114156
2. Verify author names and year appear together
115157
3. Report missing references
116158

117-
Check patterns (case-insensitive):
159+
Check patterns (case-insensitive, run against `docs/references.rst`):
118160
- "Arkhangelsky.*2021" for Synthetic DiD
119161
- "Callaway.*Sant.Anna.*2021" for staggered
120162
- "Rambachan.*Roth.*2023" for Honest DiD
121163
- "Athey.*Imbens.*Qu.*Viviano.*2025" for TROP
122164
- "Goodman.Bacon.*2021" for Bacon decomposition
123165
- etc.
124166

167+
```bash
168+
# Example
169+
grep -i 'Arkhangelsky.*2021' docs/references.rst
170+
```
171+
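A minimal sketch, assuming the check patterns listed above are applied to the text of `docs/references.rst` in one pass. The `PATTERNS` labels, the `missing_references` helper, and the sample text are hypothetical and serve only to show the case-insensitive matching:

```python
import re

# Patterns copied from the check list above; the labels are descriptive only.
PATTERNS = {
    "Synthetic DiD": r"Arkhangelsky.*2021",
    "Staggered": r"Callaway.*Sant.Anna.*2021",
    "Honest DiD": r"Rambachan.*Roth.*2023",
    "TROP": r"Athey.*Imbens.*Qu.*Viviano.*2025",
    "Bacon decomposition": r"Goodman.Bacon.*2021",
}


def missing_references(text: str):
    """Return labels whose pattern matches no single line of `text`.

    `.` does not match a newline by default, so author and year must sit
    on the same line, which mirrors what `grep -i` would report.
    """
    return [
        label
        for label, pattern in PATTERNS.items()
        if not re.search(pattern, text, re.IGNORECASE)
    ]


# Hypothetical stand-in for the contents of docs/references.rst.
sample = "Arkhangelsky et al. (2021)\nRambachan & Roth (2023)\n"
print(missing_references(sample))
```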
125172
### 4. API Documentation Check
126173

127174
For each RST file in `docs/api/`:
@@ -176,12 +223,12 @@ Generate a summary report:
176223
```
177224
=== Documentation Completeness Check ===
178225
179-
README Sections:
180-
[PASS] DifferenceInDifferences - Found in "Basic Difference-in-Differences"
181-
[PASS] CallawaySantAnna - Found in "Staggered Adoption"
182-
[FAIL] NewEstimator - NOT FOUND
226+
llms.txt + README Catalog:
227+
[PASS] DifferenceInDifferences - Found in llms.txt and README Estimators catalog
228+
[PASS] CallawaySantAnna - Found in both surfaces
229+
[FAIL] NewEstimator - missing from llms.txt and README catalog
183230
184-
Scholarly References:
231+
Scholarly References (docs/references.rst):
185232
[PASS] Synthetic DiD - Arkhangelsky et al. (2021)
186233
[PASS] Honest DiD - Rambachan & Roth (2023)
187234
[FAIL] Bacon Decomposition - Missing Goodman-Bacon (2021)

.claude/commands/pre-merge-check.md

Lines changed: 3 additions & 1 deletion
@@ -187,7 +187,9 @@ Based on your changes to: <list of changed files>
187187
```
188188
### Documentation Sync
189189
- [ ] Docstrings updated for changed function signatures
190-
- [ ] README updated if user-facing behavior changes
190+
- [ ] `diff_diff/guides/llms.txt` updated if the public API surface changed (AI-agent contract)
191+
- [ ] `docs/api/*.rst` and `docs/references.rst` updated as appropriate
192+
- [ ] `README.md` updated ONLY for landing-page-relevant changes (catalog one-liner, hero/badges/tagline, top-level capability paragraph). Per CONTRIBUTING.md, README is not the place for usage examples or per-estimator sections.
191193
```
192194

193195
#### If This Appears to Be a Bug Fix

.claude/commands/review-plan.md

Lines changed: 1 addition & 1 deletion
@@ -225,7 +225,7 @@ Check for **missing related changes**:
225225
- Tests for new/changed functionality
226226
- `__init__.py` export updates
227227
- `get_params()` / `set_params()` updates for new parameters
228-
- Documentation updates (README, RST, tutorials, CONTRIBUTING.md, CLAUDE.md if design patterns change)
228+
- Documentation updates (`diff_diff/guides/llms.txt` for new public-API surfaces, `docs/api/*.rst`, `docs/references.rst` for new citations, tutorials, CONTRIBUTING.md, CLAUDE.md if design patterns change). README updates only if the change affects the landing page (new estimator catalog one-liner, hero/badges/tagline, top-level capability paragraph) - per CONTRIBUTING.md, README is not the place for usage examples or per-estimator sections.
229229
- For bug fixes: did the plan grep for ALL occurrences of the pattern, or just the one reported?
230230

231231
Check for **unnecessary additions**:

.claude/commands/review-pr.md

Lines changed: 5 additions & 3 deletions
@@ -45,8 +45,10 @@ Analyze PRs across 6 dimensions:
4545

4646
### 4. Documentation Review
4747
- Docstrings for new/modified functions
48-
- README updates if needed
49-
- API documentation (RST files)
48+
- `diff_diff/guides/llms.txt` updated if a new public-API surface landed (AI-agent contract)
49+
- API documentation (RST files in `docs/api/`)
50+
- `docs/references.rst` updated for new scholarly citations
51+
- README updated ONLY for landing-page-relevant changes (catalog one-liner, hero/badges/tagline, top-level capability paragraph). Per CONTRIBUTING.md, README is not the place for usage examples or per-estimator sections.
5052
- Inline comments for complex logic
5153

5254
### 5. Performance
@@ -133,7 +135,7 @@ If no PR number is provided, use AskUserQuestion to request it.
133135

134136
## Part 4: Documentation Assessment
135137

136-
[Check for docstrings, README updates, API docs as needed]
138+
[Check docstrings, llms.txt for new public-API surfaces, API RST docs, references.rst for new citations, README only for landing-page-relevant changes]
137139

138140
---
139141

BRIEFING.md

Lines changed: 80 additions & 59 deletions
@@ -1,59 +1,80 @@
1-
# dcdh-by-path — Briefing
2-
3-
## The ask
4-
5-
Clément de Chaisemartin (dCDH author) suggested implementing the `by_path`
6-
option from R's `did_multiplegt_dyn`. It disaggregates the dynamic event-study
7-
by observed treatment trajectory so practitioners can compare paths like:
8-
9-
- `(0,1,0,0)` — one pulse
10-
- `(0,1,1,0)` — two periods on, then off
11-
- `(0,1,1,1)` — three periods on, then off
12-
- `(0,1,0,1)` vs `(0,1,1,0)` — sequencing
13-
14-
Use case: "is a single pulse enough, or do you need sustained exposure?"
15-
16-
## Where we stand today
17-
18-
`diff_diff/chaisemartin_dhaultfoeuille.py` implements `ChaisemartinDHaultfoeuille`.
19-
20-
- Supports reversible on/off treatments (the only estimator in the library
21-
that does)
22-
- **Currently drops multi-switch groups by default** (`drop_larger_lower=True`) —
23-
exactly the groups `by_path` wants to keep and compare
24-
- Stratifies by direction cohort (`DID_+`, `DID_-`, `S_g = sign(Δ)`) but not
25-
by trajectory
26-
- No `by_path`, `treatment_path`, or path-enumeration code exists anywhere
27-
- Not on ROADMAP.md; not in TODO.md
28-
29-
## Shape of the work
30-
31-
1. Parameter: likely `by_path: bool = False` (implies `drop_larger_lower=False`)
32-
2. Enumerate unique treatment histories `(D_{g,1}, …, D_{g,T})` per group;
33-
optionally accept a user-specified subset of paths of interest
34-
3. Per-path `DID_{g,l}` aggregation with influence-function SEs per path
35-
4. Result container extension: `path_effects` dict keyed by trajectory tuple,
36-
each holding ATT + SE + CI vectors
37-
5. Decide interaction with `drop_larger_lower`: probably forbid both being
38-
non-default simultaneously, or have `by_path` override
39-
6. REGISTRY.md section on path-heterogeneity methodology + deviation notes
40-
7. Methodology reference: `did_multiplegt_dyn` manual §on `by_path`; dCDH
41-
dynamic paper for the `DID_{g,l}` building block (already cited in REGISTRY)
42-
43-
## Open methodology questions (for plan mode)
44-
45-
- Which paths are enumerable? All observed, or user-specified subset only?
46-
R's default behavior on cardinality control is worth checking.
47-
- How does path stratification interact with the current cohort pooling
48-
`(D_{g,1}, F_g, S_g)` used for variance recentering — does it still apply
49-
per path?
50-
- Placebo and TWFE diagnostics: compute per-path or overall only?
51-
- Bootstrap interaction: per-path bootstrap blocks vs single bootstrap with
52-
per-path aggregation
53-
54-
## Before starting
55-
56-
- Pull the R manual section on `by_path` for `did_multiplegt_dyn` — the option
57-
spec there is load-bearing; don't infer from usage examples alone
58-
- Methodology changes: consult `docs/methodology/REGISTRY.md` first
59-
- New estimator surface → budget ~12-20 CI review rounds
1+
# docs-refresh — Briefing
2+
3+
## The goal
4+
5+
Two-part documentation sweep, sequenced as one initiative across multiple PRs:
6+
7+
1. **README.md aggressive trim**
8+
2. **RTD staleness audit + targeted fixes**
9+
10+
Tutorial work is OUT OF SCOPE — that's a separate worktree (`dcdh-tutorial`).
11+
12+
## Why now
13+
14+
Recent releases (3.0.x → 3.3.0) shipped a lot of new surface area without
15+
proportional README/RTD updates:
16+
17+
- HeterogeneousAdoptionDiD (entirely new estimator, multi-phase)
18+
- profile_panel() + llms-autonomous.txt
19+
- dCDH by_path + R parity
20+
- SDiD survey support across all three variance methods
21+
- BR/DR target_parameter (schema 2.0)
22+
- TROP backend parity
23+
24+
README is too long for skim consumption (SEO + first-impression problem).
25+
RTD likely has stale pages, missing API references, and outdated examples.
26+
27+
## Sequencing
28+
29+
### PR 1 — README aggressive trim
30+
Target a tight shape:
31+
- One-line value prop
32+
- Install (`pip install diff-diff`)
33+
- Minimal working example (5-10 lines, one estimator)
34+
- Estimator-list one-liner with link to RTD for full reference
35+
- Citation + license
36+
37+
Aggressive cuts. Anything that belongs on RTD goes to RTD (or stays there if
38+
already there). Don't try to be the docs.
39+
40+
Out of scope: rewriting RTD content that the README links to.
41+
42+
### PR 2+ — RTD staleness audit + fixes
43+
44+
Audit step (read-only):
45+
- Walk `docs/` and identify pages missing post-3.0.x estimators / surfaces
46+
- Cross-reference `docs/doc-deps.yaml` to surface known dependency drift
47+
- Categorize: missing API page, stale example, broken link, outdated narrative
48+
49+
Then fix in scoped PRs (one PR per coherent batch — e.g., "Add HAD API reference
50+
+ choosing-estimator entry", "Refresh practitioner decision tree for 3.3.0").
51+
52+
## What to read first
53+
54+
- `README.md` (current state, length)
55+
- `docs/index.rst` (RTD entry point)
56+
- `docs/doc-deps.yaml` (source-to-doc dependency map)
57+
- `docs/api/` (API reference pages — what's missing)
58+
- `docs/methodology/REGISTRY.md` (don't reformat; just cross-check it's
59+
referenced from RTD where appropriate)
60+
- `CLAUDE.md` "Documenting Deviations" section (label patterns, don't violate)
61+
62+
## Memory rules to honor
63+
64+
- Hyphens, not em dashes (writing style)
65+
- No competitor mentions in formal docs (ROADMAP / user-facing)
66+
- No version numbers as RTD section headings
67+
- diff-diff perspective (not neutral comparisons)
68+
- Tutorial-scope discipline does NOT apply here — this is reference docs
69+
70+
## Out of scope
71+
72+
- New tutorials (separate `dcdh-tutorial` worktree owns DCDH; HAD tutorial queued after)
73+
- ROADMAP.md restructuring (separate concern)
74+
- BR/DR positioning beyond "experimental preview" framing (per memory)
75+
76+
## Cleanup note
77+
78+
This BRIEFING.md was accidentally committed to main from a prior worktree
79+
session. Long-term, drop it from main and add to .gitignore so worktree
80+
briefings stay local.

CLAUDE.md

Lines changed: 15 additions & 0 deletions
@@ -114,6 +114,21 @@ category (`Methodology/Correctness`, `Performance`, or `Testing/Docs`):
114114
|-------|----------|----|----------|
115115
| Description of deferred item | `file.py` | #NNN | Medium/Low |
116116

117+
## README discipline
118+
119+
`README.md` is a **landing page**, not the documentation. Target ~190 lines. The 3,119-line README that existed before the 2026-04 docs refresh grew because workflow conventions told contributors to add to README on every change.
120+
121+
When adding new functionality, the source of truth is:
122+
123+
- **`diff_diff/guides/llms.txt`** for the AI-agent contract (one-line catalog entry per estimator with paper citation + RTD link). This file is bundled in the wheel and published on RTD via `docs/conf.py` `html_extra_path`.
124+
- **`docs/api/*.rst`** for full API reference.
125+
- **`docs/references.rst`** for scholarly citations.
126+
- **`docs/tutorials/*.ipynb`** for hands-on examples.
127+
- **`CHANGELOG.md`** for release notes.
128+
- **`README.md`** for ONE LINE in the `## Estimators` flat catalog (or `## Diagnostics & Sensitivity` for diagnostic-class features). Do NOT add usage examples, parameter tables, per-estimator sections, or full bibliographies.
129+
130+
`/docs-impact` and `/docs-check` enforce these surfaces. See `CONTRIBUTING.md` "README is a landing page, not the docs" for the full convention.
131+
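The "~190 lines" target above is stated as a convention. Nothing in this commit shows it being checked automatically. A hypothetical guard could look like the following sketch, where the function name and the 200-line default budget are both assumptions:

```python
def readme_over_budget(text: str, budget: int = 200) -> bool:
    """Report whether the README text exceeds an assumed line budget."""
    return len(text.splitlines()) > budget


# Hypothetical usage with in-memory text instead of reading README.md.
landing_page = "\n".join(f"line {i}" for i in range(188))
print(readme_over_budget(landing_page))
```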
117132
## Testing Conventions
118133

119134
- **`ci_params` fixture** (session-scoped in `conftest.py`): Use `ci_params.bootstrap(n)` and
