
Commit 8c68855

igerber and claude committed
Add inline regeneration script to HRS fixture README
P3: Replace reference to non-committed plan file with the exact Python command sequence used to build hrs_edid_validation.csv.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 2340292 commit 8c68855

1 file changed

Lines changed: 22 additions & 1 deletion


tests/data/README.md

@@ -31,4 +31,25 @@ Section 6:
 **Columns:** `unit` (hhidpn), `time` (wave), `outcome` (oop_spend, 2005 dollars), `first_treat` (first_hosp)
 
 **Regeneration:** Requires the Dobkin et al. replication kit (`.gitignore`d as `replication_data/`).
-The extraction logic is documented in the plan file and was executed as a one-time preprocessing step.
+
+```python
+import pandas as pd, numpy as np
+df = pd.read_stata("replication_data/116186-V1/Replication-Kit/HRS/Data/HRS_long.dta")
+sub = df[df["wave"].isin([7, 8, 9, 10, 11])]
+balanced = sub.groupby("hhidpn")["wave"].nunique()
+sub = sub[sub["hhidpn"].isin(balanced[balanced == 5].index)]
+sub = sub[sub["hhidpn"].isin(sub[sub["first_hosp"].notna()]["hhidpn"].unique())]
+fh = sub.groupby("hhidpn")["first_hosp"].first()
+sub = sub[sub["hhidpn"].isin(fh[fh >= 8].index)]
+ages = sub.groupby("hhidpn")["age_hosp"].first()
+sub = sub[sub["hhidpn"].isin(ages[(ages >= 50) & (ages <= 59)].index)]
+sub = sub[sub["wave"] <= 10]
+sub["first_treat"] = sub["first_hosp"].apply(lambda x: np.inf if x == 11 else int(x))
+out = sub[["hhidpn", "wave", "oop_spend", "first_treat"]].copy()
+out.columns = ["unit", "time", "outcome", "first_treat"]
+out["unit"] = out["unit"].astype(int)
+out["time"] = out["time"].astype(int)
+out.sort_values(["unit", "time"]).reset_index(drop=True).to_csv(
+    "tests/data/hrs_edid_validation.csv", index=False
+)
+```
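
The script's filters imply a few structural properties of the fixture it writes: four waves per unit (7 through 10), one row per unit and wave, and `first_treat` no earlier than wave 8, with `np.inf` marking units first hospitalized at wave 11. A minimal sketch of a sanity check for those properties follows. It is not part of this commit. The helper name `check_fixture` and the demo values are hypothetical, and the check that `first_treat` is constant within a unit is an assumption inferred from the script taking `.first()` per `hhidpn`.

```python
import numpy as np
import pandas as pd


def check_fixture(df: pd.DataFrame) -> list:
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper: the invariants are inferred from the regeneration
    script, not taken from the repository's test suite.
    """
    problems = []
    if list(df.columns) != ["unit", "time", "outcome", "first_treat"]:
        return ["unexpected columns"]
    # The script keeps waves 7-11 for balancing, then drops wave 11.
    if not set(df["time"].unique()) <= {7, 8, 9, 10}:
        problems.append("time outside waves 7-10")
    if df.duplicated(["unit", "time"]).any():
        problems.append("duplicate unit-time rows")
    if not (df.groupby("unit")["time"].nunique() == 4).all():
        problems.append("panel is not balanced over 4 waves")
    # Units with first_hosp < 8 are filtered out; wave 11 is recoded to inf.
    if (df["first_treat"] < 8).any():
        problems.append("first_treat earlier than wave 8")
    # Assumption: first_hosp is a unit-level constant in the source data.
    if (df.groupby("unit")["first_treat"].nunique() != 1).any():
        problems.append("first_treat varies within a unit")
    return problems


# Synthetic two-unit panel (made-up outcomes): unit 1 is first treated at
# wave 9, unit 2 is not treated within the retained waves.
demo = pd.DataFrame({
    "unit": [1] * 4 + [2] * 4,
    "time": [7, 8, 9, 10] * 2,
    "outcome": [100.0, 120.0, 900.0, 400.0, 80.0, 90.0, 85.0, 95.0],
    "first_treat": [9.0] * 4 + [np.inf] * 4,
})
problems = check_fixture(demo)

# Dropping the last row leaves unit 2 with only three waves.
broken_problems = check_fixture(demo.iloc[:-1])
```

To run the same checks against the real fixture, pass `pd.read_csv("tests/data/hrs_edid_validation.csv")` to `check_fixture`.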

0 commit comments
