forked from openai/parameter-golf
06_EXPERIMENT_REGISTRY.jsonl
{"id":"E000","type":"bootstrap","description":"Project initialized, upstream forked, frontier state ingested","timestamp":"2026-03-23T03:30:00Z","status":"complete"}
{"id":"E001","type":"ops","description":"Verified live frontier from upstream README and issue #140, extracted exact PR heads for #414/#505/#508, and added repro-sync plus TTT legality audit tooling","timestamp":"2026-03-23T19:15:00Z","status":"complete"}
{"id":"E002","type":"ops","description":"Added durable run control: pod-local watchdog, PR #414 smoke/full run specs, and 11_RUN_CONTROL state files for resumable execution.","timestamp":"2026-03-23T20:20:00Z","status":"complete"}
{"id":"E003","type":"smoke","description":"Completed PR #414 operational smoke run on 1x H100 SXM: train/eval/artifact path worked end-to-end, but score was not promotion-worthy and is not publishable.","timestamp":"2026-03-23T20:37:00Z","status":"complete"}
{"id":"E004","type":"ops","description":"Managed PR #414 smoke on 1x H100 PCIe reached the watchdog and mirror path but failed during bootstrap because FLASH_ATTN_REF=v3.0.0 did not exist upstream; pod was stopped and the control plane was repinned to a real upstream commit.","timestamp":"2026-03-24T03:24:00Z","status":"complete"}
{"id":"E005","type":"smoke","description":"Managed PR #414 smoke completed successfully on 1x H100 PCIe after repinning FlashAttention. End-to-end operational path is now validated, but the run remains non-publishable and exposed a massive FlashAttention bootstrap tax that must be amortized before any 8x reproduction.","timestamp":"2026-03-24T05:57:26Z","status":"complete"}
{"id":"E006","type":"ops","description":"Validated FlashAttention warm-start on 1x H100 PCIe. Bootstrap-only run completed in 345.076s using the cached payload, proving the 7709.693s source-build tax can be amortized before any future expensive run.","timestamp":"2026-03-25T06:54:11Z","status":"complete"}
{"id":"E007","type":"ops","description":"Synced exact upstream record files for the current highest-priority reproduction targets PR #868 and PR #913, plus refreshed the sync tool to keep those targets in the standard manifest.","timestamp":"2026-03-27T06:05:00Z","status":"complete"}
{"id":"E008","type":"ops","description":"Synced exact upstream record files for PR #933 and refreshed the project memory to reflect the March 27-28 cache frontier shift.","timestamp":"2026-03-28T04:20:00Z","status":"complete"}
{"id":"E009","type":"full_repro","description":"Completed the first full provider-staged 8x H100 SXM PR #868 repro. The run finished cleanly and produced val_bpb 0.09749802 with a 13,416,133-byte artifact, but it is outside the project repro-tolerance band relative to the claimed 0.11814796 and therefore requires mismatch analysis before promotion.","timestamp":"2026-03-28T15:11:12Z","status":"complete"}
{"id":"E010","type":"audit","description":"Completed the PR #868 mismatch audit. The base model path matches upstream closely, but the n-gram evaluator sees 63 chunks instead of the upstream 237, pointing to eval-surface drift from an unpinned challenge-data snapshot rather than a simple top-level config mismatch.","timestamp":"2026-03-28T19:32:14Z","status":"complete"}
{"id":"E011","type":"ops","description":"Froze a tracked PR #868 challenge-data surface, generated a pinned-manifest parity rerun spec, and armed a final $100 self-funded parity campaign so the next spend resolves the reproduction gap instead of widening search.","timestamp":"2026-03-28T20:15:00Z","status":"complete"}
{"id":"E012","type":"ops","description":"The first pinned-manifest PR #868 parity rerun failed during prepare_data because the manifest override path was passed to the pod as a literal ${REPO_DIR}/... string. Patched path expansion and regenerated the parity spec for an immediate same-campaign rerun.","timestamp":"2026-03-28T21:25:00Z","status":"complete"}
{"id":"E013","type":"full_repro","description":"Completed the pinned-manifest PR #868 parity rerun on 8x H100 SXM. The run still landed at val_bpb 0.09674850 with 63 chunks versus upstream 237, so the divergence persists after the frozen eval surface.","timestamp":"2026-03-28T21:48:16Z","status":"complete"}
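
The records above share five keys (`id`, `type`, `description`, `timestamp`, `status`), use sequential `E000`-style ids, and carry UTC timestamps. A minimal sketch of how such a registry might be sanity-checked is shown below. The function names (`parse_registry`, `check_order`, `relative_gap`) are hypothetical and not part of this repository, and the project's actual repro-tolerance band is not stated in the registry, so the sketch only computes the gap and does not judge it.

```python
import json
from datetime import datetime

# Keys that every record in the registry above carries.
REQUIRED_KEYS = {"id", "type", "description", "timestamp", "status"}


def parse_registry(text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        records.append(record)
    return records


def check_order(records):
    """Return a list of problems: non-sequential ids or decreasing timestamps."""
    problems = []
    previous_time = None
    for index, record in enumerate(records):
        expected_id = f"E{index:03d}"  # assumes ids start at E000 with no gaps
        if record["id"] != expected_id:
            problems.append(f"expected {expected_id}, found {record['id']}")
        # Replace the trailing "Z" so fromisoformat works on older Pythons too.
        stamp = datetime.fromisoformat(record["timestamp"].replace("Z", "+00:00"))
        if previous_time is not None and stamp < previous_time:
            problems.append(f"{record['id']}: timestamp earlier than previous record")
        previous_time = stamp
    return problems


def relative_gap(observed, claimed):
    """Signed relative difference of an observed score from a claimed score."""
    return (observed - claimed) / claimed
```

Applied to the E009 figures, `relative_gap(0.09749802, 0.11814796)` comes out at roughly 17% below the claimed value, which is the size of the mismatch that E010 goes on to audit.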