
Record: SP8192 + Systems Optimization — val_bpb 1.0801 (3-seed mean)#1583

Open
codemath3000 wants to merge 1 commit into openai:main from codemath3000:submission/systems-opt-sp8192

Conversation


@codemath3000 codemath3000 commented Apr 13, 2026

Summary

  • val_bpb: 1.0801 (3-seed mean, std 0.0001) | 8xH100 SXM, 600s | Legal TTT
  • Systems-level optimizations on the SOTA stack of PR #1493 (Record: SP8192 + 3-Layer Recurrence + Parallel Residuals + QK-Gain 5.25 + Legal TTT, val_bpb 1.0810, 3-seed mean): fused Muon kernel, batched EMA, superchunk eval
  • Identical ML; faster step time yields extra training steps in the same 600s budget
  • Per Record Criterion 1: "For submissions that improve speed through systems optimization without changing the ML, this requirement [0.005 nats] is waived." This submission changes only systems-level code (kernel fusion, batched ops, memory preallocation) without altering model architecture, optimizer logic, loss function, or any hyperparameter, meaning the 0.005 nats threshold is waived.
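The batched EMA mentioned above can be sketched with PyTorch's multi-tensor `_foreach_` ops. This is a minimal illustration of the technique, not the submission's actual code; it assumes a standard parameter EMA with a scalar decay:

```python
import torch

@torch.no_grad()
def ema_update_batched(ema_params, model_params, decay=0.999):
    """One multi-tensor kernel launch per op instead of a Python
    loop over parameters (the 'batched EMA' systems optimization).
    Illustrative sketch; not the code from this PR."""
    torch._foreach_mul_(ema_params, decay)
    torch._foreach_add_(ema_params, model_params, alpha=1.0 - decay)

@torch.no_grad()
def ema_update_loop(ema_params, model_params, decay=0.999):
    """Reference per-tensor loop: numerically identical, but pays
    Python-loop and per-tensor kernel-launch overhead each step."""
    for e, p in zip(ema_params, model_params):
        e.mul_(decay).add_(p, alpha=1.0 - decay)
```

Because the two variants are numerically identical, swapping the loop for the batched form changes only step time, which is what makes the 0.005-nats waiver applicable.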

Submission series: This PR is one of three related submissions applying the same systems optimizations to different base stacks (PR #1493, PR #1529, PR #1578). We submit against multiple bases so that a ready-to-merge option exists regardless of how the pending PRs are resolved. Judges should feel free to evaluate whichever base(s) they consider valid and disregard the rest.

Results

| Seed | TTT BPB | Artifact (bytes) |
|------|---------|------------------|
| 0    | 1.0799  | 15,993,737       |
| 3141 | 1.0801  | 15,995,437       |
| 42   | 1.0802  | 15,993,201       |
| Mean | 1.0801  | 15,994,125       |
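As a sanity check on the table, the reported means follow directly from the per-seed rows (pure arithmetic, nothing assumed beyond the table itself):

```python
from statistics import mean

# Per-seed results copied from the table above
bpb = {0: 1.0799, 3141: 1.0801, 42: 1.0802}
artifact_bytes = {0: 15_993_737, 3141: 15_995_437, 42: 15_993_201}

mean_bpb = round(mean(bpb.values()), 4)          # 1.0801, as reported
mean_artifact = round(mean(artifact_bytes.values()))  # 15,994,125, as reported
```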

Test plan

  • 3-seed training on 8xH100 SXM (seeds 0, 3141, 42)
  • All artifacts under 16MB
  • All runs under 600s training + 600s eval
  • Round-trip quantization verified
  • Reproducibility to be verified by judges
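The round-trip quantization item above can be sketched as a simple bit-exactness check. The helper below is hypothetical, and it assumes bf16-quantized artifacts; the PR does not specify its actual artifact format:

```python
import torch

def round_trip_ok(params):
    """Hypothetical round-trip check: quantize each tensor to bf16,
    dequantize to fp32, re-quantize, and require bit-exact equality.
    bf16 -> fp32 is lossless, so a correct serializer must pass this."""
    for p in params:
        q = p.to(torch.bfloat16)                      # quantize
        rt = q.to(torch.float32).to(torch.bfloat16)   # round trip
        if not torch.equal(q, rt):
            return False
    return True
```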

🤖 Generated with Claude Code

Systems-level optimizations (fused Muon, EMA foreach, superchunk eval)
on the PR openai#1493 SOTA stack. Identical ML; faster step time yields extra
training steps. 3-seed mean: 1.0801 BPB / 2.7899 nats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
