Three Breadsticks: 1.1190 BPB (3-seed mean 1.1195) by newjordan · Pull Request #656 · openai/parameter-golf

newjordan · 2026-03-24T23:17:18Z

Results

Seed	Pre-TTT BPB	TTT BPB	Artifact
1337	1.1196	1.1195	15.90 MB
42	1.1199	1.1200	15.61 MB
2045	1.1191	1.1190	15.81 MB
Mean	1.1195	1.1195	—
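
The Mean row can be re-derived from the three per-seed rows. A minimal check, with the values copied from the table above (the variable and function names are illustrative, not taken from the submission):

```python
# Per-seed BPB values copied from the Results table: seed -> (pre-TTT, post-TTT).
results = {
    1337: (1.1196, 1.1195),
    42: (1.1199, 1.1200),
    2045: (1.1191, 1.1190),
}


def mean(values):
    """Arithmetic mean of an iterable of floats."""
    values = list(values)
    return sum(values) / len(values)


pre_ttt_mean = mean(pre for pre, _ in results.values())
post_ttt_mean = mean(post for _, post in results.values())

# Per-seed change from applying TTT (negative means TTT lowered BPB).
ttt_deltas = {seed: post - pre for seed, (pre, post) in results.items()}

print(f"pre-TTT mean:  {pre_ttt_mean:.4f}")
print(f"post-TTT mean: {post_ttt_mean:.4f}")
```

Both means round to 1.1195, and the per-seed TTT deltas are about 0.0001 BPB in magnitude, negative for seeds 1337 and 2045 and positive for seed 42.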

Progression

PR	Mean BPB	Notes
#577, #533	1.1207*	Initial GPTQ submission
#578, #508	1.1215	QAT + TTT refinement
#587	1.1208	XSA + quantization tuning
#656	1.1195	Activation + eval improvements

*single seed

Architecture

11L/512d U-Net, 26.93M params. GPTQ int6+zstd, legal score-first TTT.
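
To make "int6" concrete, here is a rough sketch of symmetric round-to-nearest quantization to a 6-bit range, plus the raw packed size that 26.93M parameters would occupy at 6 bits each. This is not GPTQ (which additionally uses calibration data to compensate rounding error) and it is not the submission's code. The function names, the per-row scale, the symmetric range, and the assumption that every parameter is stored at 6 bits are all illustrative.

```python
def quantize_int6(row):
    """Symmetric round-to-nearest quantization of one weight row to [-31, 31].

    Illustrative only: plain rounding, NOT GPTQ's error-compensated rounding.
    """
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 31 if max_abs > 0 else 1.0
    codes = [max(-31, min(31, round(w / scale))) for w in row]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to exercise the functions.
row = [0.31, -0.12, 0.05, -0.31, 0.0]
codes, scale = quantize_int6(row)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(row, restored))

# Raw packed size if all 26.93M parameters were stored at 6 bits each
# (an assumption; the real artifact may mix precisions and include metadata).
n_params = 26_930_000
raw_int6_bytes = n_params * 6 // 8

print(f"max round-trip error: {max_error:.6f}")
print(f"raw int6 size: {raw_int6_bytes / 1e6:.2f} MB before compression")
```

Under these assumptions the raw 6-bit packing comes to roughly 20.2 MB (decimal), so the reported 15.61 to 15.90 MB artifacts would rely on the zstd stage for the remaining reduction.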

Reproduce

SEED=2045 torchrun --nproc_per_node=8 train_gpt.py

8xH100 SXM, 600s wallclock, ~6,900 steps.

11L/512d U-Net with leaky_relu_sq (slope 0.5), XSA last 4, bigram 1536, legal score-first TTT (freeze_blocks=0, grad_clip=0.8).

3-seed results:
seed 1337: 1.1195 post-TTT (15.90MB)
seed 42: 1.1200 post-TTT (15.61MB)
seed 2045: 1.1190 post-TTT (15.81MB)
mean: 1.1195

Run:
SEED=2045 MLP_ACT=leaky_relu_sq MLP_LEAKY_SLOPE=0.5 \
XSA_LAST_N=4 BIGRAM_VOCAB_SIZE=1536 \
TTT_FREEZE_BLOCKS=0 TTT_GRAD_CLIP=0.8 \
torchrun --nproc_per_node=8 train_gpt.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
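
For readers unfamiliar with the leaky_relu_sq activation named in the commit message, one plausible reading is "apply a leaky ReLU with the given slope, then square the result". The scalar sketch below assumes that reading. The actual definition lives in train_gpt.py, which is not shown in this thread, and may differ (for example by preserving the sign of negative inputs).

```python
def leaky_relu(x, slope=0.5):
    """Leaky ReLU: identity for x >= 0, scaled by `slope` for x < 0."""
    return x if x >= 0 else slope * x


def leaky_relu_sq(x, slope=0.5):
    """Assumed form of leaky_relu_sq: square of the leaky ReLU output.

    With slope=0.5, a negative input x maps to (0.5 * x) ** 2 = 0.25 * x**2,
    so negative inputs still contribute, at a quarter of the positive weight.
    """
    y = leaky_relu(x, slope)
    return y * y


for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, leaky_relu_sq(x))
```

Under the same assumption, the tensor equivalent in PyTorch would be `torch.nn.functional.leaky_relu(x, negative_slope=0.5).square()`.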

valerio-oai · 2026-03-25T06:20:17Z

This submission does not include a submission.json or train logs, so I can't verify it enough to score it. Additionally, from the training code it looks like it applies GPTQ with training data calibration at eval time, which is disallowed. Closing for now.

newjordan · 2026-03-25T12:51:37Z

The GPTQ concern is wrong: GPTQ runs during training, not eval. But we need to address it clearly. You really went pretty hard on closing the PR as opposed to letting us submit the logs, Valerio. I would appreciate a bit more nuance as you operate as a judge before closing a PR that you assume has no log. It's a very easy commit to add a log to a PR. We should be allowed to show proof before you close these.

valerio-oai · 2026-03-25T16:21:27Z

Hi @newjordan, I will be less heavy-handed with directly closing PRs in future (I thought you could re-open them, my apologies). However, judging by the logs of your most recent "Podracing" run, the script you were using first used up all 600s of train time and only then did GPTQ calibration.

Definitionally, since you only have 600s of training time, the calibration must therefore be happening during eval time, which is disallowed, hence my closure.
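
The timing argument above can be sketched as a small budget check: lay the phases end to end, and anything that finishes after the 600s training budget counts as eval time. The 600s figure is from the thread. The phase names, the 30s calibration duration, and the helper function are hypothetical and only illustrate the reasoning. They are not taken from the actual logs or from the competition's scoring code.

```python
TRAIN_BUDGET_S = 600.0  # training wallclock budget stated in the thread


def classify_phases(timeline, budget_s=TRAIN_BUDGET_S):
    """Label each sequential phase 'train' if it ends within the budget, else 'eval'.

    `timeline` is an ordered list of (name, duration_seconds) pairs.
    """
    elapsed = 0.0
    labels = {}
    for name, duration in timeline:
        elapsed += duration
        labels[name] = "train" if elapsed <= budget_s else "eval"
    return labels


# Hypothetical durations: the training loop consumes the whole budget,
# so calibration that follows it lands outside the training window.
loop_first = classify_phases([("training_loop", 600.0), ("gptq_calibration", 30.0)])

# Hypothetical alternative: stop the loop early to leave room for calibration.
reserved = classify_phases([("training_loop", 570.0), ("gptq_calibration", 30.0)])

print(loop_first)
print(reserved)
```

In the first timeline the calibration phase is labeled as eval, which is the situation the reviewer describes. In the second, reserving part of the budget keeps calibration inside the training window.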

newjordan · 2026-03-25T16:35:10Z

Apologies for the intensity. This competition is very important to me right now and I am being dramatic. Ty. (I accidentally made my first PR private and it pulled all my early wins, so I'm just sore right now.) Ty very much.

notapplica mentioned this pull request Mar 24, 2026

⛳ Parameter Golf Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes #140

Open

newjordan changed the title ~~Three Breadsticks: 1.1190 BPB~~ Three Breadsticks: 1.1190 BPB (3-seed mean 1.1195) Mar 24, 2026

newjordan mentioned this pull request Mar 25, 2026

Podracing: 1.0461 BPB (3-seed mean) #674

Closed

valerio-oai closed this Mar 25, 2026

valerio-oai mentioned this pull request Mar 25, 2026

Illegal submissions megathread #677

Open

newjordan mentioned this pull request Mar 25, 2026

Podracing: 1.0461 BPB (3-seed mean) — 5-gram eval + LeakyReLU² #706

Open

newjordan mentioned this pull request Mar 25, 2026

Podracing II: Electric Bugaloo — 0.9625 BPB (3-seed mean, all sub-0.964) #753

Open

newjordan wants to merge 1 commit into openai:main from newjordan:submission/three-breadsticks
