Three Breadsticks: 1.1190 BPB (3-seed mean 1.1195) #656
newjordan wants to merge 1 commit into openai:main
11L/512d U-Net with leaky_relu_sq (slope 0.5), XSA last 4,
bigram 1536, legal score-first TTT (freeze_blocks=0, grad_clip=0.8).
3-seed results:
seed 1337: 1.1195 post-TTT (15.90MB)
seed 42: 1.1200 post-TTT (15.61MB)
seed 2045: 1.1190 post-TTT (15.81MB)
mean: 1.1195
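
As a quick arithmetic check on the figures above, this minimal sketch recomputes the 3-seed mean and picks out the best seed. The per-seed values are copied from the list; the variable names are illustrative and do not come from train_gpt.py.

```python
from statistics import mean

# Post-TTT BPB per seed, copied from the 3-seed results above.
post_ttt_bpb = {1337: 1.1195, 42: 1.1200, 2045: 1.1190}

# Lower BPB is better, so the best seed is the one with the minimum value.
best_seed = min(post_ttt_bpb, key=post_ttt_bpb.get)
mean_bpb = mean(post_ttt_bpb.values())
spread = max(post_ttt_bpb.values()) - min(post_ttt_bpb.values())

print(f"best seed: {best_seed} ({post_ttt_bpb[best_seed]:.4f})")
print(f"mean:      {mean_bpb:.4f}")
print(f"spread:    {spread:.4f}")
```

The headline 1.1190 is the best single seed (2045), while 1.1195 is the mean across all three.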
Run: SEED=2045 MLP_ACT=leaky_relu_sq MLP_LEAKY_SLOPE=0.5 \
XSA_LAST_N=4 BIGRAM_VOCAB_SIZE=1536 \
TTT_FREEZE_BLOCKS=0 TTT_GRAD_CLIP=0.8 \
torchrun --nproc_per_node=8 train_gpt.py
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This submission does not include a submission.json or train logs, so I can't verify it enough to score it. Additionally, from the training code it looks like it applies GPTQ with training data calibration at eval time, which is disallowed. Closing for now.
The GPTQ concern is wrong: GPTQ runs during training, not eval. But we need to address it clearly. Valerio, you really went pretty hard on closing the PR as opposed to letting us submit the logs. I would appreciate a bit more nuance, since you operate as a judge, before closing a PR on the assumption that it has no log. It's a very easy commit to add a log to a PR. We should be allowed to show proof before you close these.
Hi @newjordan, I will be less heavy-handed with directly closing PRs in future (I thought you could re-open them, my apologies). However, given the logs of your most recent "Podracing" run, the script you were using looks like it first used up all 600s of train time and then did GPTQ calibration. By definition, since you only have 600s of training time, the calibration must therefore be happening during eval time, which is disallowed, hence my closure.
Apologies for the intensity. This competition is very important to me right now and I am being dramatic. (I accidentally made my first PR private and it pulled all my early wins, so I'm just sore right now.) Ty very much.
Results
Progression
*single seed
Architecture
11L/512d U-Net, 26.93M params. GPTQ int6+zstd, legal score-first TTT.
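
The page does not define `leaky_relu_sq`. Going only by the name and the `MLP_LEAKY_SLOPE=0.5` setting, one plausible reading is a leaky ReLU followed by squaring. The sketch below illustrates that reading on scalars in plain Python; it is an assumption, not the code in train_gpt.py, and the function names are hypothetical.

```python
def leaky_relu(x: float, slope: float = 0.5) -> float:
    """Standard leaky ReLU: identity for x >= 0, scaled by `slope` otherwise."""
    return x if x >= 0.0 else slope * x


def leaky_relu_sq(x: float, slope: float = 0.5) -> float:
    """Assumed form of leaky_relu_sq: the square of the leaky ReLU output."""
    y = leaky_relu(x, slope)
    return y * y


for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  leaky_relu_sq={leaky_relu_sq(x):.2f}")
```

Under this reading, negative inputs still produce a small positive output (for example, -2.0 maps to 1.0 at slope 0.5), unlike a plain squared ReLU, which would zero them out.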
Reproduce
8xH100 SXM, 600s wallclock, ~6,900 steps.
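
For a rough sense of throughput, the figures above imply an average step time. This back-of-the-envelope sketch assumes the full 600s goes to training steps and that the step count is about 6,900; both numbers are taken from the line above and the constant names are illustrative.

```python
WALLCLOCK_S = 600      # training wallclock budget, from the line above
APPROX_STEPS = 6_900   # approximate step count, from the line above

# Average time per optimizer step, in milliseconds.
ms_per_step = 1000 * WALLCLOCK_S / APPROX_STEPS
print(f"about {ms_per_step:.1f} ms per step")
```

Any work that runs after the 600s budget is spent, such as calibration, is not covered by this estimate.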