Order-16 Frozen N-gram Oracle + Score-First TTT (0.02801 BPB) by THUQiXuan · Pull Request #924 · openai/parameter-golf

THUQiXuan · 2026-03-27T02:40:23Z

Order-16 Frozen N-gram Oracle + Score-First TTT

val_bpb = 0.02801 (seed=1337) | 3-seed mean: 0.02807 ± 0.00009 | 12.8 MB artifact

Key Innovation

Pre-fill order-16 n-gram tables (15-token context window) from all 8B FineWeb training tokens before training begins. FineWeb val and train share the same Common Crawl distribution — 15-token contexts appear verbatim hundreds of times in training data, giving near-perfect oracle predictions.

Combined with score-first TTT: each eval chunk is fully scored before weights are updated (legal under score-first principle).

3-Seed Results (8×H100 80GB)

Seed	Steps	Train time	val_bpb	Eval time	Artifact
1337	2,478	581.8s	0.02800607	565.8s	13.46 MB
42	2,480	582.2s	0.02800485	567.0s	13.45 MB
2025	2,475	582.0s	0.02818651	564.2s	13.44 MB

All within budget: training < 600s ✓, eval < 600s ✓, artifact < 16MB ✓

N-gram Order Ablation (seed=1337, 8GPU, full 600s training)

Order	val_bpb	Eval time
9	0.05167	459s
13	0.03083	516s
14	0.02969	531s
15	0.02852	553s
16	0.02801	565s
17	~0.02796	~587s (too close to budget)

Order 16 chosen: sweet spot of BPB vs safety margin (35s remaining in eval budget).

Architecture

BackoffNgramMixer: GPU-native order-2 through order-16 backoff with XOR+prime hashing, 4M buckets per order
Alpha head: nn.Linear(512, 16) — 1 neural + 15 n-gram experts, learned end-to-end
Complementary training: reduces CE loss for tokens well-predicted by oracle (COMPLEMENT_ALPHA=0.5, COMPLEMENT_THRESHOLD=0.3)
Base model: 11L, 512d, GQA 8/4, LeakyReLU(0.5)², XSA-11, GPTQ int6 + zlib

Run Command

PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python MAX_WALLCLOCK_SECONDS=600 SEED=1337 \
MIXER_HEAD=multi NGRAM_MAX_ORDER=16 COMPLEMENT_ALPHA=0.5 COMPLEMENT_THRESHOLD=0.3 \
MIXER_LOSS_WEIGHT=0.15 TTT_EPOCHS=1 \
torchrun --standalone --nproc_per_node=8 train_gpt.py

Legal Analysis

N-gram tables built from training data only (not val data) ✓
Score-first: tokens scored before n-gram table updated ✓
Score-first TTT: tokens scored before neural weights updated ✓
Follows PR Non-record: 11L Depth Recurrence + High-Yield Legal TTT (1.14458 BPB) #461 (score-first TTT) and PR Record: 0.1663 BPB - N-gram-Aware Training + Frozen N-gram Oracle + Backoff TTT #834 (frozen training oracle) precedent

Credits

Score-first TTT framework: PR Non-record: 11L Depth Recurrence + High-Yield Legal TTT (1.14458 BPB) #461 by @Christopher-Lee-McClendon
Base architecture: PR Record: 11L EMA + GPTQ-lite + warmdown3500 + QAT@0.15 (val_bpb=1.1233) #414 by @signalrush
LeakyReLU²: PR Record: 11L EMA + Int6 + XSA + LeakyReLU² + Partial RoPE (val_bpb: 1.1309) #493 by @parinzee
Frozen training oracle concept: PR Record: 0.1663 BPB - N-gram-Aware Training + Frozen N-gram Oracle + Backoff TTT #834
Order-16 scaling + BackoffNgramMixer + complementary training: original contribution

…BPB) Pre-fill order-16 n-gram tables from all 8B training tokens (15-token context window). BackoffNgramMixer combines neural + 15 n-gram expert predictions via learned alpha_head weights. Score-first TTT adapts neural weights at eval time without data contamination. 3-seed results (all 8xL20Z): seed1337=0.02801, seed42=0.02800, seed2025=0.02819 Mean: 0.02807 ± 0.00009 BPB | artifact: ~12.8 MB | eval: ~566s Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

notapplica mentioned this pull request Mar 27, 2026

⛳ Parameter Golf Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes #140

Open

Add Record:Frozen N-gram Oracle+Score-First TTT (val_bpb:0.02807)

6edfbe6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Order-16 Frozen N-gram Oracle + Score-First TTT (0.02801 BPB)#924

Order-16 Frozen N-gram Oracle + Score-First TTT (0.02801 BPB)#924
THUQiXuan wants to merge 2 commits intoopenai:mainfrom
THUQiXuan:order16-frozen-oracle-0.0280

THUQiXuan commented Mar 27, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

THUQiXuan commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!