Non-record: Turbo-Muon + EngramLite(10240) + VE(8,9,10) — val_bpb 1.1431 #1205
SergheiBrinza wants to merge 2 commits into openai:main
… 1.1431

Based on the PR openai#1089 stack with hyperparameter tuning:
- Higher LR (0.030 vs 0.025) for faster convergence
- Wider EngramLite (10240x48 vs 8192x32)
- VE on layers 8, 9, 10 (vs 9, 10)
- Warmdown 4500 (vs 3500)
- Muon momentum warmup 1000 steps (vs 1500)

3-seed mean: 1.1431 (std 0.0007). Seeds: 1337=1.1425, 42=1.1438, 2024=1.1431
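For reference, the reported mean and standard deviation can be reproduced from the per-seed numbers above (a quick check, not part of the submission):

```python
import statistics

# Per-seed val_bpb reported above.
seed_results = {1337: 1.1425, 42: 1.1438, 2024: 1.1431}

values = list(seed_results.values())
print(f"3-seed mean: {statistics.mean(values):.4f} "
      f"(std {statistics.stdev(values):.4f})")  # sample std -> ~0.0007
```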
Summary
Non-record submission based on the PR #1089 Turbo-Muon + EngramLite stack with hyperparameter tuning.
val_bpb: 1.1431 (3-seed mean, std 0.0007)
Changes from PR #1089
- Higher LR: 0.030 (vs 0.025) for faster convergence
- Wider EngramLite: 10240x48 (vs 8192x32)
- VE on layers 8, 9, 10 (vs 9, 10)
- Warmdown: 4500 steps (vs 3500)
- Muon momentum warmup: 1000 steps (vs 1500)
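As a sketch only, these overrides expressed as a hypothetical config dataclass (the field names below are illustrative assumptions, not the repo's actual flags):

```python
from dataclasses import dataclass

@dataclass
class TuningOverrides:
    # Values are this PR's settings; comments show the PR #1089 baselines.
    learning_rate: float = 0.030            # vs 0.025
    engram_shape: tuple = (10240, 48)       # EngramLite size, vs (8192, 32)
    ve_layers: tuple = (8, 9, 10)           # vs (9, 10)
    warmdown_steps: int = 4500              # vs 3500
    muon_momentum_warmup_steps: int = 1000  # vs 1500

overrides = TuningOverrides()
```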
Key Finding
The increased model size (~31.6M vs 30.7M params) pushed the artifact to 16.36MB pre-compression, forcing all 66 weight groups into int5 with 0 promotions to int6/int7 and 20.5% selective pruning. This aggressive quantization likely offset the architectural gains. The 16MB budget is extremely tight — even small parameter increases can cascade into significant quality loss through the quantization pipeline.
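Rough arithmetic behind the budget pressure, assuming a plain bits-per-weight packing (illustrative only; the real pipeline adds per-group scales, promotion to int6/int7, and selective pruning):

```python
BUDGET_MB = 16.0

def packed_size_mb(n_params: float, bits_per_weight: int) -> float:
    """Approximate packed size in MB, ignoring headers and scales."""
    return n_params * bits_per_weight / 8 / 1e6

for n_params in (30.7e6, 31.6e6):
    for bits in (5, 6):
        size = packed_size_mb(n_params, bits)
        status = "fits" if size <= BUDGET_MB else "over budget"
        print(f"{n_params/1e6:.1f}M params @ int{bits}: {size:.2f} MB ({status})")

# Both sizes already exceed 16 MB at raw int5 (~19.2 MB vs ~19.8 MB), so the
# extra ~0.9M params translates into more pruning and no headroom for
# promoting groups to int6/int7.
```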
Hardware
8xH100 80GB SXM, 600s training, ~5550 steps at 106ms/step.
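The step count is roughly the time budget divided by the step time; the shortfall against the ideal 600 / 0.106 ≈ 5660 steps is presumably non-training overhead (an assumption):

```python
TIME_BUDGET_S = 600
STEP_TIME_S = 0.106
REPORTED_STEPS = 5550

ideal_steps = TIME_BUDGET_S / STEP_TIME_S                     # ~5660
overhead_s = TIME_BUDGET_S - REPORTED_STEPS * STEP_TIME_S     # ~12 s

print(f"ideal: {ideal_steps:.0f} steps, reported: ~{REPORTED_STEPS}, "
      f"implied non-step time: ~{overhead_s:.0f} s")
```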