Record: Doc-Isolated TTT + Eval Optimizations by vivekvar-dl · Pull Request #964 · openai/parameter-golf

vivekvar-dl · 2026-03-27T16:29:46Z

Summary

Built on PR #549 (LeakyReLU² + Legal TTT + Parallel Muon, 1.1194 BPB).

Document-Isolated TTT: Reset the TTT optimizer state at BOS document boundaries to prevent cross-document contamination. PR #461 (Non-record: 11L Depth Recurrence + High-Yield Legal TTT, 1.14458 BPB) showed a 0.011 BPB reduction from doc isolation alone; it has not yet been applied to the frontier architecture.
Temperature scaling: Grid search over T = 0.90 to 1.00 on the quantized model at eval time.
Base architecture unchanged: 11L/512d, LeakyReLU(0.5)², XSA4, Parallel Muon, GPTQ-lite int6+LZMA.
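
A minimal sketch of the document-isolation idea, not the code in this PR: split the eval token stream at BOS tokens and clear the optimizer state before each document. `BOS_ID`, `MomentumState`, `score_fn`, and `adapt_fn` are hypothetical names, and the score-then-adapt ordering inside each document is an assumption about how the "legal" TTT loop is arranged.

```python
from typing import Callable, Iterator, List, Tuple

BOS_ID = 1  # hypothetical BOS token id; the real value depends on the tokenizer


def document_spans(tokens: List[int], bos_id: int = BOS_ID) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) index pairs, one per document, where each BOS opens a document."""
    starts = [i for i, t in enumerate(tokens) if t == bos_id]
    if not starts or starts[0] != 0:
        starts = [0] + starts  # tokens before the first BOS form their own span
    for s, e in zip(starts, starts[1:] + [len(tokens)]):
        if e > s:
            yield s, e


class MomentumState:
    """Toy stand-in for optimizer state (e.g. momentum buffers)."""

    def __init__(self, n_params: int) -> None:
        self.n_params = n_params
        self.reset()

    def reset(self) -> None:
        self.velocity = [0.0] * self.n_params
        self.steps = 0


def doc_isolated_ttt(
    tokens: List[int],
    chunk_len: int,
    score_fn: Callable[[List[int]], float],
    adapt_fn: Callable[[List[int], MomentumState], None],
    state: MomentumState,
    bos_id: int = BOS_ID,
) -> float:
    """Sum chunk losses while resetting optimizer state at every document boundary."""
    total = 0.0
    for start, end in document_spans(tokens, bos_id):
        state.reset()  # drop state accumulated on the previous document
        for lo in range(start, end, chunk_len):
            chunk = tokens[lo:min(lo + chunk_len, end)]
            total += score_fn(chunk)  # assumption: score before adapting on the chunk
            adapt_fn(chunk, state)    # then take a TTT step on already-scored tokens
    return total
```

Chunks never straddle a BOS in this sketch, so no update computed on one document influences the optimizer state used for the next.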

Status

Work in progress. Requesting compute credits for 8xH100 validation runs.

Dev validation (1xH100 NVL)

Base architecture reproduces correctly (1.39 BPB at 920 steps, consistent with 1xH100 scaling)
Tested and rejected: sp4096 vocab (at convergence the higher per-token loss outweighs the tokens_per_byte gain), NorMuon, ProRes
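
The vocab trade-off above follows from how BPB combines the two quantities: per-token loss converted to bits, multiplied by tokens per byte. The sketch below illustrates that arithmetic; the function name and the numbers in the usage lines are made up for illustration and are not measurements from this PR.

```python
import math


def bits_per_byte(nats_per_token: float, tokens_per_byte: float) -> float:
    """Convert mean per-token cross-entropy (nats) to bits per byte."""
    return nats_per_token / math.log(2) * tokens_per_byte


# Hypothetical numbers only: a larger vocab emits fewer tokens per byte,
# but each token is harder to predict, so BPB can still end up worse.
small_vocab = bits_per_byte(nats_per_token=2.00, tokens_per_byte=0.40)
large_vocab = bits_per_byte(nats_per_token=2.40, tokens_per_byte=0.35)
print(f"small vocab: {small_vocab:.4f} BPB, large vocab: {large_vocab:.4f} BPB")
```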

Target

1.09-1.11 BPB (pending 8xH100 validation)

Test plan

3-seed validation on 8xH100 SXM
Statistical significance (p < 0.01 for 0.005-nat improvement)
Verify artifact under 16MB
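
One way the significance criterion in the test plan could be checked with exactly three seeds is a one-sided, one-sample t-test on the per-seed gain over the baseline minus the 0.005-nat margin. This is a sketch under that assumed reading of the criterion; the function name and inputs are hypothetical. With three samples the t distribution has 2 degrees of freedom, whose CDF has the closed form F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)), so no statistics library is needed.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def one_sided_p_three_seeds(
    baseline_loss: float, seed_losses: Sequence[float], margin: float = 0.005
) -> float:
    """p-value for H0: mean improvement over baseline <= margin (losses in nats)."""
    if len(seed_losses) != 3:
        raise ValueError("closed form below is only valid for exactly 3 seeds (df = 2)")
    gains = [baseline_loss - loss - margin for loss in seed_losses]
    std_err = stdev(gains) / math.sqrt(3)
    if std_err == 0.0:
        return 0.0 if mean(gains) > 0 else 1.0
    t = mean(gains) / std_err
    # Upper tail of Student's t with 2 degrees of freedom.
    return 0.5 - t / (2.0 * math.sqrt(t * t + 2.0))
```

A result would pass under this reading when the returned value is below 0.01; with only 2 degrees of freedom that requires a t statistic of roughly 7 or more, so seed-to-seed variance matters a great deal.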

Built on PR openai#549 stack. Adds document-isolated TTT (reset optimizer at BOS boundaries) and temperature scaling. Pending 8xH100 validation.
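
The temperature-scaling step can be sketched as a small grid search: divide the logits by T, compute the mean negative log-likelihood, and keep the T with the lowest value. This is an illustrative stdlib-only version, not the PR's implementation; the function names, the 11-point grid, and which tokens the search is run on are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def nll_at_temperature(
    logits: Sequence[Sequence[float]], targets: Sequence[int], temperature: float
) -> float:
    """Mean negative log-likelihood (nats) of targets under softmax(logits / T)."""
    total = 0.0
    for row, target in zip(logits, targets):
        scaled = [z / temperature for z in row]
        m = max(scaled)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
        total += log_z - scaled[target]
    return total / len(targets)


def grid_search_temperature(
    logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    lo: float = 0.90,
    hi: float = 1.00,
    steps: int = 11,
) -> Tuple[float, float]:
    """Return (best_T, best_nll) over an evenly spaced grid on [lo, hi]."""
    grid: List[float] = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    best_nll, best_t = min((nll_at_temperature(logits, targets, t), t) for t in grid)
    return best_t, best_nll
```

T below 1 sharpens the distribution, which helps only when the model is underconfident on the tokens it gets right; the grid's upper end at T = 1.00 recovers the unscaled model.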

Record: Doc-Isolated TTT + Eval Optimizations (WIP)

47c8b7f


vivekvar-dl wants to merge 1 commit into openai:main from vivekvar-dl:submission/doc-isolated-ttt
