Commit 943495b

exp: pre-quant AdamW TTT probe on openai#1019 base
- AdamW TTT adapts full-precision EMA weights before GPTQ
- Score-first approach (inference_mode then train) for compliance
- Hyperparams: lr=0.0005, epochs=3, chunk=32768, cosine decay
- 3-stage timing: TTT / AR self-gen+GPTQ / final eval
- Uses _HessianGPT (non-banked) for TTT, rebanks for AR self-gen
- Kill criteria: seed=1337 must reach <= 1.1156 BPP
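
The score-first ordering and cosine decay listed above could be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: the helper names (`cosine_lr`, `chunk_spans`, `score_first_ttt`), the `score_fn`/`train_fn` callbacks, running all epochs per chunk before moving on, and decaying the learning rate to zero over the whole run are all hypothetical. In the real probe, scoring would run under `torch.inference_mode()` and training would use an AdamW step; here both are plain callbacks so that only the control flow is shown.

```python
import math


def cosine_lr(step, total_steps, base_lr=0.0005, min_lr=0.0):
    """Cosine decay from base_lr at step 0 to min_lr at the final step."""
    if total_steps <= 1:
        return base_lr
    progress = step / (total_steps - 1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def chunk_spans(num_tokens, chunk=32768):
    """Split [0, num_tokens) into consecutive (start, end) spans of at most `chunk` tokens."""
    return [(s, min(s + chunk, num_tokens)) for s in range(0, num_tokens, chunk)]


def score_first_ttt(num_tokens, score_fn, train_fn, chunk=32768, epochs=3, base_lr=0.0005):
    """Score each chunk before training on it, so no chunk is scored
    by weights that have already been adapted to that chunk.

    score_fn(start, end) -> loss   (hypothetical; would run in inference mode)
    train_fn(start, end, lr)       (hypothetical; would take one AdamW pass)
    """
    spans = chunk_spans(num_tokens, chunk)
    total_steps = len(spans) * epochs  # assumption: one schedule over the whole run
    total_loss, step, log = 0.0, 0, []
    for start, end in spans:
        # 1) Score first: the weights have not yet seen this chunk.
        total_loss += score_fn(start, end)
        log.append(("score", start, end))
        # 2) Then adapt on the same chunk for `epochs` passes.
        for _ in range(epochs):
            train_fn(start, end, cosine_lr(step, total_steps, base_lr))
            step += 1
        log.append(("train", start, end))
    return total_loss, log
```

With this ordering, the reported loss for a chunk only ever reflects adaptation on earlier chunks, which is the property the "score-first" bullet appears to be after.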
1 parent 50390d6 commit 943495b

1 file changed

Lines changed: 1711 additions & 367 deletions

0 commit comments
