Commit 943495b
exp: pre-quant AdamW TTT probe on openai#1019 base
- AdamW TTT adapts full-precision EMA weights before GPTQ
- Score-first approach (inference_mode then train) for compliance
- Hyperparams: lr=0.0005, epochs=3, chunk=32768, cosine decay
- 3-stage timing: TTT / AR self-gen+GPTQ / final eval
- Uses _HessianGPT (non-banked) for TTT, rebanks for AR self-gen
- Kill criteria: seed=1337 must reach <= 1.1156 BPP
1 parent 50390d6 commit 943495b
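
The score-first ordering described above can be sketched as follows. This is a minimal illustration, not the code in this commit: the function name `score_first_ttt`, the toy model in the usage note, and the choice to run `epochs` optimizer steps per chunk with one cosine schedule spanning all steps are assumptions. It omits `_HessianGPT`, EMA weights, rebanking, and GPTQ entirely, and only shows that each chunk is scored under `torch.inference_mode()` before any gradient step uses it.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def score_first_ttt(model, tokens, chunk=32768, epochs=3, lr=5e-4):
    """Score each chunk, then adapt on it. Returns per-chunk mean loss (nats/token).

    Hypothetical helper: `tokens` is a 1-D LongTensor and `model` maps
    a (T,) token tensor to (T, vocab) logits.
    """
    n = tokens.numel()
    starts = list(range(0, n - 1, chunk))
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    # Assumption: a single cosine decay over every optimizer step in the run.
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(
        opt, T_max=len(starts) * epochs
    )
    scores = []
    for s in starts:
        end = min(s + chunk, n - 1)
        x, y = tokens[s:end], tokens[s + 1:end + 1]

        # 1) Score first: the reported loss never sees weights trained on this chunk.
        model.eval()
        with torch.inference_mode():
            scores.append(F.cross_entropy(model(x), y).item())

        # 2) Then train on the chunk that was just scored.
        model.train()
        for _ in range(epochs):
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            sched.step()
    return scores
```

For example, with a toy embedding model: `score_first_ttt(nn.Sequential(nn.Embedding(16, 8), nn.Linear(8, 16)), torch.randint(0, 16, (65,)), chunk=32, epochs=2)` returns two per-chunk losses, the first of which equals the loss of the unadapted model on the first chunk.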
1 file changed
Lines changed: 1711 additions & 367 deletions