Commit 5135299

Path A: PR openai#1019 base + legal score-first TTT, no SLOT — pending H100 validation
1 parent e3e7c7a commit 5135299

3 files changed

Lines changed: 2358 additions & 0 deletions

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# PR #1019 Base + Legal TTT (Score-First SGD)

## Architecture
Built on PR #1019 (AR Self-Gen Full Hessian GPTQ + XSA-all 11L + BigramHash 3072x112).
Added legal score-first TTT (SGD, lr=0.002, 3 epochs, chunk_tokens=32768, freeze_blocks=0).
No SLOT. No n-gram caches.

## TTT Legality
Score-first: each chunk is evaluated under `torch.inference_mode()` before any weight update.
Chunk N is scored first; the model then adapts on chunk N, so the update can only affect chunks N+1 onward.
No multi-epoch training on a chunk before it is scored. No future token access.

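The score-then-adapt ordering described above can be sketched in a few lines. This is a minimal, framework-free illustration and not the code in `train_gpt.py`: `split_into_chunks`, `score_first_ttt`, and `ToyModel` are hypothetical names, and the one-parameter toy model stands in for the real network so the control flow can be checked without a GPU. In the actual run, the scoring step would sit under `torch.inference_mode()`, as the README states.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_chunks(tokens: Sequence[float], chunk_tokens: int) -> List[Sequence[float]]:
    """Split a token stream into consecutive, non-overlapping chunks."""
    return [tokens[i:i + chunk_tokens] for i in range(0, len(tokens), chunk_tokens)]


def score_first_ttt(
    tokens: Sequence[float],
    chunk_tokens: int,
    score_fn: Callable[[Sequence[float]], float],
    adapt_fn: Callable[[Sequence[float]], None],
    epochs: int,
) -> List[float]:
    """Return per-chunk losses; every chunk is scored before it is trained on."""
    losses = []
    for chunk in split_into_chunks(tokens, chunk_tokens):
        # 1) Score with weights that have never been updated on this chunk.
        losses.append(score_fn(chunk))
        # 2) Only afterwards adapt on the already-scored chunk.
        for _ in range(epochs):
            adapt_fn(chunk)
    return losses


class ToyModel:
    """Hypothetical stand-in model: one scalar weight, mean-squared-error loss."""

    def __init__(self, lr: float) -> None:
        self.w = 0.0
        self.lr = lr
        self.events: List[Tuple[str, int]] = []  # records the order of operations
        self._chunk_index = 0

    def score(self, chunk: Sequence[float]) -> float:
        self.events.append(("score", self._chunk_index))
        return sum((x - self.w) ** 2 for x in chunk) / len(chunk)

    def adapt(self, chunk: Sequence[float]) -> None:
        if not self.events or self.events[-1] != ("adapt", self._chunk_index):
            self.events.append(("adapt", self._chunk_index))
        grad = sum(2.0 * (self.w - x) for x in chunk) / len(chunk)
        self.w -= self.lr * grad  # plain SGD step (no momentum in this sketch)

    def next_chunk(self) -> None:
        self._chunk_index += 1
```

Because `score_fn` always runs before `adapt_fn` for a given chunk, the loss recorded for chunk N depends only on updates from chunks before N, which is the property the legality argument relies on.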
## Results (pending)
| Seed | val_bpb | val_loss | eval_time |
|------|---------|----------|-----------|
| 314  | TBD     | TBD      | TBD       |
| 42   | TBD     | TBD      | TBD       |
| 999  | TBD     | TBD      | TBD       |
| mean | TBD     | TBD      | TBD       |

## Run Command
```bash
TTT_ENABLED=1 TTT_LR=0.002 TTT_EPOCHS=3 TTT_CHUNK_TOKENS=32768 \
TTT_FREEZE_BLOCKS=0 TTT_MOMENTUM=0.9 TTT_BATCH_SEQS=32 TTT_GRAD_CLIP=1.0 \
SEED=314 \
DATA_PATH=./data/datasets/fineweb10B_sp1024/ \
TOKENIZER_PATH=./data/tokenizers/fineweb_1024_bpe.model \
VOCAB_SIZE=1024 \
torchrun --standalone --nproc_per_node=8 \
records/track_10min_16mb/2026-04-01_PR1019_TTT_Clean/train_gpt.py \
2>&1 | tee seed314_ttt.log
```
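
The `TTT_*` variables in the command above are passed through the environment. How `train_gpt.py` parses them is not visible in this commit view, so the following is only a sketch of one way such settings could be read: `TTTConfig` and `load_ttt_config` are hypothetical names, and the fallback defaults simply mirror the values in the run command rather than any confirmed defaults of the script.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TTTConfig:
    """Hypothetical container for the TTT_* settings used in the run command."""
    enabled: bool
    lr: float
    epochs: int
    chunk_tokens: int
    freeze_blocks: int
    momentum: float
    batch_seqs: int
    grad_clip: float


def load_ttt_config(env: Mapping[str, str] = os.environ) -> TTTConfig:
    """Read TTT settings from an environment-like mapping.

    Defaults are assumptions copied from the README's run command.
    """
    return TTTConfig(
        enabled=env.get("TTT_ENABLED", "0") == "1",
        lr=float(env.get("TTT_LR", "0.002")),
        epochs=int(env.get("TTT_EPOCHS", "3")),
        chunk_tokens=int(env.get("TTT_CHUNK_TOKENS", "32768")),
        freeze_blocks=int(env.get("TTT_FREEZE_BLOCKS", "0")),
        momentum=float(env.get("TTT_MOMENTUM", "0.9")),
        batch_seqs=int(env.get("TTT_BATCH_SEQS", "32")),
        grad_clip=float(env.get("TTT_GRAD_CLIP", "1.0")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to check without touching the real environment.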
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
{
  "val_bpb": 0.0,
  "val_loss": 0.0,
  "artifact_bytes": 0,
  "seeds": [314, 42, 999],
  "seed_bpbs": [0.0, 0.0, 0.0],
  "eval_stride": 64,
  "eval_times_ms": [0, 0, 0],
  "status": "pending_h100_run"
}
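
Once the three seeds have run, the placeholder zeros above need to be replaced. The sketch below shows one way to assemble that JSON from per-seed measurements. It assumes, without confirmation from the commit, that `val_bpb` and `val_loss` are the arithmetic means over seeds (matching the "mean" row of the README table). `summarize_runs`, its `seed_losses` argument, and the caller-supplied `status` string are hypothetical.

```python
import json
from statistics import mean
from typing import Dict, Sequence


def summarize_runs(
    seeds: Sequence[int],
    seed_bpbs: Sequence[float],
    seed_losses: Sequence[float],
    eval_times_ms: Sequence[int],
    artifact_bytes: int,
    status: str,
    eval_stride: int = 64,
) -> Dict[str, object]:
    """Build a submission-style summary dict from per-seed results."""
    n = len(seeds)
    if not (len(seed_bpbs) == len(seed_losses) == len(eval_times_ms) == n) or n == 0:
        raise ValueError("expected one bpb, loss, and eval time per seed")
    return {
        "val_bpb": mean(seed_bpbs),      # assumption: reported value is the seed mean
        "val_loss": mean(seed_losses),   # assumption: reported value is the seed mean
        "artifact_bytes": artifact_bytes,
        "seeds": list(seeds),
        "seed_bpbs": list(seed_bpbs),
        "eval_stride": eval_stride,
        "eval_times_ms": list(eval_times_ms),
        "status": status,
    }


def to_json(summary: Dict[str, object]) -> str:
    """Serialize the summary with the same two-space layout as the file above."""
    return json.dumps(summary, indent=2)
```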
