@@ -0,0 +1,27 @@
# NonTTT VR GA MixedInt5Int6

**Mean val_bpb: 1.1428** (3 seeds: 1337, 42, 2025)

## Key Techniques

1. **Non-TTT route**: No test-time training; standard train + quantize + compress pipeline.
2. **Mixed Int5/Int6 quantization** with zstd-22 compression.
3. **GQA attention** (8 heads, 4 KV heads) with tied embeddings.
4. **Sliding window evaluation** (stride=64) for final scoring.
5. **EMA weights** applied before export.
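
The sliding-window evaluation in item 4 can be sketched as follows. This is a minimal illustration, not the submission's code: the function name `sliding_windows` and the `seq_len` parameter are hypothetical, and only `stride=64` comes from the list above. The idea is that each window re-reads earlier tokens as context but only the tokens not yet scored contribute to the loss, so every token is counted exactly once.

```python
from typing import Iterator, Tuple


def sliding_windows(
    n_tokens: int, seq_len: int, stride: int = 64
) -> Iterator[Tuple[int, int, int]]:
    """Yield (start, end, score_from) spans over a token stream.

    The model would see tokens [start, end) as one window, and only the
    positions in [score_from, end) would be added to the loss.
    seq_len is a hypothetical context length, not a value from the README.
    """
    if n_tokens <= 0 or seq_len <= 0 or not 0 < stride <= seq_len:
        raise ValueError("need n_tokens > 0 and 0 < stride <= seq_len")

    # The first window has no earlier context, so all of it is scored.
    end = min(seq_len, n_tokens)
    yield 0, end, 0

    # Each later window advances by `stride` and scores only the new tokens.
    while end < n_tokens:
        new_end = min(end + stride, n_tokens)
        start = max(0, new_end - seq_len)
        yield start, new_end, end
        end = new_end


if __name__ == "__main__":
    for start, end, score_from in sliding_windows(300, seq_len=128, stride=64):
        print(f"window [{start}, {end}) scores [{score_from}, {end})")
```

A smaller stride gives each scored token more context at the cost of more forward passes; with `stride == seq_len` this reduces to plain non-overlapping chunked evaluation.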

## Results

| Seed | val_loss | val_bpb | Steps | ms/step | Artifact Bytes |
|------|----------|---------|-------|---------|----------------|
| 1337 | 1.9296 | 1.14280 | 4622 | 129.82 | 16,026,184 |
| 42 | 1.9296 | 1.14281 | 4623 | 129.79 | 16,339,774 |
| 2025 | 1.9297 | 1.14287 | 4632 | 129.55 | 16,244,044 |
| **Mean** | **1.9296** | **1.14283** | | | |
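
The **Mean** row can be rechecked from the unrounded per-seed values in the submission metadata. The snippet below is a small sketch for that purpose; the helper name `seed_mean` is hypothetical, and the numbers are copied from the metadata's `seed_results` field.

```python
from statistics import mean

# Unrounded per-seed values, copied from the submission metadata.
seed_results = {
    "1337": {"val_loss": 1.92955753, "val_bpb": 1.14279568},
    "42": {"val_loss": 1.92958502, "val_bpb": 1.14281196},
    "2025": {"val_loss": 1.92968544, "val_bpb": 1.14287144},
}


def seed_mean(metric: str) -> float:
    """Arithmetic mean of one metric across all seeds."""
    return mean(result[metric] for result in seed_results.values())


if __name__ == "__main__":
    print(f"mean val_loss = {seed_mean('val_loss'):.8f}")
    print(f"mean val_bpb  = {seed_mean('val_bpb'):.8f}")
```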

## Notes

- All seeds trained for exactly 600s wallclock (8xH100 SXM).
- Peak memory: ~27 GiB allocated.
- **Warning**: All 3 seeds produce artifacts that exceed the 16MB limit, by about 26KB (seed 1337) to 340KB (seed 42).
- Eval method: `final_int6_sliding_window_exact`.
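
The size overage in the warning can be reproduced with a short check. This is a sketch under one stated assumption: the "16MB" limit is read as 16,000,000 bytes (decimal megabytes), which is the reading that matches the 26KB–340KB figures above. Under a 16 MiB (16,777,216-byte) reading the artifacts would fit. `LIMIT_BYTES` and `overage` are hypothetical names.

```python
# Assumption: "16MB" means 16,000,000 bytes (decimal), not 16 MiB.
LIMIT_BYTES = 16_000_000

# Artifact sizes per seed, copied from the results table.
artifact_bytes = {"1337": 16_026_184, "42": 16_339_774, "2025": 16_244_044}


def overage(size: int, limit: int = LIMIT_BYTES) -> int:
    """Bytes by which an artifact exceeds the limit (0 if it fits)."""
    return max(0, size - limit)


if __name__ == "__main__":
    for seed, size in artifact_bytes.items():
        status = "OVER" if overage(size) else "ok"
        print(f"seed {seed}: {size:,} bytes, over by {overage(size):,} ({status})")
```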
@@ -0,0 +1,21 @@
{
  "track": "10min_16mb",
  "date": "2026-03-23",
  "name": "NonTTT VR GA MixedInt5Int6",
  "author": "Asukabot0",
  "seed_results": {
    "1337": {"val_loss": 1.92955753, "val_bpb": 1.14279568, "steps": 4622, "ms_per_step": 129.82},
    "42": {"val_loss": 1.92958502, "val_bpb": 1.14281196, "steps": 4623, "ms_per_step": 129.79},
    "2025": {"val_loss": 1.92968544, "val_bpb": 1.14287144, "steps": 4632, "ms_per_step": 129.55}
  },
  "mean_val_loss": 1.92960933,
  "mean_val_bpb": 1.14282636,
  "artifact_bytes": {
    "1337": 16026184,
    "42": 16339774,
    "2025": 16244044
  },
  "code_bytes": 74850,
  "eval_method": "final_int6_sliding_window_exact",
  "notes": "All 3 seeds exceed 16MB limit. Mean artifact: 16,203,334 bytes."
}