
non-record: LR warmdown on 1x A40 (1.723 bpb, 8.40MB) #313

Open

my-sonicase wants to merge 1 commit into openai:main from my-sonicase:submit-lr-warmdown-a40

Conversation

@my-sonicase

This PR adds a non-record submission under track_10min_16mb.

Summary:
This submission improves over a local MLX baseline (~1.87 bpb) by roughly 0.15 bpb, demonstrating that schedule tuning alone yields a meaningful gain under the 16MB constraint.

  • baseline architecture
  • no tokenizer or dataset changes
  • schedule tuning only:
    • WARMDOWN_ITERS=3600
    • MATRIX_LR=0.06
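The two knobs above describe a linear LR warmdown: hold the matrix LR at its peak, then decay it to zero over the final `WARMDOWN_ITERS` steps. A minimal sketch of that shape, assuming a decay-to-zero linear ramp (the repo's actual scheduler may differ; `lr_at` and `total_iters` are illustrative names, not code from this PR):

```python
# Values from this submission; the schedule shape is an assumed
# linear decay-to-zero warmdown, not code taken from the repo.
WARMDOWN_ITERS = 3600   # steps over which LR decays to zero
MATRIX_LR = 0.06        # peak LR for matrix parameters

def lr_at(step: int, total_iters: int) -> float:
    """Matrix LR at a given step under a linear warmdown."""
    warmdown_start = total_iters - WARMDOWN_ITERS
    if step < warmdown_start:
        return MATRIX_LR  # constant phase before the warmdown
    # Linear decay from MATRIX_LR down to 0 over the warmdown window.
    frac = (total_iters - step) / WARMDOWN_ITERS
    return MATRIX_LR * frac
```

In practice the warmdown length trades off against the constant-LR phase: a longer warmdown smooths the final loss at the cost of fewer high-LR steps.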

Result:

  • final int8+zlib roundtrip val_bpb: 1.7232
  • total submission size int8+zlib: 8,397,395 bytes
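The reported size is the weights after int8 quantization and zlib compression. A rough sketch of how such a byte count could be computed, assuming per-tensor absmax scaling over flat float lists (`int8_zlib_size` is a hypothetical helper, not the track's actual scoring code):

```python
import zlib

def int8_zlib_size(tensors) -> int:
    """Total zlib-compressed byte count of int8-quantized tensors.

    Each tensor is a flat list of floats; quantization here is a
    simple per-tensor absmax scale to [-127, 127] (an assumption,
    not necessarily the scheme the track uses).
    """
    total = 0
    for t in tensors:
        absmax = max((abs(x) for x in t), default=0.0) or 1.0
        q = bytes((round(x / absmax * 127) & 0xFF) for x in t)
        total += len(zlib.compress(q))
    return total
```

The "roundtrip" in the val_bpb figure means the model is evaluated after quantize → compress → decompress, so the reported bpb reflects the submitted artifact rather than the full-precision weights.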

Hardware:

  • 1x A40
  • 600s wallclock-capped run

This is a reproducible non-record submission demonstrating a simple improvement from training schedule tuning under the 16MB constraint.

