35 changes: 35 additions & 0 deletions records/track_non_record_16mb/motif_sb1_rs2_g018/README.md.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
Tim Megna

This submission explores a minimal recurrent motif as a structural generator for sequence modeling. The motif was informed by an earlier geometric intuition: that large effective structures can be closed by a small affine walk rather than explicitly constructed.

In particular, the form |3n−2|{-2,5} is treated as an affine generator governing recurrence and reuse. This perspective motivated a compact shared-block architecture capable of producing extended structure through iteration rather than parameter expansion.

Two runs demonstrate that this motif provides stable path-like dynamics and effective closure under simple directional compositions (e.g., U/R/D/L walks), while remaining efficient in parameter usage.
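As a plain geometric illustration of what "closure" of a U/R/D/L walk means, the sketch below checks whether a sequence of unit steps returns to its starting point. It illustrates the term only. It is not the submission's evaluation code, and the names `STEPS`, `displacement`, and `is_closed` are hypothetical.

```python
# Unit steps on the integer grid for each direction label.
STEPS = {"U": (0, 1), "R": (1, 0), "D": (0, -1), "L": (-1, 0)}

def displacement(walk):
    """Net (x, y) offset after following every step in the walk."""
    x = y = 0
    for step in walk:
        dx, dy = STEPS[step]
        x += dx
        y += dy
    return (x, y)

def is_closed(walk):
    """A walk is closed when it ends where it started."""
    return displacement(walk) == (0, 0)

print(is_closed("URDL"))   # unit square
print(is_closed("UUR"))    # open path
```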

Result

A minimal recurrent motif:

shared_block_size = 1
recurrence_steps = 2
recurrence_gate_init = 0.18

achieves:

- 2.787 bpb
- improved compression relative to larger motif variants
- significantly reduced parameter and compute footprint

Final (int8+zlib roundtrip):
- val_loss: 4.7062
- val_bpb: 2.7873

Artifact (total submission size, including code):
- compressed (int8+zlib): 1.92 MB
- raw: 8.47 MB
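The reported loss and bpb figures can be cross-checked against each other. The sketch below assumes that val_loss is a mean per-token cross-entropy in nats and that bpb = (loss / ln 2) × (tokens / bytes). Neither assumption is stated in the submission, so the implied bytes-per-token ratio is only a consistency check.

```python
import math

# Values copied from the reported round-trip result and artifact sizes.
val_loss_nats = 4.7062   # assumed: mean cross-entropy per token, in nats
val_bpb = 2.7873         # reported bits per byte
raw_mb = 8.47
compressed_mb = 1.92

# Assumed relation: bpb = (loss / ln 2) * (tokens / bytes).
bits_per_token = val_loss_nats / math.log(2)
bytes_per_token = bits_per_token / val_bpb

# Size ratio of the raw artifact to the compressed one.
size_ratio = raw_mb / compressed_mb

print(f"bits per token:  {bits_per_token:.4f}")
print(f"bytes per token: {bytes_per_token:.3f}")
print(f"raw / compressed: {size_ratio:.2f}x")
```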


This suggests that effective structure can be generated by a compact shared operator, rather than requiring explicit depth or width.
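One way to picture a compact shared operator is a single block applied repeatedly with a gated residual update. The sketch below is an assumption about the general shape, not the submission's model code. `shared_block`, `run_motif`, the tanh nonlinearity, and the update `x + gate * block(x)` are all hypothetical. Only the values 2 and 0.18 come from the configuration above, and in the real model 0.18 is an initial gate value that may change during training.

```python
import math

def shared_block(x, weight):
    """Hypothetical shared operator: a linear map followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weight]

def run_motif(x, weight, recurrence_steps=2, gate=0.18):
    """Apply the same block repeatedly with a gated residual update."""
    for _ in range(recurrence_steps):
        update = shared_block(x, weight)
        x = [xi + gate * ui for xi, ui in zip(x, update)]
    return x

# One weight matrix, reused at every recurrence step (toy values).
weight = [[0.5, -0.25], [0.25, 0.5]]
state = run_motif([1.0, 0.0], weight)
print(state)
```

The parameter count here is independent of `recurrence_steps`: more iterations add compute but no new weights.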

Notes on Evaluation

On the local system, logging and validation timing depend on the evaluation chunk size (k) and the print cadence. This changes the perceived runtime during validation (e.g., heartbeat intervals) but does not affect the correctness of the reported loss or bpb.
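A minimal sketch of why the chunk size need not change the result, assuming the evaluation accumulates a loss sum and a token count across chunks and divides once at the end. The function name `chunked_mean_loss` and the per-token losses are made up for illustration. The actual evaluation loop is not shown in this submission.

```python
def chunked_mean_loss(token_losses, k):
    """Token-weighted mean loss, accumulated over chunks of size k."""
    total_loss = 0.0
    total_tokens = 0
    for start in range(0, len(token_losses), k):
        chunk = token_losses[start:start + k]
        total_loss += sum(chunk)
        total_tokens += len(chunk)
    return total_loss / total_tokens

# Made-up per-token losses; only the aggregation pattern matters here.
losses = [4.9, 4.6, 4.8, 4.7, 4.5, 4.75, 4.65]
results = {k: chunked_mean_loss(losses, k) for k in (1, 2, 3, 7)}
print(results)
```

Under this aggregation the means agree up to floating-point rounding for any k, whereas averaging per-chunk means without weighting by token count would not.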
1,510 changes: 1,510 additions & 0 deletions records/track_non_record_16mb/motif_sb1_rs2_g018/logs/motif_sb1_rs2_g018.txt

Large diffs are not rendered by default.

@@ -0,0 +1,19 @@
step:40/80 val_loss:5.3019 val_bpb:3.1401 train_time:5590ms step_avg:139.74ms
step:45/80 train_loss:5.1820 train_time:6264ms step_avg:139.20ms
step:50/80 train_loss:5.0359 train_time:6952ms step_avg:139.03ms
step:55/80 train_loss:5.0466 train_time:7633ms step_avg:138.77ms
step:60/80 train_loss:4.9535 train_time:8309ms step_avg:138.49ms
step:60/80 val_loss:4.9555 val_bpb:2.9349 train_time:8310ms step_avg:138.50ms
step:65/80 train_loss:4.9348 train_time:9111ms step_avg:140.17ms
step:70/80 train_loss:4.7258 train_time:9923ms step_avg:141.76ms
step:75/80 train_loss:4.5686 train_time:10613ms step_avg:141.50ms
step:80/80 train_loss:4.6509 train_time:11352ms step_avg:141.89ms
step:80/80 val_loss:4.7055 val_bpb:2.7868 train_time:11352ms step_avg:141.90ms
peak memory allocated: 382 MiB reserved: 408 MiB
Serialized model: 8413875 bytes
Code size: 60155 bytes
Total submission size: 8474030 bytes
Serialized model int8+zlib: 1862844 bytes (payload:2910240 raw_torch:2916189 payload_ratio:2.89x)
Total submission size int8+zlib: 1922999 bytes
final_int8_zlib_roundtrip val_loss:4.7062 val_bpb:2.7873 eval_time:364934ms
final_int8_zlib_roundtrip_exact val_loss:4.70623326 val_bpb:2.78729643
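Log lines in the format above can be parsed to recover the validation curve and the cost of the int8+zlib round trip. This is a sketch: the regular expression and the name `parse_val_points` are hypothetical, and the sample text is copied from the excerpt above rather than read from the full log file.

```python
import re

# Matches validation lines only; train lines carry train_loss instead.
VAL_LINE = re.compile(r"step:(\d+)/(\d+) val_loss:([\d.]+) val_bpb:([\d.]+)")

log_text = """\
step:40/80 val_loss:5.3019 val_bpb:3.1401 train_time:5590ms step_avg:139.74ms
step:45/80 train_loss:5.1820 train_time:6264ms step_avg:139.20ms
step:60/80 val_loss:4.9555 val_bpb:2.9349 train_time:8310ms step_avg:138.50ms
step:80/80 val_loss:4.7055 val_bpb:2.7868 train_time:11352ms step_avg:141.90ms
"""

def parse_val_points(text):
    """Return (step, val_loss, val_bpb) tuples for each validation line."""
    points = []
    for m in VAL_LINE.finditer(text):
        points.append((int(m.group(1)), float(m.group(3)), float(m.group(4))))
    return points

val_points = parse_val_points(log_text)

# Difference between the round-trip bpb and the last pre-quantization bpb.
roundtrip_bpb = 2.7873
delta = roundtrip_bpb - val_points[-1][2]

print(val_points)
print(f"round-trip bpb change: {delta:+.4f}")
```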