Non-record: Depth Recurrence 5x3 — Weight-Shared Looping Transformer (6xH200, val_bpb=1.2716) #319
Open
Arth-Singh wants to merge 1 commit into openai:main from