Research that didn't accelerate training but shows promise, and code that will be used if the need arises (kept here to avoid bloating the codebase).
Research
TurboMuon - #65
Unet connection of shallow and deep layers - PR
This code achieves a slightly lower (better) loss at the cost of a longer training time. For now we decided it's not worth the extra training time and added code complexity. It will be reconsidered as further improvements land.
Baseline: 16m 44s 139ms
With unet: 18m 38s 397ms
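To put the two timings above in relative terms, here is a minimal sketch that converts them to seconds and computes the overhead. The helper name `parse_duration` is hypothetical and not part of the repository; the only inputs are the two durations quoted above.

```python
import re

def parse_duration(text: str) -> float:
    """Convert a string like '16m 44s 139ms' to seconds (hypothetical helper)."""
    units = {"m": 60.0, "s": 1.0, "ms": 0.001}
    total = 0.0
    # 'ms' is listed before 'm' and 's' so milliseconds are not misread as minutes.
    for value, unit in re.findall(r"(\d+)\s*(ms|m|s)\b", text):
        total += int(value) * units[unit]
    return total

baseline = parse_duration("16m 44s 139ms")
with_unet = parse_duration("18m 38s 397ms")

extra = with_unet - baseline             # additional wall-clock seconds
overhead_pct = 100.0 * extra / baseline  # slowdown relative to the baseline

print(f"extra time: {extra:.1f} s ({overhead_pct:.1f}% slower)")
```

For these two runs that works out to about 114 extra seconds, roughly 11% slower than the baseline.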
Features
Adding Docker - Pull Request