Mapping new ladder to old ladder #146
Open
AkshitaB wants to merge 29 commits into main from akshitab/ladder_xC
Commits
95c0c55 make duration multiplier configurable (AkshitaB)
28d5e21 update changelog (AkshitaB)
49db66d add to __all__ (AkshitaB)
ee2be4c fix command (AkshitaB)
fcf102a change data parallel type (AkshitaB)
70dc6da hsdp (AkshitaB)
56fc563 add duration to name (AkshitaB)
285a5b9 fix bug in overriding (AkshitaB)
232f217 use actual num params (AkshitaB)
55c9abf Merge branch 'main' into akshitab/ladder_xC (AkshitaB)
5ca1e7f fix (AkshitaB)
e15448c remove extra files (AkshitaB)
65fab16 add zloss (AkshitaB)
5ae6342 fix mock batch (AkshitaB)
3fa28a8 loss settings: fused=True, compile=False (AkshitaB)
de38c25 Merge branch 'main' into akshitab/ladder_xC (AkshitaB)
829f6fc not fused (AkshitaB)
faf0de5 reduce microbatch size (AkshitaB)
896fa54 reduce mbz further (AkshitaB)
a10c5e2 reset mbz (AkshitaB)
4785aaf fix model params (AkshitaB)
8d9f535 Port over instance filtering from OLMo codebase (epwalsh)
2a34982 changelog (epwalsh)
6650a52 record percentage masked (epwalsh)
77b192b include count from rank 0 for comparison (epwalsh)
269a95f add to configs (epwalsh)
bfa53da Merge branch 'epwalsh/instance-filter' into akshitab/ladder_xC (AkshitaB)
b55f599 add instance filtering (AkshitaB)
a617ae2 use loss computation from old trainer, for debugging (AkshitaB)
Is there a reason to make this a required parameter and not just part of the config, with a default like 2xC?
Well, we usually run them as a cell in the grid of {model_sizes} x {chinchilla multipliers}, so it's convenient.
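For readers unfamiliar with the grid mentioned in the reply, here is a rough sketch of what enumerating {model_sizes} x {chinchilla multipliers} might look like. It is not code from this PR: the size labels, the multiplier values, the `run_name` and `duration_in_tokens` helpers, and the 20-tokens-per-parameter rule of thumb are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical values for illustration; the PR's actual sizes and multipliers may differ.
MODEL_SIZES = {"190M": 190_000_000, "370M": 370_000_000, "1B": 1_000_000_000}
CHINCHILLA_MULTIPLIERS = [0.5, 1.0, 2.0]

# Assumed rule of thumb: 1xC is roughly 20 training tokens per parameter.
TOKENS_PER_PARAM = 20


def run_name(size: str, multiplier: float) -> str:
    """Build a run name that includes the duration, e.g. '190M-2xC'."""
    return f"{size}-{multiplier:g}xC"


def duration_in_tokens(num_params: int, multiplier: float) -> int:
    """Training duration in tokens for a given Chinchilla multiplier."""
    return int(TOKENS_PER_PARAM * num_params * multiplier)


def grid():
    """Yield (run name, token budget) for every cell of sizes x multipliers."""
    for (size, num_params), mult in product(MODEL_SIZES.items(), CHINCHILLA_MULTIPLIERS):
        yield run_name(size, mult), duration_in_tokens(num_params, mult)


if __name__ == "__main__":
    for name, tokens in grid():
        print(f"{name}: {tokens:,} tokens")
```

Under this sketch each cell is launched with its own (size, multiplier) pair, which is one way to read why passing the multiplier explicitly per run is convenient.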