@Tomerporian

Adds a cosine rewarmed scheduler. It rewarms the learning rate to where a cosine schedule would have been had it run for the total number of steps, i.e. the steps of the original and rewarmed runs combined.

The scheduler uses two arguments:

--cosine-rewarmed-target-steps - the total number of steps (original and rewarmed runs combined).
--cosine-rewarmed-original-warmup - the number of warmup steps used in the runs before rewarming. Default: 1000.

Set base_lr to the base learning rate you would use for a single run with the total number of steps. The new base_lr is computed inside the scheduler.
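To make the idea concrete, here is a minimal sketch of how such a schedule could be computed. It is not the code in this PR: the function names, the `rewarm_steps` and `resume_step` parameters, the linear ramp, and the warmup formula are all assumptions made for illustration. Only the total step count and the original warmup length (default 1000) correspond to the two arguments described above.

```python
import math


def cosine_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup followed by cosine decay to zero (assumed reference curve)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def cosine_rewarmed_lr(step, base_lr, rewarm_steps, resume_step,
                       target_steps, original_warmup=1000):
    """Hypothetical rewarmed schedule.

    step            global step, counted from the start of the original run
    resume_step     global step at which the rewarmed run starts (assumption)
    rewarm_steps    length of the new linear warmup (assumption)
    target_steps    total steps of original and rewarmed runs combined
    original_warmup warmup length of the runs before rewarming
    """
    since_resume = step - resume_step
    if since_resume < rewarm_steps:
        # Peak of the rewarmup: the value the full-length cosine curve
        # takes at the step where the rewarmup ends.
        peak = cosine_lr(resume_step + rewarm_steps, base_lr,
                         original_warmup, target_steps)
        return peak * (since_resume + 1) / rewarm_steps
    # After rewarming, follow the full-length cosine curve.
    return cosine_lr(step, base_lr, original_warmup, target_steps)


if __name__ == "__main__":
    # Resume at step 5000 of a 10000-step target, rewarming over 500 steps.
    for s in (5000, 5250, 5500, 7500, 10000):
        print(s, cosine_rewarmed_lr(s, 1e-3, 500, 5000, 10000))
```

In this sketch the peak of the rewarmup plays the role of the recomputed base_lr: the caller passes the base_lr of the full-length run, and the value to ramp toward is derived from it.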
