Pull requests: NVIDIA/Megatron-LM
fix: remove unnecessary trailing comma in statement
#1265
opened Oct 29, 2024 by
singleheart
Enabling LR scaling for a specific layer (ex. down-projection...) during pretraining
#1262
opened Oct 28, 2024 by
dhia680
[ENHANCEMENT] Add support for Apex RMSNorm for use in qk-norm
#1261
opened Oct 28, 2024 by
wdevazelhes
support qwen2 and siglip weight conversion script to enable training …
#1221
opened Oct 16, 2024 by
tao-githup
[Functions] Support Packed_seq_params in Megatron-LM
#1215
opened Oct 12, 2024 by
Baibaifan
Fix UnboundLocalError in initialize.py due to uninitialized 'seed' variable
#1211
opened Oct 11, 2024 by
JavaZeroo
Update fully_parallel.py
stale
No activity in 60 days on issue or PR
#1067
opened Sep 4, 2024 by
weipingtao
Fix duplicate init for self.module in DistributedDataParallel
stale
No activity in 60 days on issue or PR
#1065
opened Sep 4, 2024 by
Aurelius84
use torch.exp_ not torch.exp(..., out=)
stale
No activity in 60 days on issue or PR
#1054
opened Aug 30, 2024 by
crcrpar