Pull requests: NVIDIA/Megatron-LM

Pull requests list

Huvu/update t5 attentionmasktype
#1273 opened Nov 4, 2024 by huvunvidia
Update t5_model.py
#1271 opened Nov 2, 2024 by huvunvidia
[ENHANCEMENT] Add z-loss
#1270 opened Nov 1, 2024 by wdevazelhes
Enable huggingface tokenizer
#1268 opened Oct 30, 2024 by msiddaiah
fix: remove unnecessary trailing comma in statement
#1265 opened Oct 29, 2024 by singleheart
Add support to process gzip files
#1260 opened Oct 28, 2024 by puneeshkhanna
[Wrong spelling] Update training.py
#1229 opened Oct 21, 2024 by zyqhnu
Typo fix in readme
#1223 opened Oct 17, 2024 by alexchen4ai
readme spelling correction
#1216 opened Oct 13, 2024 by jonassteinberg1
[Functions] Support Packed_seq_params in Megatron-LM
#1215 opened Oct 12, 2024 by Baibaifan
Embedding
#1209 opened Oct 10, 2024 by rachitgarg91
Dev/optimizer offloading
#1205 opened Oct 10, 2024 by lostkevin
fix bugs for multi_latent_attention
#1203 opened Oct 9, 2024 by xqiangx1991
Use consistent assert message
#1195 opened Oct 3, 2024 by youzagou
Expose cp_comm_type in ModelParallelConfig
#1160 opened Sep 27, 2024 by zochaoq
Enabling UCC backend for PP communication
#1157 opened Sep 24, 2024 by youngeunkwon0405
opt:opt ltor masks
#1155 opened Sep 24, 2024 by Baibaifan
Update fully_parallel.py [stale: no activity in 60 days]
#1067 opened Sep 4, 2024 by weipingtao
Fix duplicate init for self.module in DistributedDataParallel [stale: no activity in 60 days]
#1065 opened Sep 4, 2024 by Aurelius84
use torch.exp_ not torch.exp(..., out=) [stale: no activity in 60 days]
#1054 opened Aug 30, 2024 by crcrpar