Pull requests: NVIDIA/Megatron-LM
chore: nightly sync main into dev (18_06_2026)
Run functional tests
Run MBridge tests
Attach this for testing this PR against MBridge main
#5402
opened Jun 18, 2026 by
svcnvidia-nemo-ci
•
Draft
[Fix] Fix MoE router z-loss compatibility with TE CUDA Graph capture.
community-request
#5401
opened Jun 18, 2026 by
Baibaifan
fix(optimizer): route GatedDeltaNet in_proj to Adam instead of orthogonalizing it (Muon)
community-request
#5400
opened Jun 18, 2026 by
yuchenwang3
Update goldens for weekly tests after pytorch and TE bumps.
complexity: high
#5399
opened Jun 18, 2026 by
balasaajay
Contributor
1 of 6 tasks
Fix fast-cache-load rank synchronization guard
community-request
waiting-on-customer
Waiting on the original author to respond
#5398
opened Jun 18, 2026 by
sandyhouse
1 task
Add RADIO vision encoder wrapper for MIMO example
complexity: medium
#5397
opened Jun 17, 2026 by
yashaswikarnati
Contributor
perf(gated_delta_net): fold q/k L2-norm into the gated_delta_rule kernel
community-request
#5396
opened Jun 17, 2026 by
yuchenwang3
•
Draft
fix(optimizer): skip grad-norm clipping for orthogonalizing (Muon) optimizers
community-request
#5395
opened Jun 17, 2026 by
yuchenwang3
fix: skip permute kernel launch when valid_tokens is zero (closes #4660)
community-request
#5393
opened Jun 17, 2026 by
botbikamordehai2-sketch
[main] moe(perf): Refactor GDN A2A helper flow
complexity: medium
#5392
opened Jun 17, 2026 by
yuzhongw-nvidia
Contributor
1 of 6 tasks
[dev] Add experimental decoupled compact LayerWise DDP layout for Muon
complexity: medium
#5388
opened Jun 17, 2026 by
Wohox
Contributor
3 of 6 tasks
Add experimental Megatron-FSDP fully_shard implementation
complexity: medium
Final Review
PR is in the "final review" stage
MFSDPv2
Run tests
#5387
opened Jun 17, 2026 by
wujingyue
Contributor
Add DSA/DSv4 Indexer Replay for RL training stability
community-request
#5386
opened Jun 17, 2026 by
ParamThakkar123
route collectives through torchcomms
community-request
#5385
opened Jun 16, 2026 by
tushar00jain
•
Draft
Fix fused MLA down projection with tensor parallelism
complexity: low
Final Review
PR is in the "final review" stage
#5383
opened Jun 16, 2026 by
sraman-rgb
Contributor
6 tasks
Add hetero MIMO entrypoint, bootstrap, and mock data (integration)
#5377
opened Jun 16, 2026 by
yashaswikarnati
Contributor
•
Draft
Add MIMO forward step and per-token loss for the stock schedule
#5376
opened Jun 16, 2026 by
yashaswikarnati
Contributor
•
Draft
Add hetero grid args and MoE process groups for MIMO example
#5375
opened Jun 16, 2026 by
yashaswikarnati
Contributor
•
Draft
Add Nemotron6-MoE VLM model provider for MIMO example
#5374
opened Jun 16, 2026 by
yashaswikarnati
Contributor
•
Draft
Support the MIMO cross-grid path in training loop
complexity: low
#5373
opened Jun 16, 2026 by
yashaswikarnati
Contributor
Eagerly initialize MIMO bridge process groups and symmetrize leader broadcasts
#5370
opened Jun 16, 2026 by
yashaswikarnati
Contributor
•
Draft