Unify bf16 gb300 qwen3 235b mapping #2670
dingqingy-nv wants to merge 1 commit into NVIDIA-NeMo:main
Signed-off-by: Dingqing Yang <dingqingy@nvidia.com>
Walkthrough
This change simplifies a configuration file by replacing an explicit detailed definition of ...
What does this PR do?
Align the BF16 V2 config for Qwen3 235B A22B on GB300 with the MXFP8/FP8 config, so that all precisions use the same parallelism strategy. This avoids the NaN gradient issue.
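
The idea of sharing one parallelism strategy across precisions can be sketched as follows. This is an illustration only, not the code changed in this PR: `ParallelismMapping`, `SHARED_MAPPING`, `MAPPINGS`, and `mappings_are_unified` are hypothetical names, and the numeric values are placeholders rather than the real GB300 settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelismMapping:
    """Hypothetical container for a parallelism strategy (not the repo's class)."""

    tensor_parallel: int
    pipeline_parallel: int
    expert_parallel: int


# Placeholder values, chosen only for illustration.
SHARED_MAPPING = ParallelismMapping(
    tensor_parallel=1,
    pipeline_parallel=1,
    expert_parallel=64,
)

# Rather than spelling out a separate BF16 definition, every precision
# points at the same mapping object, so the strategies cannot drift apart.
MAPPINGS = {
    "fp8": SHARED_MAPPING,
    "mxfp8": SHARED_MAPPING,
    "bf16": SHARED_MAPPING,
}


def mappings_are_unified(mappings: dict) -> bool:
    """Return True when all precisions use an identical parallelism mapping."""
    return len(set(mappings.values())) == 1
```

A check such as `mappings_are_unified(MAPPINGS)` could run in a unit test to flag a precision whose mapping diverges from the others.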