
Revert MFSDP mixed precision config arguments. #2848

Merged
ko3n1g merged 3 commits into NVIDIA-NeMo:main from cspades:cye/revert-mfsdp-mp-config
Mar 18, 2026

Conversation

@cspades cspades (Contributor) commented Mar 17, 2026

What does this PR do?

Changelog

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g., Numba, Pynini, Apex)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • Related to # (issue)

Summary by CodeRabbit

  • Refactor
    • Removed explicit mixed-precision configuration parameters from distributed model building and initialization functions. The system now relies on default mixed-precision handling, reducing configuration complexity for model setup and training workflows.
  • Tests
    • Removed test coverage for mixed-precision configuration propagation paths to align with API changes.

Signed-off-by: Cory Ye <cye@nvidia.com>
@coderabbitai coderabbitai bot (Contributor) commented Mar 17, 2026

📝 Walkthrough


This pull request removes mixed-precision configuration plumbing from the distributed model building and initialization pipeline. The mixed_precision_config parameter is eliminated from public APIs including unimodal_build_distributed_models, provide_distributed_model, and get_model. Additionally, Megatron-FSDP-specific dtype fields are removed from MixedPrecisionConfig along with associated initialization and finalization logic. Related test coverage is removed.
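The signature change described above can be sketched with simplified stand-ins. This is a hypothetical before/after illustration only: the real `get_model` in `megatron.bridge` takes many more arguments, and the bodies here are placeholders.

```python
from typing import Any, Callable, Optional

# Hypothetical sketch of the API change; not the real megatron.bridge code.

def get_model_before(
    model_provider: Callable[[], Any],
    mixed_precision_config: Optional[dict] = None,
) -> Any:
    # Before this PR: callers could forward an explicit mixed-precision
    # config, which was plumbed through to the DDP/FSDP wrapping logic.
    return model_provider()

def get_model_after(model_provider: Callable[[], Any]) -> Any:
    # After this PR: the parameter is gone; mixed precision falls back to
    # the default handling inside the model-building pipeline.
    return model_provider()
```

Call sites that previously forwarded `mixed_precision_config` simply drop the argument; no other part of the call changes.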

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Model Building and Distributed Wrapping: src/megatron/bridge/models/common/unimodal.py, src/megatron/bridge/models/model_provider.py, src/megatron/bridge/training/setup.py | Removed the mixed_precision_config parameter from the public functions unimodal_build_distributed_models, provide_distributed_model, and get_model, and updated call sites to stop forwarding this argument to the DDP wrapping logic. |
| Mixed Precision Configuration: src/megatron/bridge/training/mixed_precision.py | Removed three Megatron-FSDP-specific public fields (megatron_fsdp_main_params_dtype, megatron_fsdp_main_grads_dtype, megatron_fsdp_grad_comm_dtype) from the MixedPrecisionConfig class, along with the __post_init__ method and the associated __setattr__ and finalize logic that managed dtype mappings and synchronization. |
| Test Coverage: tests/unit_tests/models/common/test_unimodal.py, tests/unit_tests/training/test_mixed_precision.py | Removed test methods validating mixed-precision configuration propagation through the distributed model building and provider APIs, as well as tests for Megatron-FSDP dtype field interactions and grad_reduce_in_fp32 behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Results For Major Changes | ⚠️ Warning | The PR removes 314 lines of tests and modifies public APIs without documenting test results or verification in the PR description. | Add test results and verification to the PR description confirming that the breaking changes have been validated and no regressions were introduced. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately summarizes the primary change: reverting MFSDP mixed precision config arguments from multiple files and test suites. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |


@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/megatron/bridge/models/model_provider.py (1)

34-34: ⚠️ Potential issue | 🟡 Minor

Remove duplicate Callable import.

Callable is already imported at line 18. This duplicate import should be removed.

Suggested fix
```diff
-from typing import Callable
-
 import torch
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/models/model_provider.py` at line 34, Remove the
duplicate import of Callable by deleting the redundant "from typing import
Callable" line in model_provider.py so only the original Callable import
(already present earlier in the file) remains; locate the duplicate import
statement to remove it, leaving the earlier import at line 18 intact.
🧹 Nitpick comments (1)
src/megatron/bridge/models/common/unimodal.py (1)

15-20: Import ordering issue: standard library import after logger initialization.

The typing import on line 20 should be grouped with other standard library imports at the top of the file, before the logger initialization. Per coding guidelines, imports should be organized in order: future imports, standard library, third-party, first-party, local folder imports.

Suggested reordering
```diff
-import logging
-
-
-logger = logging.getLogger(__name__)
-
-from typing import Any, Callable
+import logging
+from typing import Any, Callable
+
+
+logger = logging.getLogger(__name__)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/models/common/unimodal.py` around lines 15 - 20, Move the
standard-library import "from typing import Any, Callable" above the logger
initialization and group it with other standard library imports at the top of
the file so imports follow the required ordering; specifically, ensure the
import occurs before the "logger = logging.getLogger(__name__)" line and keep
the "logger" creation after all standard-library imports are declared.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 49915189-c5e5-4c3b-a8dc-2cdcc6435653

📥 Commits

Reviewing files that changed from the base of the PR and between f689296 and ab1bb53.

📒 Files selected for processing (6)
  • src/megatron/bridge/models/common/unimodal.py
  • src/megatron/bridge/models/model_provider.py
  • src/megatron/bridge/training/mixed_precision.py
  • src/megatron/bridge/training/setup.py
  • tests/unit_tests/models/common/test_unimodal.py
  • tests/unit_tests/training/test_mixed_precision.py
💤 Files with no reviewable changes (4)
  • src/megatron/bridge/training/mixed_precision.py
  • tests/unit_tests/training/test_mixed_precision.py
  • tests/unit_tests/models/common/test_unimodal.py
  • src/megatron/bridge/training/setup.py

@cspades cspades enabled auto-merge (squash) March 18, 2026 18:56
@ko3n1g ko3n1g disabled auto-merge March 18, 2026 18:57
@ko3n1g ko3n1g merged commit 17fba84 into NVIDIA-NeMo:main Mar 18, 2026
14 checks passed
copy-pr-bot bot pushed a commit that referenced this pull request Mar 19, 2026
Signed-off-by: Cory Ye <cye@nvidia.com>

Labels: none yet · Projects: none yet · 3 participants