Conversation

@odedovadia odedovadia commented Jan 29, 2026

What does this PR do?

Adds a save_optimizer configuration flag to control whether optimizer state is saved with checkpoints, which greatly reduces checkpoint size when optimizer state is not needed for resumption.

  • Adds a save_optimizer field to CheckpointingConfig (default: True for backward compatibility).
  • Introduces a CheckpointManager.get_resume_paths() helper to centralize checkpoint path resolution logic (see the sketch after this list).
  • Updates all algorithms (SFT, DPO, GRPO, RM, Distillation) to use the new helper and respect the flag.
  • When resuming from a checkpoint without optimizer state, a warning is logged and the optimizer is freshly initialized.
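
For context while reviewing, here is a minimal, standalone sketch of the resume-path behavior described above. The signature, the policy/weights and policy/optimizer layout, and the warning text are assumptions for illustration; the PR's actual CheckpointManager.get_resume_paths() may differ.

# Hypothetical sketch of the described behavior; paths and names are assumptions, not the PR's code.
import os
import warnings
from typing import Optional


def get_resume_paths(checkpoint_dir: Optional[str]) -> tuple[Optional[str], Optional[str]]:
    """Return (weights_path, optimizer_path) for a checkpoint, or (None, None) when not resuming."""
    if checkpoint_dir is None:
        return None, None
    weights_path = os.path.join(checkpoint_dir, "policy", "weights")      # assumed layout
    optimizer_path = os.path.join(checkpoint_dir, "policy", "optimizer")  # assumed layout
    if not os.path.exists(optimizer_path):
        # Mirrors the PR description: warn and let the caller freshly initialize the optimizer.
        warnings.warn(
            f"Optimizer state not found at {optimizer_path}. "
            "Optimizer will be freshly initialized."
        )
        optimizer_path = None
    return weights_path, optimizer_path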

Issues

Closes #1828

Usage

# In your config file
checkpointing:
  enabled: true
  save_optimizer: false  # Skip saving optimizer state (default: true)

Before your PR is "Ready for review"

Pre checks:

  • [ ] Make sure you read and followed the Contributor guidelines.
  • [ ] Did you write any new necessary tests?
  • [ ] Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests.
  • [ ] Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build, and test the docs.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added configuration option to control optimizer state saving in checkpoints.
  • Refactor

    • Centralized checkpoint resume path resolution for improved consistency across training algorithms (distillation, DPO, GRPO, RM, and SFT).

@coderabbitai coderabbitai bot (Contributor) commented Jan 29, 2026

📝 Walkthrough

Introduces a new save_optimizer configuration flag to selectively save optimizer states during checkpointing. Refactors five algorithm modules to use a centralized get_resume_paths() method for path resolution and conditionally include optimizer paths based on the flag's value.

Changes

  • Core Checkpoint Infrastructure — nemo_rl/utils/checkpoint.py: Added an optional save_optimizer flag to CheckpointingConfig (defaults to True), stored it on CheckpointManager, and introduced a get_resume_paths() method that centralizes derivation of weights and optimizer paths from checkpoints, with fallback handling when optimizer state is missing.
  • Algorithm Refactoring — nemo_rl/algorithms/distillation.py, dpo.py, grpo.py, rm.py, sft.py: Replaced explicit checkpoint path construction with calls to checkpointer.get_resume_paths() during setup, and updated checkpoint-saving logic to pass optimizer_path only when checkpointer.save_optimizer is True (otherwise None), making optimizer state saving conditional across all five algorithm modules; a sketch of this pattern follows the list.
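
As a concrete illustration of that save-side pattern, here is a short sketch assuming a torch-style state_dict API; the function and argument names are made up for illustration and are not the PR's actual code.

# Hypothetical save-side sketch; names are illustrative only.
import os

import torch


def save_checkpoint_step(checkpointer, step_dir, model, optimizer):
    os.makedirs(step_dir, exist_ok=True)
    weights_path = os.path.join(step_dir, "weights.pt")
    torch.save(model.state_dict(), weights_path)

    # Only write optimizer state when the config enables it; otherwise pass None downstream.
    optimizer_path = None
    if getattr(checkpointer, "save_optimizer", True):
        optimizer_path = os.path.join(step_dir, "optimizer.pt")
        torch.save(optimizer.state_dict(), optimizer_path)
    return weights_path, optimizer_path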

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 passed | ❌ 2 failed

❌ Failed checks (2 warnings)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 78.57%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Test Results For Major Changes — ⚠️ Warning: The PR introduces a major feature (the save_optimizer flag), but the PR description says tests are pending and provides no test results. Resolution: add test results to the PR description demonstrating that the new save_optimizer flag works correctly and does not introduce regressions across the affected algorithms.

✅ Passed checks (4 passed)

  • Description Check — ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Linked Issues Check — ✅ Passed: The PR fully implements the requested feature from issue #1828: it adds a configurable save_optimizer flag to CheckpointingConfig, updates the algorithms to respect it, and keeps backward-compatible behavior with a warning on missing optimizer state.
  • Out of Scope Changes Check — ✅ Passed: All changes across checkpoint.py and the algorithm files are directly related to implementing the save_optimizer flag and centralizing checkpoint path resolution, as described in issue #1828.
  • Title Check — ✅ Passed: The title accurately and clearly describes the main change: adding a save_optimizer flag to control whether optimizer state is saved during checkpointing.


@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nemo_rl/utils/checkpoint.py (1)

52-63: Add save_optimizer to exemplar YAML files and move default from code to YAML.

The save_optimizer key is not reflected in any exemplar YAML files under examples/configs/, violating the guideline that config defaults live in YAML. Remove the inline default comment # Default: True from line 62 and add save_optimizer: true to the checkpointing sections in all exemplar YAMLs (e.g., examples/configs/dpo.yaml, examples/configs/grpo_math_1B.yaml). Additionally, update the docstring to explicitly document the recommended default value and that the field accepts boolean values.

🤖 Fix all issues with AI agents
In `@nemo_rl/utils/checkpoint.py`:
- Line 110: Replace the in-code default for save_optimizer with a strict read from the config so the YAML remains the source of truth. In the constructor/initializer where self.save_optimizer is currently set via self.save_optimizer = config.get("save_optimizer", True), read the key without a non-None default (e.g., config["save_optimizer"], or config.get("save_optimizer") followed by a presence check), and raise a clear error or validation message if the key is missing so callers know to set it in YAML. A minimal sketch of this pattern follows.
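
A simplified, hypothetical sketch of the strict-read pattern being suggested; the real CheckpointManager constructor has more responsibilities than shown, and the error message wording is an assumption.

# Hypothetical illustration only; not the actual nemo_rl/utils/checkpoint.py code.
class CheckpointManager:
    def __init__(self, config: dict):
        # Require the key to come from YAML instead of applying an in-code default.
        if "save_optimizer" not in config:
            raise ValueError(
                "checkpointing.save_optimizer must be set in the YAML config "
                "(e.g., save_optimizer: true)."
            )
        self.save_optimizer: bool = bool(config["save_optimizer"])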
🧹 Nitpick comments (1)
nemo_rl/utils/checkpoint.py (1)

125-132: Add stacklevel to the warning for better caller attribution.

This warning points at the helper instead of the caller.

🔧 Suggested tweak
             warnings.warn(
                 f"Optimizer state not found at {optimizer_path}. "
-                "Optimizer will be freshly initialized."
-            )
+                "Optimizer will be freshly initialized.",
+                stacklevel=2,
+            )
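
For context on the nitpick (a standalone illustration, not part of the PR's diff): with the default stacklevel=1 the reported warning location is the warnings.warn call inside the helper, whereas stacklevel=2 attributes it to the helper's caller. The function names below are made up.

# Minimal demonstration of warnings.warn stacklevel; names are illustrative only.
import warnings


def load_optimizer_state(path):
    # stacklevel=2 makes Python report the caller's line, not this one.
    warnings.warn(f"Optimizer state not found at {path}.", stacklevel=2)
    return None


def setup_training():
    return load_optimizer_state("/tmp/does-not-exist")  # warning attributed here


setup_training()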

Signed-off-by: Oded Ovadia <odedov@dreamgroup.com>
@guyueh1 guyueh1 changed the title from "Added save_optimizer flag" to "feat: Added save_optimizer flag to control saving optimizer or not in checkpointing" on Jan 29, 2026
@guyueh1 guyueh1 self-requested a review January 29, 2026 16:40
@guyueh1 guyueh1 added the CI:L0 Run doctests and unit tests label Jan 29, 2026
@chtruong814 chtruong814 added the needs-follow-up Issue needs follow-up label Jan 31, 2026
Signed-off-by: Oded Ovadia <odedov@dreamgroup.com>
@guyueh1 guyueh1 (Contributor) commented Feb 4, 2026

@odedov-dream is there any unit or functional test that we can add to guard the functionality of the save_optimizer=false path?

@chtruong814 chtruong814 removed the needs-follow-up Issue needs follow-up label Feb 4, 2026
Signed-off-by: Oded Ovadia <odedov@dreamgroup.com>
@odedovadia odedovadia requested a review from a team as a code owner February 4, 2026 08:37
@odedovadia odedovadia (Author) commented, quoting the question above:
@odedov-dream is there any unit or functional test that we can add to guard the functionality of the save_optimizer=false path?

Sure, added 2 unit tests.

@chtruong814 chtruong814 added the needs-follow-up Issue needs follow-up label Feb 6, 2026

Labels

CI:L0 (Run doctests and unit tests), community-request, needs-follow-up (Issue needs follow-up)

Development

Successfully merging this pull request may close these issues.

Save checkpoint without optimizer states
