Cleanup before v0.2 release by skyw · Pull Request #135 · NVIDIA-NeMo/Emerging-Optimizers

skyw · 2026-03-18T17:08:41Z

Improve some function interfaces.

Add missing files and registry.

Minor update to documents.

Signed-off-by: Hao Wu <skyw@nvidia.com>

copy-pr-bot · 2026-03-18T17:08:46Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps · 2026-03-18T17:11:00Z

Greptile Summary

This PR performs pre-release housekeeping for v0.2: the primary functional change is renaming update_eigenbasis_and_momentum → update_eigenbasis_and_exp_avgs (and its momentum parameter → exp_avg) in soap.py to align with the existing exp_avg/exp_avg_sq naming convention. It also adds the missing riemannian_optimizers/__init__.py, registers ObliqueSGD, ObliqueAdam, and MuonHyperball in the optimizer registry, bumps the docs version to 0.2.0, and cleans up linting dependencies.

soap.py: Clean rename of update_eigenbasis_and_momentum → update_eigenbasis_and_exp_avgs; all call sites, docstrings, tests, and API docs updated consistently.
riemannian_optimizers/__init__.py: New file adding the missing package init, following the same pattern as orthogonalized_optimizers/__init__.py.
normalized_optimizer.py / muon_hyperball.py: Registry decorators and __all__ added to make ObliqueSGD, ObliqueAdam, and MuonHyperball first-class public API entries.
Bug surfaced by public export: ObliqueSGD.step has a pre-existing bug where the momentum buffer is computed into a local variable (buf = torch.add(...)) but never written back to state["momentum_buffer"], making momentum accumulation a no-op. This is now exposed publicly via __all__ for the first time.
pyproject.toml / uv.lock: flake8 and pylint removed from test dependencies.

Confidence Score: 3/5

Safe to merge after fixing the momentum buffer bug in ObliqueSGD, which is now a public API.
The SOAP rename is clean and fully consistent across all call sites, tests, and docs. The main concern is ObliqueSGD's momentum buffer not being persisted — a pre-existing correctness bug that is being surfaced as a public interface for the first time in this PR. Merging as-is would publish a broken optimizer to the v0.2 registry.
emerging_optimizers/riemannian_optimizers/normalized_optimizer.py — ObliqueSGD.step momentum buffer update

Important Files Changed

Filename	Overview
emerging_optimizers/soap/soap.py	Renames `update_eigenbasis_and_momentum` → `update_eigenbasis_and_exp_avgs` and its `momentum` parameter → `exp_avg` throughout. All call sites, the `__all__` list, docstrings, and internal comments are updated consistently. No logic changes.
emerging_optimizers/riemannian_optimizers/normalized_optimizer.py	Adds `__all__`, imports `registry`, and registers `ObliqueSGD`/`ObliqueAdam`. Pre-existing bug: `ObliqueSGD.step` reassigns the local `buf` variable but never writes it back to `state["momentum_buffer"]`, silently disabling momentum accumulation — now being surfaced as a public API for the first time.
emerging_optimizers/riemannian_optimizers/init.py	New `__init__.py` that re-exports everything from `normalized_optimizer` via `*`. Follows the same pattern as `orthogonalized_optimizers/__init__.py`.
emerging_optimizers/orthogonalized_optimizers/muon_hyperball.py	Adds `@registry.register_optimizer("muon_hyperball")` decorator and `from emerging_optimizers import registry` import. No logic changes.
tests/test_soap.py	Updates test method name, docstring, and all variable names to match the renamed function `update_eigenbasis_and_exp_avgs` and parameter `exp_avg`. No logic changes.
docs/apidocs/soap.md	Updates `autofunction` directive to reference `update_eigenbasis_and_exp_avgs` (the renamed function), consistent with `soap.py` changes.
docs/conf.py	Bumps `release` version string from `0.1.0` to `0.2.0`.
pyproject.toml	Removes `flake8` and `pylint` from the `test` dependency group (aligning with the CONTRIBUTING.md mention that flake8 was removed from forced checks).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["update_eigenbasis_and_exp_avgs()\n(renamed from update_eigenbasis_and_momentum)"] --> B

    B["Step 1: Project exp_avg → original basis\nprecondition(exp_avg, eigenbasis_list, dims=[[0],[1]])"]
    B --> C{use_eigh?}

    C -- "True" --> D["get_eigenbasis_eigh(kronecker_factor_list)\n→ updated_eigenbasis_list\n(exp_avg_sq unchanged)"]
    C -- "False" --> E["get_eigenbasis_qr(kronecker_factor_list, eigenbasis_list,\nexp_avg_sq, power_iter_steps)\n→ updated_eigenbasis_list, exp_avg_sq"]

    D --> F
    E --> F

    F["Step 3: Project exp_avg → new eigenbasis\nprecondition(exp_avg, updated_eigenbasis_list, dims=[[0],[0]])"]
    F --> G["Return (updated_eigenbasis_list, exp_avg, exp_avg_sq)"]

    G --> H["Caller (SOAP.step) updates state:\nstate['Q_L'], state['Q_R'] = updated_eigenbasis_list\nstate['exp_avg'] = exp_avg\nstate['exp_avg_sq'] = exp_avg_sq"]

Comments Outside Diff (1)

emerging_optimizers/riemannian_optimizers/normalized_optimizer.py, line 115-121 (link)

Momentum buffer never written back to state

buf = torch.add(grad, buf, alpha=mom) creates a new tensor and reassigns the local variable buf, but state["momentum_buffer"] still points to the original all-zeros tensor. After step 1, every subsequent step computes grad + mom * zeros = grad, making the optimizer behave identically to plain gradient descent rather than momentum SGD. Compare with ObliqueAdam, which correctly updates its buffers in-place via .mul_() and .add_().

The fix is to either update the state or use an in-place operation:

_{Last reviewed commit: "update uv lock"}

emerging_optimizers/soap/soap.py

Signed-off-by: Hao Wu <skyw@nvidia.com>

skyw · 2026-03-18T17:31:00Z

/ok to test 7360c6b

github-actions · 2026-03-18T17:39:56Z

Test Results

48 files ±0 98 suites ±0 1m 15s ⏱️ +4s
1 012 tests ±0 1 012 ✅ ±0 0 💤 ±0 0 ❌ ±0
2 255 runs ±0 2 255 ✅ ±0 0 💤 ±0 0 ❌ ±0

Results for commit d499815. ± Comparison against base commit eead338.

This pull request removes 18 and adds 18 tests. Note that renamed tests count towards both.

__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum0 (M=8, N=4, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum1 (M=8, N=4, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum10 (M=16, N=33, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum11 (M=16, N=33, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum12 (M=33, N=4, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum13 (M=33, N=4, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum14 (M=33, N=8, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum15 (M=33, N=8, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum16 (M=33, N=33, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_momentum17 (M=33, N=33, use_eigh=False)
…

__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs0 (M=8, N=4, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs1 (M=8, N=4, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs10 (M=16, N=33, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs11 (M=16, N=33, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs12 (M=33, N=4, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs13 (M=33, N=4, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs14 (M=33, N=8, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs15 (M=33, N=8, use_eigh=False)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs16 (M=33, N=33, use_eigh=True)
__main__.SoapFunctionsTest ‑ test_update_eigenbasis_and_exp_avgs17 (M=33, N=33, use_eigh=False)
…

♻️ This comment has been updated with latest results.

Signed-off-by: Hao Wu <skyw@nvidia.com>

skyw · 2026-03-18T17:48:56Z

/ok to test d499815

codecov · 2026-03-18T18:27:22Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

skyw added 3 commits March 18, 2026 09:56

qol improvement

567c687

Signed-off-by: Hao Wu <skyw@nvidia.com>

update documents

04f257b

Signed-off-by: Hao Wu <skyw@nvidia.com>

improve soap interface

68d27dd

Signed-off-by: Hao Wu <skyw@nvidia.com>

skyw requested a review from mkhona-nvidia March 18, 2026 17:08

skyw requested a review from a team as a code owner March 18, 2026 17:08

greptile-apps bot reviewed Mar 18, 2026

View reviewed changes

emerging_optimizers/soap/soap.py Outdated Show resolved Hide resolved

emerging_optimizers/soap/soap.py Outdated Show resolved Hide resolved

skyw added 3 commits March 18, 2026 10:11

minor format change in CONTRIBUTING.md

d8913ff

Signed-off-by: Hao Wu <skyw@nvidia.com>

fix typo

176671a

Signed-off-by: Hao Wu <skyw@nvidia.com>

improve riemannian_optimizers

29c783a

Signed-off-by: Hao Wu <skyw@nvidia.com>

mkhona-nvidia previously approved these changes Mar 18, 2026

View reviewed changes

update examples in docstring

7360c6b

Signed-off-by: Hao Wu <skyw@nvidia.com>

skyw dismissed mkhona-nvidia’s stale review via 7360c6b March 18, 2026 17:30

copy-pr-bot bot temporarily deployed to test March 18, 2026 17:31 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci March 18, 2026 17:31 Failure

skyw mentioned this pull request Mar 18, 2026

Add more emerging optimizers NVIDIA/Megatron-LM#3907

Merged

5 tasks

copy-pr-bot bot had a problem deploying to nemo-ci March 18, 2026 17:40 Failure

update uv lock

d499815

Signed-off-by: Hao Wu <skyw@nvidia.com>

copy-pr-bot bot temporarily deployed to test March 18, 2026 17:49 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci March 18, 2026 17:49 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci March 18, 2026 17:57 Inactive

skyw enabled auto-merge (squash) March 18, 2026 18:27

skyw disabled auto-merge March 18, 2026 18:30

skyw merged commit 1effa02 into main Mar 18, 2026
16 checks passed

skyw deleted the skyw/v0.2_cleanup branch March 18, 2026 18:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cleanup before v0.2 release#135

Cleanup before v0.2 release#135
skyw merged 8 commits intomainfrom
skyw/v0.2_cleanup

skyw commented Mar 18, 2026

Uh oh!

copy-pr-bot bot commented Mar 18, 2026

Uh oh!

greptile-apps bot commented Mar 18, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

Uh oh!

Uh oh!

skyw commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026 •

edited

Loading

Uh oh!

skyw commented Mar 18, 2026

Uh oh!

codecov bot commented Mar 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

skyw commented Mar 18, 2026

Uh oh!

copy-pr-bot bot commented Mar 18, 2026

Uh oh!

greptile-apps bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Comments Outside Diff (1)

Uh oh!

Uh oh!

Uh oh!

skyw commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Results

Uh oh!

skyw commented Mar 18, 2026

Uh oh!

codecov bot commented Mar 18, 2026

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps bot commented Mar 18, 2026 •

edited

Loading

github-actions bot commented Mar 18, 2026 •

edited

Loading