
Remove all nemo2 imports from old repo #628

Open
oyilmaz-nvidia wants to merge 18 commits into main from fix/ruff-linting

Conversation

@oyilmaz-nvidia
Contributor

No description provided.

oyilmaz-nvidia and others added 4 commits March 3, 2026 16:41
… dynamic inference

- Add nemo_deploy/llm/inference/nemo_utils.py which vendors standalone NeMo
  utilities (MCoreTokenizerWrappper, ckpt path helpers, constants) with no
  dependency on the nemo package, and re-exports the complex NeMo types
  (GPTConfig, T5Config, io, set_modelopt_spec_if_exists_in_ckpt) under a
  single HAVE_NEMO guard.
- Remove direct from nemo.* imports from inference_base.py and tron_utils.py;
  both files now import from the local nemo_utils module instead.
- Fix AttributeError in create_mcore_engine: GPTInferenceWrapper was called
  with (model, inference_context) but the deployed Megatron-LM API expects
  (model, inference_wrapper_config, inference_context). Add InferenceWrapperConfig
  built from model.config attributes; MCoreEngine then internally creates a
  DynamicInferenceContext and switches to DynamicInferenceEngine.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
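The single HAVE_NEMO guard described in the commit above might look like this minimal sketch; the exact import paths inside the nemo package and the set of re-exported names are assumptions, not taken from the diff:

```python
# nemo_deploy/llm/inference/nemo_utils.py (sketch, not the actual diff)
# Standalone helpers live in this module with no nemo dependency; the
# heavyweight NeMo types are re-exported under a single guard so callers
# check one flag instead of guarding each import site.
try:
    # Hypothetical paths: the real locations inside nemo may differ.
    from nemo.collections.llm import GPTConfig, T5Config
    from nemo.lightning import io

    HAVE_NEMO = True
except ImportError:
    GPTConfig = T5Config = io = None
    HAVE_NEMO = False
```

Callers then branch on `if HAVE_NEMO:` instead of wrapping every `nemo.*` import in its own try/except.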
- Fix import ordering in test_inference_base.py (ruff I001)
- Remove direct nemo imports from inference_base.py, nemo_utils.py, tron_utils.py
- Add nemo_io.py with standalone load_context implementation
- Remove HAVE_NEMO guard checks now that nemo is no longer a static dependency
- Update tests to remove HAVE_NEMO patches and use types.SimpleNamespace
- Remove unused StaticInferenceContext import
- Use inner model config for hidden_size/params_dtype instead of outer model
- Add buffer_size_gb param to create_mcore_engine and MegatronLLMDeployable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
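The last three bullets (SimpleNamespace test doubles, reading the inner model config, and the new `buffer_size_gb` parameter) can be illustrated with a short sketch; attribute values and the stub function are illustrative, not the real API:

```python
import types

# Test stand-in for a wrapped Megatron model: the outer object carries an
# inner .config holding the values the engine setup needs.
inner_config = types.SimpleNamespace(hidden_size=4096, params_dtype="bfloat16")
model = types.SimpleNamespace(config=inner_config)

def create_mcore_engine_stub(model, buffer_size_gb=20.0):
    # Per the fix above: read hidden_size/params_dtype from the inner
    # model.config, not from attributes on the outer model object.
    # buffer_size_gb is now an explicit parameter (default is illustrative).
    return (model.config.hidden_size, model.config.params_dtype, buffer_size_gb)
```

This is the shape the updated tests take: no `HAVE_NEMO` patching, just a `types.SimpleNamespace` standing in for the model.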
@copy-pr-bot

copy-pr-bot bot commented Mar 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

oyilmaz-nvidia and others added 9 commits March 5, 2026 07:36
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Move the InferenceWrapperConfig import from module level into the body
of create_mcore_engine, so pytest can collect test_inference_base.py
in the nemo:26.02 container where that megatron-core module path does
not exist. GPU-only tests that call create_mcore_engine are skipped in
CPU CI, so the import never executes there.
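The deferred-import pattern described above can be sketched as follows; the megatron-core module path is inferred from the text and should be treated as an assumption, and the function body is elided:

```python
def create_mcore_engine(model_path, buffer_size_gb=20.0):
    # Import inside the function body, not at module level: in containers
    # where this megatron-core module path does not exist (e.g. nemo:26.02),
    # pytest can still collect test_inference_base.py, because the import
    # only runs when a GPU-only test actually calls this function.
    from megatron.core.inference.model_inference_wrappers.inference_wrapper_config import (
        InferenceWrapperConfig,
    )
    ...  # build the InferenceWrapperConfig and engine here (elided)
```

Defining the function never triggers the import, so module import (and test collection) succeeds even when megatron-core's inference wrappers are absent.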
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@oyilmaz-nvidia oyilmaz-nvidia requested a review from a team as a code owner March 9, 2026 05:46
oyilmaz-nvidia and others added 3 commits March 9, 2026 11:31
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@oyilmaz-nvidia
Contributor Author

/ok to test d8ac4f5

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@oyilmaz-nvidia
Contributor Author

/ok to test c398fee

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@oyilmaz-nvidia
Contributor Author

/ok to test f63d2c3

