Add helm inference #321

shantanutrip · 2025-11-21T02:31:47Z

What's changing and why?

CLI and SDK changes to add MIG profile support (GPU partitioning) for JumpStart inference endpoints.

Before/After UX

Before:

chadchc@80a997306578 hyperpod-jumpstart-inference-template % hyp create hyp-jumpstart-endpoint --help
Usage: hyp create hyp-jumpstart-endpoint [OPTIONS]

Options:
  --version TEXT                  Schema version to use
  --debug BOOLEAN                 Enable debug mode
  --namespace TEXT                Kubernetes namespace
  --accept-eula BOOLEAN           Whether model terms of use have been accepted  [default: False]
  --metadata-name TEXT            Name of the jumpstart endpoint object
  --model-id TEXT                 Unique identifier of the model within the hub  [required]
  --model-version TEXT            Semantic version of the model to deploy (e.g. 1.0.0)
  --instance-type TEXT            EC2 instance type for the inference server  [required]
  --endpoint-name TEXT            Name of SageMaker endpoint; empty string means no creation
  --tls-certificate-output-s3-uri TEXT
                                  S3 URI to write the TLS certificate
  --help                          Show this message and exit.

After:

chadchc@80a997306578 hyperpod-jumpstart-inference-template % hyp create hyp-jumpstart-endpoint --help
Usage: hyp create hyp-jumpstart-endpoint [OPTIONS]

Options:
  --version TEXT                  Schema version to use
  --debug BOOLEAN                 Enable debug mode
  --namespace TEXT                Kubernetes namespace
  --accept-eula BOOLEAN           Whether model terms of use have been accepted  [default: False]
  --metadata-name TEXT            Name of the jumpstart endpoint object
  --model-id TEXT                 Unique identifier of the model within the hub  [required]
  --model-version TEXT            Semantic version of the model to deploy (e.g. 1.0.0)
  --instance-type TEXT            EC2 instance type for the inference server  [required]
  --accelerator-partition-type TEXT
                                  MIG profile to use for GPU partitioning
  --accelerator-partition-validation TEXT
                                  Enable MIG validation for GPU partitioning. Default is true.  [default: True]
  --endpoint-name TEXT            Name of SageMaker endpoint; empty string means no creation
  --tls-certificate-output-s3-uri TEXT
                                  S3 URI to write the TLS certificate
  --help                          Show this message and exit.

How was this change tested?

Tested against a local inference operator setup, using this command:

chadchc@80a997306578 hyperpod-jumpstart-inference-template % hyp create hyp-jumpstart-endpoint \
  --version 1.1 \
  --model-id deepseek-llm-r1-distill-qwen-1-5b \
  --instance-type ml.p4d.24xlarge \
  --endpoint-name js-test \
  --accelerator-partition-type "mig-4g.20gb" \
  --accelerator-partition-validation true \
  --debug true \
  --tls-certificate-output-s3-uri s3://sagemaker-cc-helm-test-9a02e56c-tls-ae472dc0
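
For readers unfamiliar with the value passed to --accelerator-partition-type, the sketch below shows one way a profile string such as "mig-4g.20gb" could be split into its compute-slice and memory parts. It is only an illustration: the name shape "mig-<N>g.<M>gb" is assumed from the example above, and parse_mig_profile and MigProfile are hypothetical names, not part of the CLI or SDK. The validation added in this PR is not shown here and may well check more (for example, which profiles a given instance type supports).

```python
import re
from typing import NamedTuple

# Assumed name shape: "mig-<slices>g.<memory>gb", e.g. "mig-4g.20gb".
# Real profile names may include variants this pattern does not cover.
_MIG_PROFILE_RE = re.compile(r"^mig-(\d+)g\.(\d+)gb$")


class MigProfile(NamedTuple):
    """Hypothetical container for the two numbers encoded in the name."""
    compute_slices: int
    memory_gb: int


def parse_mig_profile(name: str) -> MigProfile:
    """Parse a profile string; raise ValueError if it does not match the assumed shape."""
    match = _MIG_PROFILE_RE.match(name.strip().lower())
    if match is None:
        raise ValueError(f"not a recognised MIG profile name: {name!r}")
    return MigProfile(int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    print(parse_mig_profile("mig-4g.20gb"))
```

A purely syntactic check like this can reject obvious typos early, but it cannot tell whether the requested partition actually exists on the target instance.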

Are unit tests added?

Yes

Are integration tests added?

No

Reviewer Guidelines

‼️ Merge Requirements: PRs with failing integration tests cannot be merged without justification.

One of the following must be true:

All automated PR checks pass
Failed tests include local run results/screenshots proving they work
Changes are documentation-only

--------- Co-authored-by: Chad Chiang <[email protected]>

piyushdaftary

Can you look into the unit test (UT) failures?

FAILED test/unit_tests/cli/test_inference.py::test_js_create_with_mig_profile - NameError: name 'mock_load_schema' is not defined
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create - AssertionError: Expected 'call_create_api' to be called once. Called 0 times.
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_missing_name_and_endpoint_name - AssertionError: Exception not raised
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_validation_logic_priority - AssertionError: Expected 'validate_mig_profile' to be called once. Called 0...
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_with_accelerator_partition_validation - AssertionError: Expected 'validate_mig_profile' to be called once. Called 0...
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_with_metadata - AssertionError: Expected 'call_create_api' to have been called once. Called...
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_without_accelerator_partition_validation - AssertionError: Expected 'validate_instance_type' to be called once. Called...
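
Judging only from the test names and assertion messages above, the tests seem to expect that create calls validate_mig_profile when partition validation is enabled and validate_instance_type when it is not. The sketch below illustrates that expectation with unittest.mock. The function run_validation and the method signatures are hypothetical stand-ins, not the SDK's code. A "Called 0 times" failure usually means the code under test returned or raised before reaching the mocked method, or that the patch target does not match the path the code actually imports from.

```python
from unittest.mock import MagicMock


def run_validation(endpoint, partition_type, partition_validation):
    """Hypothetical dispatch: check the MIG profile when a partition type is set
    and validation is enabled; otherwise fall back to the instance-type check."""
    if partition_type and partition_validation:
        endpoint.validate_mig_profile(partition_type)
    else:
        endpoint.validate_instance_type()


# With validation enabled, only the MIG check should run.
endpoint = MagicMock()
run_validation(endpoint, "mig-4g.20gb", True)
endpoint.validate_mig_profile.assert_called_once_with("mig-4g.20gb")
endpoint.validate_instance_type.assert_not_called()

# With validation disabled, only the instance-type check should run.
endpoint = MagicMock()
run_validation(endpoint, "mig-4g.20gb", False)
endpoint.validate_instance_type.assert_called_once()
endpoint.validate_mig_profile.assert_not_called()
```

The NameError in the first failure is a separate problem: the test body refers to mock_load_schema, which is neither a parameter of the test function nor defined in the module.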

chad119 and others added 2 commits November 21, 2025 02:24

Mig Profile with inference

82c8ebb

--------- Co-authored-by: Chad Chiang <[email protected]>

Add helm chart changes

7431fc8

shantanutrip requested a review from a team as a code owner November 21, 2025 02:31

shantanutrip temporarily deployed to manual-approval November 21, 2025 02:31 — with GitHub Actions Inactive

rvasahu-amazon approved these changes Nov 21, 2025

piyushdaftary approved these changes Nov 21, 2025

XuanCS approved these changes Nov 21, 2025

jam-jee approved these changes Nov 21, 2025

piyushdaftary suggested changes Nov 21, 2025
