@shantanutrip shantanutrip commented Nov 21, 2025

What's changing and why?

  • CLI and SDK changes that add MIG (Multi-Instance GPU) profile support to `hyp create hyp-jumpstart-endpoint`

Before/After UX

Before:

chadchc@80a997306578 hyperpod-jumpstart-inference-template % hyp create hyp-jumpstart-endpoint --help
Usage: hyp create hyp-jumpstart-endpoint [OPTIONS]

Options:
  --version TEXT                  Schema version to use
  --debug BOOLEAN                 Enable debug mode
  --namespace TEXT                Kubernetes namespace
  --accept-eula BOOLEAN           Whether model terms of use have been accepted  [default: False]
  --metadata-name TEXT            Name of the jumpstart endpoint object
  --model-id TEXT                 Unique identifier of the model within the hub  [required]
  --model-version TEXT            Semantic version of the model to deploy (e.g. 1.0.0)
  --instance-type TEXT            EC2 instance type for the inference server  [required]
  --endpoint-name TEXT            Name of SageMaker endpoint; empty string means no creation
  --tls-certificate-output-s3-uri TEXT
                                  S3 URI to write the TLS certificate
  --help                          Show this message and exit.

After:

chadchc@80a997306578 hyperpod-jumpstart-inference-template % hyp create hyp-jumpstart-endpoint --help
Usage: hyp create hyp-jumpstart-endpoint [OPTIONS]

Options:
  --version TEXT                  Schema version to use
  --debug BOOLEAN                 Enable debug mode
  --namespace TEXT                Kubernetes namespace
  --accept-eula BOOLEAN           Whether model terms of use have been accepted  [default: False]
  --metadata-name TEXT            Name of the jumpstart endpoint object
  --model-id TEXT                 Unique identifier of the model within the hub  [required]
  --model-version TEXT            Semantic version of the model to deploy (e.g. 1.0.0)
  --instance-type TEXT            EC2 instance type for the inference server  [required]
  --accelerator-partition-type TEXT
                                  MIG profile to use for GPU partitioning
  --accelerator-partition-validation TEXT
                                  Enable MIG validation for GPU partitioning. Default is true.  [default: True]
  --endpoint-name TEXT            Name of SageMaker endpoint; empty string means no creation
  --tls-certificate-output-s3-uri TEXT
                                  S3 URI to write the TLS certificate
  --help                          Show this message and exit.

How was this change tested?

  1. Tested against a locally set up inference operator
  2. Ran the following command:
chadchc@80a997306578 hyperpod-jumpstart-inference-template % hyp create hyp-jumpstart-endpoint \
  --version 1.1 \
  --model-id deepseek-llm-r1-distill-qwen-1-5b \
  --instance-type ml.p4d.24xlarge \
  --endpoint-name js-test \
  --accelerator-partition-type "mig-4g.20gb" \
  --accelerator-partition-validation true \
  --debug true \
  --tls-certificate-output-s3-uri s3://sagemaker-cc-helm-test-9a02e56c-tls-ae472dc0
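
The value passed to `--accelerator-partition-type` in the command above follows the common MIG profile naming pattern `mig-<compute slices>g.<memory>gb`. As a rough sketch of what a purely syntactic check on that string could look like: `parse_mig_profile` and `MigProfile` are hypothetical names, not part of this PR, and the SDK's own validation may do considerably more (for example, checking the profile against the instance type). Only the simple form is handled here; profile names with extra suffixes are rejected.

```python
import re
from typing import NamedTuple


class MigProfile(NamedTuple):
    """Parsed pieces of a simple MIG profile name (hypothetical helper type)."""
    compute_slices: int
    memory_gb: int


# Assumed pattern for names such as "mig-4g.20gb"; not taken from the SDK.
_MIG_PATTERN = re.compile(r"mig-(\d+)g\.(\d+)gb")


def parse_mig_profile(profile: str) -> MigProfile:
    """Split a profile name into slice count and memory size, or raise ValueError."""
    match = _MIG_PATTERN.fullmatch(profile)
    if match is None:
        raise ValueError(f"Not a recognized MIG profile name: {profile!r}")
    return MigProfile(int(match.group(1)), int(match.group(2)))
```

A check like this would only catch typos early; it says nothing about whether the cluster actually exposes the requested partition.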

Are unit tests added?

Yes

Are integration tests added?

No

Reviewer Guidelines

‼️ Merge Requirements: PRs with failing integration tests cannot be merged without justification.

One of the following must be true:

  • All automated PR checks pass
  • Failed tests include local run results/screenshots proving they work
  • Changes are documentation-only

chad119 and others added 2 commits November 21, 2025 02:24
@shantanutrip shantanutrip requested a review from a team as a code owner November 21, 2025 02:31

@piyushdaftary piyushdaftary left a comment

Can you look into the unit test (UT) failures?

FAILED test/unit_tests/cli/test_inference.py::test_js_create_with_mig_profile - NameError: name 'mock_load_schema' is not defined
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create - AssertionError: Expected 'call_create_api' to be called once. Called 0 times.
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_missing_name_and_endpoint_name - AssertionError: Exception not raised
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_validation_logic_priority - AssertionError: Expected 'validate_mig_profile' to be called once. Called 0...
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_with_accelerator_partition_validation - AssertionError: Expected 'validate_mig_profile' to be called once. Called 0...
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_with_metadata - AssertionError: Expected 'call_create_api' to have been called once. Called...
FAILED test/unit_tests/inference/test_hp_jumpstart_endpoint.py::TestHPJumpStartEndpoint::test_create_without_accelerator_partition_validation - AssertionError: Expected 'validate_instance_type' to be called once. Called...
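
Without the test source at hand, one plausible reading of these failures is that the `NameError` comes from a test body referring to `mock_load_schema` with no matching `@patch` decorator or parameter, and that the "Called 0 times" assertions mean the code path under test never reached the patched function. The sketch below shows the general `unittest.mock` pattern with a stand-in class. `Endpoint`, `validate`, and `create` are illustrative names only, not the real `HPJumpStartEndpoint` API.

```python
from unittest.mock import patch


class Endpoint:
    """Stand-in for an SDK object; not the real HPJumpStartEndpoint."""

    def validate(self, profile):
        raise RuntimeError("would contact the cluster")

    def create(self, profile, validate=True):
        # Validation runs only when requested, mirroring an on/off flag.
        if validate:
            self.validate(profile)
        return "created"


@patch.object(Endpoint, "validate")
def check_validation_called(mock_validate):
    # The decorator injects the mock as an argument; using a mock name that
    # is neither a parameter nor otherwise defined raises NameError.
    result = Endpoint().create("mig-4g.20gb")
    mock_validate.assert_called_once_with("mig-4g.20gb")
    return result


@patch.object(Endpoint, "validate")
def check_validation_skipped(mock_validate):
    # With validation disabled, the patched method must never be reached.
    Endpoint().create("mig-4g.20gb", validate=False)
    mock_validate.assert_not_called()
```

If a call-count assertion fails in a setup like this, it is usually worth checking whether the patch target matches the name the code under test actually looks up, and whether an earlier branch returns before the call.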
