-
Notifications
You must be signed in to change notification settings - Fork 57
[WIP][AQUA][GPU Shape Recommendation] Support for Service Managed Models #1252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
) | ||
if request.deployment_config: | ||
shape_recommendation_report = ( | ||
ShapeRecommendationReport.from_deployment_config( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we check if a deployment_config was successfully obtained, and if so we will generate the report immediately.
ads/aqua/shaperecommend/constants.py
Outdated
"trust_remote_code": "--trust-remote-code" | ||
} | ||
|
||
DEFAULT_WEIGHT_SIZE = "bfloat16" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we already have such variable, three lines below?
ads/aqua/shaperecommend/estimator.py
Outdated
params.append(VLLM_PARAMS["in_flight_quant"]) | ||
|
||
# add trust-remote-code if custom modules are specified | ||
if c.trust_remote_code: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use more meaningful name for the config variable?
c
->llm_config
?
) | ||
|
||
@staticmethod | ||
def get_bool(raw, key, default=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In ads/coomon/utils
we already have - parse_bool
function, maybe this function can be reused?
tie_word_embeddings = LLMConfig.get_bool(raw, "tie_word_embeddings", True) | ||
|
||
trust_remote_code = "auto_map" in raw # trust-remote-code is always needed when this key is present | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add more description for the section below? Can the error be more specific?
if None in [
num_hidden_layers,
hidden_size,
vocab_size,
num_attention_heads,
head_dim,
]:
raise ValueError("Missing required value in model config.")
ads/aqua/shaperecommend/recommend.py
Outdated
try: | ||
shapes = self.valid_compute_shapes(compartment_id=request.compartment_id) | ||
|
||
ds_model = self._validate_model_ocid(request.model_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that this method also retrieves model details, but the function name doesn’t suggest that it returns the model config as well. I’d recommend separating the retrieval of model details from this function to make its purpose clearer and more consistent.
COMPARTMENT_OCID, description="The OCID of user's compartment" | ||
) | ||
|
||
deployment_config: Optional[AquaDeploymentConfig] = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to do: default=None
) | ||
|
||
else: | ||
data = self._get_model_config(ds_model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why would we need to repeat same lines again?
{ | ||
"deployment_params": { | ||
"quantization": "mxfp4", | ||
"max_model_len": null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be confusing, that in one case we say that max_model_len is null, but in params it is 130000
ads/aqua/shaperecommend/recommend.py
Outdated
if model: | ||
model_size = str(model.total_model_gb) | ||
else: | ||
model_size = "Using Pre-Defined Config" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should use -
instead? Should we add total model gb param in the service configs?
quantization: Optional[str] = Field( | ||
None, description="Type of quantization (e.g. 4bit)." | ||
) | ||
max_model_len: Optional[int] = Field(None, description="Maximum length of input sequence.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merge conflict?
) | ||
|
||
@classmethod | ||
def from_deployment_config(cls, deployment_config: AquaDeploymentConfig, model_name: str, valid_shapes: List[ComputeShapeSummary]) -> "ShapeRecommendationReport": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the formatter, to format the code.
AquaValueError | ||
If model type is unsupported by tool (no recommendation report generated) | ||
""" | ||
deployment_config = self.get_deployment_config(model_id=kwargs.get("model_id")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the description and add acceptable params for kwargs?
Could you also add telemetry for the shape recommender, we need to know if this feature is used by customers. |
shapes = self.valid_compute_shapes(compartment_id=request.compartment_id) | ||
|
||
ds_model = self._validate_model_ocid(request.model_id) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the description for this method? Specifically input params.
), | ||
], | ||
) | ||
def test_which_shapes_valid_from_file( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test will potentially fail every time we update the GPU index json file.
5965fbe
to
bbc0007
Compare
For AQUA Service models, pre-defined configuration files already exist and can be directly used. This PR adds support for service models via obtaining their pre-defined config files, parsing this metadata, and formatting a response consistent with the current output of the GPU Shape Recommendation Tool
Returns

Returns