Commit 3f5c5ac
feat(sagemaker): add support for serverless inference endpoints (#35557)
Implements SageMaker Serverless Inference endpoints as requested in issue #23148.
- Add ServerlessProductionVariantProps interface with maxConcurrency, memorySizeInMB, and provisionedConcurrency
- Extend EndpointConfig to support serverless variants alongside existing instance variants
- Add comprehensive validation for serverless configuration parameters
- Enforce mutual exclusivity between instance and serverless variants
- Add CloudFormation template generation for ServerlessConfig properties
- Include extensive test coverage for validation scenarios and error cases
### Issue # 23148
Closes #23148.
### Reason for this change
AWS SageMaker Serverless Inference is not supported in the CDK SageMaker L2 constructs. Users can only configure instance-based endpoints and have no access to the serverless option, which is more cost-effective for intermittent or unpredictable traffic patterns.
This feature was explicitly planned in the original [SageMaker Endpoint L2 construct RFC](https://github.com/aws/aws-cdk-rfcs/blob/master/text/0431-sagemaker-l2-endpoint.md#feature-additions) with Instance-prefixed classes designed to make room for Serverless-prefixed analogs.
### Description of changes
Implements AWS SageMaker Serverless Inference support in CDK SageMaker L2 constructs, enabling cost-effective serverless endpoints for intermittent workloads:
- **New `ServerlessProductionVariantProps` interface** extending `ProductionVariantProps` with AWS-compliant serverless properties:
  - `maxConcurrency`: 1-200 range (required)
  - `memorySizeInMB`: 1024-6144 MB in 1 GB (1024 MB) increments (required)
  - `provisionedConcurrency`: 1-200 range, optional, must be ≤ `maxConcurrency`
- **New `addServerlessProductionVariant()` method** with comprehensive input validation
- **Extended `EndpointConfigProps`** with optional `serverlessProductionVariant` property
- **Mutual exclusivity enforcement** between instance and serverless variants per AWS constraints
- **Single serverless variant limit** per endpoint configuration (AWS limitation)
- **Comprehensive synthesis-time validation** with clear, actionable error messages
- **CloudFormation integration** leveraging existing L1 construct `ServerlessConfig` support
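
The validation rules listed above can be sketched as a plain function. This is a minimal illustration, not the code added by this PR: the `ServerlessVariantSketch` interface and `validateServerlessVariant` function are hypothetical names, and the sketch ignores unresolved CDK tokens, which a real construct would probably need to skip.

```typescript
// Hypothetical stand-in for the serverless fields described above;
// this is not the actual ServerlessProductionVariantProps interface.
interface ServerlessVariantSketch {
  maxConcurrency: number;
  memorySizeInMB: number;
  provisionedConcurrency?: number;
}

// Returns every violated rule instead of stopping at the first one,
// so a caller can report all problems in a single error message.
function validateServerlessVariant(v: ServerlessVariantSketch): string[] {
  const errors: string[] = [];
  const inRange = (n: number, lo: number, hi: number): boolean =>
    Number.isInteger(n) && n >= lo && n <= hi;

  if (!inRange(v.maxConcurrency, 1, 200)) {
    errors.push(`maxConcurrency must be an integer between 1 and 200, got ${v.maxConcurrency}`);
  }

  // 1024-6144 MB in 1 GB steps: 1024, 2048, 3072, 4096, 5120, 6144.
  if (!inRange(v.memorySizeInMB, 1024, 6144) || v.memorySizeInMB % 1024 !== 0) {
    errors.push(`memorySizeInMB must be a multiple of 1024 between 1024 and 6144, got ${v.memorySizeInMB}`);
  }

  if (v.provisionedConcurrency !== undefined) {
    if (!inRange(v.provisionedConcurrency, 1, 200)) {
      errors.push(`provisionedConcurrency must be an integer between 1 and 200, got ${v.provisionedConcurrency}`);
    } else if (v.provisionedConcurrency > v.maxConcurrency) {
      errors.push(`provisionedConcurrency (${v.provisionedConcurrency}) must not exceed maxConcurrency (${v.maxConcurrency})`);
    }
  }

  return errors;
}
```

Collecting all errors rather than throwing on the first is one possible design; it fits the stated goal of clear, actionable messages, but the PR description does not say which approach the implementation takes.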
**Usage Example**:
```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';
declare const model: sagemaker.IModel;
// Create serverless endpoint configuration
const endpointConfig = new sagemaker.EndpointConfig(this, 'ServerlessEndpointConfig', {
  serverlessProductionVariant: {
    model: model,
    variantName: 'serverlessVariant',
    maxConcurrency: 10,
    memorySizeInMB: 2048,
    provisionedConcurrency: 5, // optional
  },
});
```
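
To illustrate the mutual exclusivity and single-variant rules described in this PR, here is a small self-contained sketch. The `VariantRegistrySketch` class and its error messages are hypothetical; it only mimics the two constraints and is not the `EndpointConfig` implementation.

```typescript
// Hypothetical registry that mimics the two constraints from the description:
// instance and serverless variants cannot be mixed, and at most one
// serverless variant is allowed per endpoint configuration.
class VariantRegistrySketch {
  private readonly instanceVariants: string[] = [];
  private serverlessVariant?: string;

  addInstanceVariant(name: string): void {
    if (this.serverlessVariant !== undefined) {
      throw new Error(
        `Cannot add instance variant '${name}': a serverless variant is already configured`,
      );
    }
    this.instanceVariants.push(name);
  }

  addServerlessVariant(name: string): void {
    if (this.instanceVariants.length > 0) {
      throw new Error(
        `Cannot add serverless variant '${name}': instance variants are already configured`,
      );
    }
    if (this.serverlessVariant !== undefined) {
      throw new Error('Only one serverless variant is allowed per endpoint configuration');
    }
    this.serverlessVariant = name;
  }
}
```

Checking the constraint at the moment a variant is added, as this sketch does, surfaces the conflict close to the offending call. The PR describes its checks as synthesis-time validation, so the actual timing in the construct may differ.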
### Describe any new or updated permissions being added
N/A - No new IAM permissions required. Leverages existing SageMaker model and endpoint permissions.
### Description of how you validated changes
- **Unit tests**: Added 12 comprehensive serverless variant tests covering all validation scenarios:
  - Memory size validation (1024-6144 MB in 1 GB increments)
  - Concurrency range validation (1-200 for both max and provisioned)
  - Mutual exclusivity enforcement between instance and serverless variants
  - Single serverless variant limit per AWS constraints
  - Cross-environment model compatibility validation
  - Error condition testing with clear error messages
  - CloudFormation template generation verification
- **Integration tests**: Extended existing integration test with serverless endpoint configuration, verified CloudFormation template generation with correct `ServerlessConfig` properties:
```yaml
ServerlessEndpointConfig:
  Type: AWS::SageMaker::EndpointConfig
  Properties:
    ProductionVariants:
      - ServerlessConfig:
          MaxConcurrency: 10
          MemorySizeInMB: 2048
          ProvisionedConcurrency: 5
        VariantName: serverlessVariant
```
- **Comprehensive testing results**: 63/63 unit tests pass (100% success rate), 4/4 integration tests pass, no regressions detected across 16,024+ CDK tests
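
The mapping from the camelCase construct props to the PascalCase `ServerlessConfig` block in the template excerpt can be sketched as follows. The `renderServerlessVariant` function and both interfaces are hypothetical and cover only the fields visible in the excerpt; the real construct delegates rendering to the L1 resource and emits additional properties.

```typescript
// Hypothetical input shape, limited to the fields shown in the usage example.
interface ServerlessVariantInput {
  variantName: string;
  maxConcurrency: number;
  memorySizeInMB: number;
  provisionedConcurrency?: number;
}

// Hypothetical output shape mirroring the template excerpt.
interface RenderedServerlessVariant {
  ServerlessConfig: {
    MaxConcurrency: number;
    MemorySizeInMB: number;
    ProvisionedConcurrency?: number;
  };
  VariantName: string;
}

function renderServerlessVariant(v: ServerlessVariantInput): RenderedServerlessVariant {
  return {
    ServerlessConfig: {
      MaxConcurrency: v.maxConcurrency,
      MemorySizeInMB: v.memorySizeInMB,
      // Leave the key out entirely when the optional value is not set,
      // so the rendered object carries no `undefined` entry.
      ...(v.provisionedConcurrency !== undefined
        ? { ProvisionedConcurrency: v.provisionedConcurrency }
        : {}),
    },
    VariantName: v.variantName,
  };
}

// Values taken from the usage example in this PR description.
const rendered = renderServerlessVariant({
  variantName: 'serverlessVariant',
  maxConcurrency: 10,
  memorySizeInMB: 2048,
  provisionedConcurrency: 5,
});
console.log(JSON.stringify(rendered, null, 2));
```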
### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)
----
*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

1 parent 60096ac commit 3f5c5ac
File tree
16 files changed (+34385, -1375 lines changed)
- packages/@aws-cdk/aws-sagemaker-alpha
  - lib
  - test
    - integ.endpoint-config.js.snapshot
      - asset.98e27853307092de1d03c86e89a5ead7aab9f8ea8f6722e4f113f04f34a329fd
      - asset.ca235e6258b11c240506ff06f79037eca461b8d0d9464a947a386d38d8163515.bundle
Lines changed: 188 additions & 5 deletions