feat: support Bedrock models beyond the Mantle API (e.g. Amazon Nova, native Converse API) #5230
Description
Summary
The current Bedrock inference provider (remote::bedrock) only supports models available through Bedrock's OpenAI-compatible Mantle API (bedrock-mantle.{region}.api.aws/v1). This limits it to a small subset of Bedrock models — primarily OpenAI GPT-OSS and Llama models.
Many popular Bedrock models are not available through the Mantle endpoint and can only be accessed via the native Bedrock Converse API (bedrock-runtime.{region}.amazonaws.com). Notable examples include:
- Amazon Nova (Nova Pro, Nova Lite, Nova Micro)
- Anthropic Claude (via Bedrock)
- Cohere Command models
- Mistral models on Bedrock
- Amazon Titan models
Current behavior
The provider uses OpenAIMixin and routes all requests through the Mantle OpenAI-compatible endpoint. Models that aren't exposed on that endpoint simply can't be used, and the provider has no fallback to the native Converse API.
The provider also only supports chat completions — embeddings raise NotImplementedError.
Proposed behavior
Add support for Bedrock's native Converse API as an alternative (or additional) client path. This would allow users to access the full catalog of Bedrock foundation models, not just those behind Mantle.
Possible approaches:
- Dual-mode provider: detect whether a model is available via Mantle or Converse and route accordingly
- Separate provider: add a remote::bedrock-converse provider alongside the existing Mantle-based one
- Replace with Converse: since Converse supports a superset of models, migrate the provider entirely (though this would lose the OpenAI-compatible shortcut for Mantle-supported models)
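The dual-mode option could be sketched as a thin routing layer in front of the two clients. Everything below is illustrative, not provider code: the helper name, the allow-list, and the model IDs are assumptions (in practice the Mantle set would be discovered from the endpoint's model listing rather than hard-coded):

```python
# Hypothetical sketch of dual-mode routing, not the actual remote::bedrock code.
# MANTLE_MODELS is an assumed allow-list of models exposed on the
# OpenAI-compatible Mantle endpoint; real code would discover this dynamically.
MANTLE_MODELS = {
    "openai.gpt-oss-120b-1:0",
    "meta.llama3-1-70b-instruct-v1:0",
}


def pick_client_path(model_id: str) -> str:
    """Route Mantle-supported models to the OpenAI-compatible endpoint
    and everything else to the native Converse API."""
    return "mantle" if model_id in MANTLE_MODELS else "converse"
```

This keeps the OpenAI-compatible fast path for models that have it, while any model ID outside the allow-list (e.g. a Nova or Titan model) falls through to Converse.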
Additional context
- The Mantle API uses Bearer token auth (AWS_BEARER_TOKEN_BEDROCK), while the native Converse API uses standard AWS credentials (access key / secret key / session token or IAM role)
- The Converse API supports features like tool use, streaming, and vision that would map well to Llama Stack's inference API
- This would significantly expand the set of models usable with Llama Stack on AWS