diff --git a/docs/docs/providers/openai_responses_limitations.mdx b/docs/docs/providers/openai_responses_limitations.mdx
index 0755ae3e59..85b276c7af 100644
--- a/docs/docs/providers/openai_responses_limitations.mdx
+++ b/docs/docs/providers/openai_responses_limitations.mdx
@@ -98,16 +98,6 @@ The `reasoning` object in the output of Responses works for inference providers
 
 ---
 
-### Service Tier
-
-**Status:** Not Implemented
-
-**Issue:** [#3550](https://github.com/llamastack/llama-stack/issues/3550)
-
-Responses has a field `service_tier` that can be used to prioritize access to inference resources. Not all inference providers have such a concept, but Llama Stack pass through this value for those providers that do. Currently it does not.
-
----
-
 ### Incomplete Details
 
 **Status:** Not Implemented