
AzureOpenAiChatModel Retry Mechanism Not Working. #2055

@shelltea


Bug description
The AzureOpenAiChatModel does not appear to apply the retry mechanism. According to the documentation, this feature should already be supported, but after configuring the following properties, no retries occur:

spring.ai.retry.on-client-errors=true
spring.ai.retry.on-http-codes=400,408,429,500,502,503,504

Environment

  • Spring AI version: 1.0.0-M5
  • Java version: 21

Steps to reproduce

  1. Configure the retry settings in your application properties file as follows:
    spring.ai.retry.on-client-errors=true
    spring.ai.retry.on-http-codes=400,408,429,500,502,503,504
  2. Invoke the AzureOpenAiChatModel under conditions that would trigger a retry (e.g., by forcing an HTTP 400 or other configured codes).
  3. Observe that no retry occurs.

Expected behavior
The AzureOpenAiChatModel should attempt to retry the request according to the specified retry configuration parameters.

Error Logs

com.azure.core.exception.HttpResponseException: Status code 400, "{"error":{"inner_error":{"code":"ResponsibleAIPolicyViolation","content_filter_results":{"sexual":{"filtered":true,"severity":"high"},"violence":{"filtered":false,"severity":"safe"},"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"}}},"code":"content_filter","message":"The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: \r\nhttps://go.microsoft.com/fwlink/?linkid=2198766.","param":"prompt","type":null}}"

	at com.azure.core.implementation.http.rest.RestProxyBase.instantiateUnexpectedException(RestProxyBase.java:388)
	at com.azure.core.implementation.http.rest.SyncRestProxy.ensureExpectedStatus(SyncRestProxy.java:133)
	at com.azure.core.implementation.http.rest.SyncRestProxy.handleRestReturnType(SyncRestProxy.java:211)
	at com.azure.core.implementation.http.rest.SyncRestProxy.invoke(SyncRestProxy.java:86)
	at com.azure.core.implementation.http.rest.RestProxyBase.invoke(RestProxyBase.java:124)
	at com.azure.core.http.rest.RestProxy.invoke(RestProxy.java:95)
	at jdk.proxy2/jdk.proxy2.$Proxy196.getChatCompletionsSync(Unknown Source)
	at com.azure.ai.openai.implementation.OpenAIClientImpl.getChatCompletionsWithResponse(OpenAIClientImpl.java:1900)
	at com.azure.ai.openai.OpenAIClient.getChatCompletionsWithResponse(OpenAIClient.java:350)
	at com.azure.ai.openai.OpenAIClient.getChatCompletions(OpenAIClient.java:760)
	at org.springframework.ai.azure.openai.AzureOpenAiChatModel.lambda$internalCall$1(AzureOpenAiChatModel.java:244)
	at io.micrometer.observation.Observation.observe(Observation.java:565)
	at org.springframework.ai.azure.openai.AzureOpenAiChatModel.internalCall(AzureOpenAiChatModel.java:240)
	at org.springframework.ai.azure.openai.AzureOpenAiChatModel.call(AzureOpenAiChatModel.java:226)
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultChatClientRequestSpec$1.aroundCall(DefaultChatClient.java:675)
	at org.springframework.ai.chat.client.advisor.DefaultAroundAdvisorChain.lambda$nextAroundCall$1(DefaultAroundAdvisorChain.java:98)
	at io.micrometer.observation.Observation.observe(Observation.java:565)
	at org.springframework.ai.chat.client.advisor.DefaultAroundAdvisorChain.nextAroundCall(DefaultAroundAdvisorChain.java:98)
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.doGetChatResponse(DefaultChatClient.java:488)
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.lambda$doGetObservableChatResponse$1(DefaultChatClient.java:477)
	at io.micrometer.observation.Observation.observe(Observation.java:565)
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.doGetObservableChatResponse(DefaultChatClient.java:477)
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.doSingleWithBeanOutputConverter(DefaultChatClient.java:451)
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.entity(DefaultChatClient.java:446)

Activity

added this to the 1.0.0-RC1 milestone on Apr 21, 2025

markpollack commented on Apr 30, 2025

@markpollack
Member

I believe we didn't add Spring Retry around the Azure OpenAI SDK because that SDK already has a built-in retry feature. I can't seem to find the docs for it. Could someone confirm?


markpollack commented on May 6, 2025

@markpollack
Member

@mkheck any insight?


mkheck commented on May 6, 2025

@mkheck
Contributor

@markpollack I'll dig into it and let you know. If there is no Azure-specific mechanism, I should be able to wrap it with Spring Retry. More news shortly.
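For reference, the "wrap it with Spring Retry" idea amounts to something like the following. This is a minimal plain-Java sketch of a retry loop with exponential backoff, not Spring Retry's actual RetryTemplate API; the method names, attempt counts, and delays are illustrative:

```java
import java.util.concurrent.Callable;

// Minimal sketch of what wrapping a model call in a retry loop would do.
// Spring Retry's RetryTemplate provides this generically; everything here is illustrative.
public class RetrySketch {

    // Retries the call up to maxAttempts times, doubling the delay after each failure.
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long baseDelayMs) throws Exception {
        Exception last = null;
        long delay = baseDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulate a call that fails twice (like two HTTP 429s) then succeeds.
        int[] failures = {2};
        String result = callWithRetry(() -> {
            if (failures[0]-- > 0) {
                throw new RuntimeException("simulated 429");
            }
            return "ok";
        }, 3, 10);
        System.out.println(result); // prints "ok"
    }
}
```

A real integration would also need to decide which exceptions are retryable (the content-filter 400 in the original report arguably is not), which is what the spring.ai.retry.on-http-codes property is meant to control.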


mkheck commented on May 6, 2025

@mkheck
Contributor

@markpollack Please go ahead and assign it to me. I'll look at the ones you've tagged me with for review and work through them.

modified the milestones: 1.0.0-RC1, 1.0.x on May 13, 2025

iAMSagar44 commented on May 13, 2025

@iAMSagar44
Contributor

Hi @markpollack / @mkheck,

There is already a retry feature in azure-sdk-for-java that retries transient errors; I think the default retry count is 3.
Please check RetryPolicy and ExponentialBackoff in the com.azure.core.http.policy package.
I believe these are added to the HttpPipelinePolicy array in the createHttpPipeline() method of OpenAIClientBuilder.
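The delay schedule an ExponentialBackoff-style strategy produces can be sketched as follows. Note this is an illustration, not the SDK's actual implementation: the real com.azure.core.http.policy.ExponentialBackoff also applies jitter, and the parameters used here (800 ms base, 8 s cap) are assumed defaults for the sake of the example:

```java
// Sketch of the delay schedule an exponential-backoff retry strategy produces.
// Real implementations (like azure-core's ExponentialBackoff) add jitter on top;
// the base/cap values here are illustrative, not guaranteed SDK defaults.
public class BackoffSketch {

    // Delay before retry attempt N: base * 2^N, capped at maxMs.
    static long delayMs(int retryAttempt, long baseMs, long maxMs) {
        long delay = baseMs * (1L << retryAttempt); // base * 2^attempt
        return Math.min(delay, maxMs);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 3; attempt++) {
            System.out.println("retry " + attempt + " after ~" + delayMs(attempt, 800, 8000) + " ms");
        }
    }
}
```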

Here is an example of the retry occurring for a 429 error.

1st Request -

2025-05-13T17:35:59.381+10:00 INFO 10148 --- [docs-ai-assistant] [oundedElastic-8] c.a.a.o.i.O.getChatCompletions : {"az.sdk.message":"HTTP request","method":"POST","url":"https://{Azure_OpenAI_Endpoint}//openai/deployments/gpt-4o/chat/completions?api-version=2025-01-01-preview","tryCount":1,"content-length":26643}

1st Response - Failed -

2025-05-13T17:35:59.428+10:00 INFO 10148 --- [docs-ai-assistant] [-http-kqueue-13] c.a.a.o.i.O.getChatCompletions : {"az.sdk.message":"HTTP response","statusCode":429,"url":"https://{Azure_OpenAI_Endpoint}//openai/deployments/gpt-4o/chat/completions?api-version=2025-01-01-preview","durationMs":46,"content-length":440,"content-length":440,"body":"{"error":{"code":"429","message": "Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2025-01-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 51 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit. For Free Account customers, upgrade to Pay as you Go here: https://aka.ms/429TrialUpgrade.\"}}"}

Retry attempt 1 - 2nd Request -

2025-05-13T17:36:50.434+10:00 INFO 10148 --- [docs-ai-assistant] [ parallel-5] c.a.a.o.i.O.getChatCompletions : {"az.sdk.message":"HTTP request","method":"POST","url":"https://{Azure_OpenAI_Endpoint}//openai/deployments/gpt-4o/chat/completions?api-version=2025-01-01-preview","tryCount":2,"content-length":26643}

Success Response -

2025-05-13T17:36:51.354+10:00 INFO 10148 --- [docs-ai-assistant] [-http-kqueue-15] c.a.a.o.i.O.getChatCompletions : {"az.sdk.message":"HTTP response","statusCode":200,"url":"https://{Azure_OpenAI_Endpoint}//openai/deployments/gpt-4o/chat/completions?api-version=2025-01-01-preview","durationMs":921}

I did not have any custom retry mechanism in my code.
You can see these logs in your application by setting the environment variable export AZURE_HTTP_LOG_DETAIL_LEVEL=BODY when using Azure OpenAI chat models, embedding models, or AI Search in your Spring AI application.

Hope this analysis helps.


        AzureOpenAiChatModel Retry Mechanism Not Working. · Issue #2055 · spring-projects/spring-ai