Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Key Vault secret reference resolution should respect startup timeout allowance #516

Closed
amerjusupovic opened this issue Feb 10, 2024 · 1 comment · Fixed by #518 or #594
Closed
Assignees
Labels
bug Something isn't working

Comments

@amerjusupovic
Copy link
Member

amerjusupovic commented Feb 10, 2024

With the introduction of time-based retries on startup, the provider now continues attempting to connect to Azure App Configuration to load the initial configuration until the startup timeout elapses, or until the connection is successful.

Currently, Key Vault secret reference resolution in the provider does not respect the configured startup timeout allowance. If Key Vault is momentarily down during startup and secret references are used in the configuration then startup will fail if the default Key Vault retry count (3) is exceeded. This will surface as a KeyVaultReferenceException.

The provider should be updated to recognize transient Key Vault errors, such as momentary unavailability and retry as long as the startup timeout allows.

@jimmyca15 jimmyca15 changed the title Handle KeyVaultReferenceException when thrown during startup timeout Key Vault reference resolution should respect startup timeout allowance Feb 10, 2024
@jimmyca15 jimmyca15 added the bug Something isn't working label Feb 10, 2024
@jimmyca15 jimmyca15 changed the title Key Vault reference resolution should respect startup timeout allowance Key Vault secret reference resolution should respect startup timeout allowance Feb 10, 2024
@amerjusupovic amerjusupovic reopened this Sep 24, 2024
@amerjusupovic
Copy link
Member Author

The PR originally merged for this PR introduced an issue where Key Vault exceptions would trigger a failover in the provider, which can cause throttling in certain scenarios. It also didn't make sense in the context of geo-replication since the Key Vault reference would remain the same across replicas but would continue being retried. This PR is being reverted, and the original concern in this issue is being addressed by #589 to lower the impact of transient Key Vault connection errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment