Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add failover request #584

Closed
wants to merge 2 commits into from
Closed

Conversation

zhiyuanliang-ms
Copy link
Contributor

Why this PR?

#583

Visible change

Correlation context will have two more tags: UsesReplicaDiscovery and FailoverRequest

@@ -1077,6 +1080,8 @@ private void UpdateCacheExpirationTime(KeyValueWatcher changeWatcher)
}

previousEndpoint = currentEndpoint;

_requestTracingOptions.IsFailoverRequest = true;
Copy link
Contributor

@RichardChen820 RichardChen820 Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consider this concrete scenario, customer passes an endpoint (static) and we discover a replica endpoint (dynamic)

  1. At a time point, the static one starts to fail, failover to the dynamic one, current code can handle the dynamic client as a failover, add failover tag in correlation context.
  2. The static client keeps failing and refresh trigger before its backoff time due, I think the dynamic client still need to have the failover tag in correlation context, but the current code will treat it as not a failover

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we need to consider the loadbalance feature, the load balance requests should not be considered as failover

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't consider loadbalancing, then we can have a very simple way to decide whether the request is from a failover: we can just check whether the current client is the first client from the list returned by ConfigurationClientManager.GetClients. But this way will break when load balancing is enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RichardChen820 I sent an email to discuss this.

@zhiyuanliang-ms zhiyuanliang-ms force-pushed the zhiyuanliang/failover-telemetry branch from 4ea24d5 to f71d226 Compare July 25, 2024 13:12
/// <summary>
/// Flag to indicate whether replica discovery is enabled.
/// </summary>
public bool IsReplicaDiscoveryEnabled { get; set; } = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's quite easy to know the count of replica discovered by the srv dns lookup, how about set DiscoveredReplicaCount=x rather than using a boolean to indicate enabled or not.

@@ -223,6 +230,11 @@ public async Task RefreshAsync(CancellationToken cancellationToken)
}
);

if (clients.Count() < totalClientCount && _requestTracingEnabled && _requestTracingOptions != null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing the client count can't tell failover happen or not, it just tell some client's backoff time is not due.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I planned to move this PR on preview branch where we have load balancing implementated.

@zhiyuanliang-ms
Copy link
Contributor Author

Move this PR on preview branch so that we can think more clearly about how we should consider failover when load balancing is introduced.

@zhiyuanliang-ms zhiyuanliang-ms deleted the zhiyuanliang/failover-telemetry branch July 26, 2024 07:51
@zhiyuanliang-ms
Copy link
Contributor Author

Move to #586

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants