Change PickFirstLeafLoadBalancer to only have 1 subchannel at a time #11520
Sending what I have. I'll need to look at this more later.
core/src/main/java/io/grpc/internal/PickFirstLeafLoadBalancer.java
@@ -458,7 +538,8 @@ private SubchannelData createNewSubchannel(SocketAddress addr, Attributes attrs)
   }

   private boolean isPassComplete() {
-    if (addressIndex.isValid() || subchannels.size() < addressIndex.size()) {
+    if ((!serializingRetries && addressIndex.isValid())
Why do we want to ignore whether the index is valid?
Because we reset it for subchannel retry.
This is still nagging at me. I feel like either we never need this condition or we need to replace it with something else for serializingRetries. I agree that we need this function to return true even though we've reset the index (for ++numTf).
You're right, there would never be a time where that was true but the subchannels.size() < addressIndex.size() was false. We do have the firstPass variable, so I used that where this is called to skip the comparisons on the repeated checks once we've seen the end.
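The redundancy argument in the reply above can be checked mechanically. The following is a minimal sketch, not the gRPC implementation: it models only the early-exit guard of isPassComplete() with plain values, and the class and method names are hypothetical. The invariant it assumes (a valid index implies subchannels.size() < addressIndex.size()) is taken from the review comment and has not been verified against the real code.

```java
/** Toy model of the early-exit guard only; not the gRPC implementation. */
final class PassCompleteGuardSketch {

  /** Guard as written before the change: "not complete yet" while either condition holds. */
  static boolean notYetCompleteOriginal(boolean indexValid, int subchannels, int addresses) {
    return indexValid || subchannels < addresses;
  }

  /** Guard with the index check dropped. */
  static boolean notYetCompleteSimplified(int subchannels, int addresses) {
    return subchannels < addresses;
  }

  /**
   * Assumed invariant (from the review comment, not verified against the real code):
   * a valid index implies subchannels < addresses.
   */
  static boolean satisfiesInvariant(boolean indexValid, int subchannels, int addresses) {
    return !indexValid || subchannels < addresses;
  }

  /** Exhaustively compares both guards over every small state that satisfies the invariant. */
  static boolean guardsAgreeUnderInvariant(int maxAddresses) {
    for (int addresses = 0; addresses <= maxAddresses; addresses++) {
      for (int subchannels = 0; subchannels <= addresses; subchannels++) {
        for (boolean indexValid : new boolean[] {false, true}) {
          if (!satisfiesInvariant(indexValid, subchannels, addresses)) {
            continue; // state ruled out by the assumed invariant
          }
          if (notYetCompleteOriginal(indexValid, subchannels, addresses)
              != notYetCompleteSimplified(subchannels, addresses)) {
            return false;
          }
        }
      }
    }
    return true;
  }
}
```

If the invariant does not hold (for example a valid index with subchannels == addresses), the two guards differ, which is why the index check can only be dropped where the invariant is known to be true.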
289ca19 to df8e78c
…if environment variable GRPC_SERIALIZE_RETRIES == true. Cache serializingRetries value so that it doesn't have to look up the flag every time. Clear the correct task when READY in processSubchannelState and move the logic to cancelScheduledTasks. Cleanup based on PR review; remove unneeded checks for shutdown.
d15a3ee to 3f910b0
Ready for re-review
Sending what I have. This seems much less bug-prone.
After the first failure for an address, I don't see how we start a second connection to the same address (after the backoff).
…t is disabled. Remove an extra index.increment in LeafLB. Fix spelling, remove unneeded additions.
Looks good other than the nagging !serializingRetries && addressIndex.isValid()
     }
   }

-  if (isPassComplete()) {
+  if (!firstPass || isPassComplete()) {
Let's at least add a comment that the !firstPass is an optimization.
Just removed it. Looping on a boolean comparison hopefully won't take too long.
…rpc#11520)
* Change PickFirstLeafLoadBalancer to only have 1 subchannel at a time if environment variable GRPC_SERIALIZE_RETRIES == true. Cache serializingRetries value so that it doesn't have to look up the flag every time. Clear the correct task when READY in processSubchannelState and move the logic to cancelScheduledTasks. Cleanup based on PR review; remove unneeded checks for shutdown.
* Fix previously broken tests
* Shutdown previous subchannel when run off end of index.
* Provide option to disable subchannel retries to let PFLeafLB take control of retries.
* InternalSubchannel internally goes to IDLE when sees TF when reconnect is disabled. Remove an extra index.increment in LeafLB
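The behavior listed in the commit message above (one subchannel at a time, shutting down the previous subchannel when running off the end of the index) can be pictured with a toy model. This is only a sketch under stated assumptions, not the PickFirstLeafLoadBalancer code: the class name, the Predicate standing in for a connection attempt, and the string log are all hypothetical, and real backoff timers and connectivity-state callbacks are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model, not gRPC code: tries addresses in order, holding at most one "subchannel". */
final class SerializedPickFirstSketch {
  private final List<String> addresses;
  private final Predicate<String> connects; // hypothetical stand-in for a connection attempt
  private int index = 0;
  private String current; // the single live "subchannel", or null
  private int failedPasses = 0;
  final List<String> log = new ArrayList<>();

  SerializedPickFirstSketch(List<String> addresses, Predicate<String> connects) {
    this.addresses = addresses;
    this.connects = connects;
  }

  /** Runs one pass; returns the connected address, or null if every address failed. */
  String runPass() {
    while (index < addresses.size()) {
      shutdownCurrent(); // never keep the previous subchannel around
      current = addresses.get(index);
      log.add("connect " + current);
      if (connects.test(current)) {
        return current;
      }
      index++;
    }
    // Ran off the end of the index: drop the last subchannel and reset for the next pass.
    shutdownCurrent();
    index = 0;
    failedPasses++;
    log.add("pass failed");
    return null;
  }

  private void shutdownCurrent() {
    if (current != null) {
      log.add("shutdown " + current);
      current = null;
    }
  }

  int failedPasses() {
    return failedPasses;
  }
}
```

In this model a caller would wait for a backoff interval after runPass() returns null and then call it again; how the real balancer schedules that retry is not shown in this thread.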
Hide behind environment variable GRPC_SERIALIZE_RETRIES
This is for testing with GMS core to see whether using the new PF, with only 1 subchannel at a time trying to reconnect, eliminates their 6% data increase.
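For readers unfamiliar with the pattern, gating a behavior behind an environment variable and caching the result could look roughly like the sketch below. It assumes plain System.getenv and Boolean.parseBoolean; the class and helper names are hypothetical, and the PR's actual flag-reading code is not shown in this thread.

```java
/** Sketch of reading a boolean feature flag once; names are hypothetical. */
final class SerializeRetriesFlag {

  /** Parses a raw environment value, falling back to the default when unset or empty. */
  static boolean parseFlag(String rawValue, boolean defaultValue) {
    if (rawValue == null || rawValue.isEmpty()) {
      return defaultValue;
    }
    return Boolean.parseBoolean(rawValue); // case-insensitive match on "true"
  }

  /** Read once at class initialization so later checks are a plain field read. */
  static final boolean SERIALIZING_RETRIES =
      parseFlag(System.getenv("GRPC_SERIALIZE_RETRIES"), false);

  private SerializeRetriesFlag() {}
}
```

Caching the value in a static final field means the flag cannot change during the life of the process, which is usually acceptable for an experiment toggle of this kind.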