feature: add simple session affinity plugins in gateway plugin #1751
Summary of Changes
Hello @googs1025, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request adds a session affinity routing algorithm to the gateway plugin. The router keeps sessions sticky to specific backend pods by using a session ID to route a client's requests to the same pod. The implementation includes a fallback mechanism for cases where session affinity cannot be maintained, and it updates the gateway's response header processing so that session-related headers are propagated back to the client.
Code Review
This pull request introduces a session affinity routing plugin, which is a great feature. The implementation is mostly solid, with good test coverage for the new logic. I've identified a couple of areas for improvement. Firstly, there's a potential bug in RoutingContext where the new RespHeaders field isn't reset when the context is reused from a pool, which could lead to stale data. Secondly, the fallback mechanism in the session affinity router could be made more robust to handle cases where a randomly selected pod is misconfigured. My detailed comments and suggestions are below.
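The stale-data concern with pooled contexts can be sketched roughly as follows. This is a minimal illustration, not the PR's code: only the `RoutingContext` and `RespHeaders` names come from the review above, while the struct layout, `ctxPool`, `acquire`, `release`, and `reset` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// RoutingContext is a simplified, hypothetical stand-in for the real type;
// only the RespHeaders field name comes from the review comment.
type RoutingContext struct {
	RequestID   string
	RespHeaders map[string]string
}

// reset clears per-request state so a pooled context never carries
// headers from a previous request into the next one.
func (c *RoutingContext) reset() {
	c.RequestID = ""
	c.RespHeaders = nil
}

// ctxPool is an assumed pool of reusable contexts.
var ctxPool = sync.Pool{
	New: func() any { return &RoutingContext{} },
}

// acquire takes a context from the pool and resets it before use.
func acquire(requestID string) *RoutingContext {
	c := ctxPool.Get().(*RoutingContext)
	c.reset()
	c.RequestID = requestID
	return c
}

// release returns the context to the pool for reuse.
func release(c *RoutingContext) {
	ctxPool.Put(c)
}

func main() {
	first := acquire("req-1")
	first.RespHeaders = map[string]string{"x-session-id": "abc"}
	release(first)

	second := acquire("req-2")
	fmt.Println(len(second.RespHeaders)) // 0: no stale headers
	release(second)
}
```

Resetting on acquire (rather than on release) means a context that was returned to the pool without cleanup still starts clean.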
pods := readyPodList.All()

selected := pods[rand.Intn(len(pods))]
port := utils.GetModelPortForPod(ctx.RequestID, selected)
if port == 0 || selected.Status.PodIP == "" {
	return "", fmt.Errorf("selected pod has no valid network address")
}
addr := net.JoinHostPort(selected.Status.PodIP, strconv.Itoa(int(port)))

ctx.SetTargetPod(selected)
r.setSessionHeader(ctx, addr)
klog.V(5).Infof("Fallback to random pod: %s (%s)", selected.Name, addr)

return ctx.TargetAddress(), nil
The current implementation of fallbackRoute randomly selects a single pod and fails the entire request if that pod happens to have an invalid network address (e.g., port configured to 0). This is not very robust, as one misconfigured pod could cause random failures even if other healthy pods are available. A better approach would be to iterate through the available pods in a random order until a valid one is found.
pods := readyPodList.All()
rand.Shuffle(len(pods), func(i, j int) { pods[i], pods[j] = pods[j], pods[i] })
for _, selected := range pods {
	port := utils.GetModelPortForPod(ctx.RequestID, selected)
	// A routable pod must have a valid IP and port.
	if port == 0 || selected.Status.PodIP == "" {
		klog.V(4).Infof("Fallback skipping pod %s with invalid network address (IP: %s, Port: %d)", selected.Name, selected.Status.PodIP, port)
		continue
	}
	addr := net.JoinHostPort(selected.Status.PodIP, strconv.Itoa(int(port)))
	ctx.SetTargetPod(selected)
	r.setSessionHeader(ctx, addr)
	klog.V(5).Infof("Fallback to random pod: %s (%s)", selected.Name, addr)
	return ctx.TargetAddress(), nil
}
return "", fmt.Errorf("no fallback pod found with a valid network address")
done
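For readers who want to try the suggested fallback outside the gateway, here is a self-contained sketch of the same idea. The `Pod` struct, `pickFallback`, and `errNoRoutablePod` are hypothetical stand-ins, not aibrix types. It also iterates over a random permutation of indices instead of shuffling in place, which is a swapped-in variant that leaves the caller's slice untouched.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"net"
	"strconv"
)

// Pod is a hypothetical, trimmed-down stand-in for a Kubernetes pod.
type Pod struct {
	Name string
	IP   string
	Port int
}

var errNoRoutablePod = errors.New("no fallback pod found with a valid network address")

// pickFallback visits pods in random order and returns the address of the
// first one that has both an IP and a non-zero port.
func pickFallback(pods []Pod) (string, error) {
	// rand.Perm yields a random ordering of indices, so the input slice
	// itself is not reordered.
	for _, i := range rand.Perm(len(pods)) {
		p := pods[i]
		if p.IP == "" || p.Port == 0 {
			continue // skip pods that cannot be routed to
		}
		return net.JoinHostPort(p.IP, strconv.Itoa(p.Port)), nil
	}
	return "", errNoRoutablePod
}

func main() {
	pods := []Pod{
		{Name: "a", IP: "10.0.0.1", Port: 0},    // misconfigured port
		{Name: "b", IP: "", Port: 8000},         // no IP assigned
		{Name: "c", IP: "10.0.0.3", Port: 8000}, // the only routable pod
	}
	addr, err := pickFallback(pods)
	fmt.Println(addr, err) // 10.0.0.3:8000 <nil>
}
```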
+---------------------+
| Client |
+----------+----------+
|
| HTTP Request
| (with optional x-session-id header)
v
+---------------------+
| Aibrix Gateway Plugin |
+----------+----------+
|
| Routing Decision
v
+-----------------------------+
| Session Affinity Router | ←───┐
| (sessionAffinityRouter) | │
+-----------------------------+ │
| │ Uses
| 1. Reads x-session-id │
| 2. Decodes → IP:Port │
| 3. Matches Pod │
v │
+---------------------+ │
| Fallback Router | <───────────┘
+----------+----------+
|
| Selected Pod Address
v
+---------------------+
| Ready Pod List |
| (Endpoints / Pods) |
+----------+----------+
|
| Target Pod Info
v
+---------------------+
| Backend Pod |
| (vLLM / LLM Server) |
+----------+----------+
|
| HTTP Response
| (Set-Cookie or x-session-id)
v
+---------------------+
| Client |
+---------------------+
014d0b3 to
6fa2832
Compare
Can you describe what the workflow will look like?
My thinking is that the user should provide a UUID as the session-id, and the session affinity router should maintain a small structure that maps session-id to target pod address with a 1-hour TTL.
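The small tracking structure proposed above could look roughly like this. It is only a sketch under stated assumptions: `SessionStore`, `NewSessionStore`, `Put`, `Get`, and `defaultSessionTTL` are hypothetical names, and the store is in-memory and local to a single gateway process.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultSessionTTL mirrors the 1-hour TTL suggested in the comment above.
const defaultSessionTTL = time.Hour

type sessionEntry struct {
	podAddr   string
	expiresAt time.Time
}

// SessionStore is a hypothetical in-memory map from session ID to pod address.
type SessionStore struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]sessionEntry
}

func NewSessionStore(ttl time.Duration) *SessionStore {
	return &SessionStore{ttl: ttl, entries: make(map[string]sessionEntry)}
}

// Put records (or refreshes) the pod address for a session.
func (s *SessionStore) Put(sessionID, podAddr string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[sessionID] = sessionEntry{
		podAddr:   podAddr,
		expiresAt: time.Now().Add(s.ttl),
	}
}

// Get returns the pod address for a session and drops the entry if its
// TTL has elapsed.
func (s *SessionStore) Get(sessionID string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.entries[sessionID]
	if !ok {
		return "", false
	}
	if !time.Now().Before(e.expiresAt) {
		delete(s.entries, sessionID) // lazily evict the expired session
		return "", false
	}
	return e.podAddr, true
}

func main() {
	store := NewSessionStore(defaultSessionTTL)
	store.Put("123e4567-e89b-12d3-a456-426614174000", "10.0.0.1:8000")
	addr, ok := store.Get("123e4567-e89b-12d3-a456-426614174000")
	fmt.Println(addr, ok) // 10.0.0.1:8000 true
}
```

Entries in this sketch are evicted only on lookup, so a long-running process would also need a periodic sweep to bound memory.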
Thanks for the great question! In the current design, I use a lightweight, stateless session affinity ID where the client carries an opaque session token that encodes the target pod’s IP:Port. Here’s how it works:
I agree that the UUID approach you suggested 😄, where the client uses an abstract session identifier and the gateway plugin maintains a short-lived mapping from UUID to pod address (e.g., with a 1-hour TTL), is also a strong alternative. It would improve security by hiding backend topology and allow more flexible session management.
I will refer to this approach to update the existing implementation.
To summarize: the user supplies a UUID in the session-id header starting with the first request, which spares the client from reading the session-id header off the first response and applying it to subsequent requests. The gateway tracks the UUID-to-pod mapping (with a TTL). If the pod fails or the gateway finds a better pod for that session, it can change the pod associated with the session transparently, without the user knowing.
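The transparent re-pinning described above might be sketched like this. All names (`sessionTable`, `resolve`, the `pick` callback) are hypothetical, TTL handling is left out for brevity, and the map is not safe for concurrent use as written.

```go
package main

import "fmt"

// sessionTable is a hypothetical session-ID-to-pod-address mapping.
// TTL handling and locking are omitted to keep the sketch short.
type sessionTable map[string]string

// resolve returns the pod address for a session. If the session is new, or
// its pod is no longer in the ready set, pick chooses a pod and the session
// is (re)pinned to it. It reports false only when no pod is ready.
func (t sessionTable) resolve(sessionID string, ready []string, pick func([]string) string) (string, bool) {
	if len(ready) == 0 {
		return "", false
	}
	if addr, ok := t[sessionID]; ok {
		for _, r := range ready {
			if r == addr {
				return addr, true // sticky hit: the pinned pod is still ready
			}
		}
	}
	addr := pick(ready)
	t[sessionID] = addr // pin (or re-pin) without involving the client
	return addr, true
}

func main() {
	// A deterministic picker keeps the example output predictable;
	// a real router would use its own selection policy.
	first := func(ready []string) string { return ready[0] }

	table := sessionTable{}
	a, _ := table.resolve("s1", []string{"10.0.0.1:8000", "10.0.0.2:8000"}, first)
	fmt.Println(a) // 10.0.0.1:8000

	// The pinned pod disappears from the ready set, so the session moves.
	b, _ := table.resolve("s1", []string{"10.0.0.2:8000"}, first)
	fmt.Println(b) // 10.0.0.2:8000
}
```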
Will update the implementation 😄
Hi @varungup90, thanks for the great discussion on moving to a UUID-based session affinity. I have a quick question about an edge case: What should we do if the client sends an
My proposed behavior:
This ensures robustness and backward compatibility while avoiding request failures due to client-side errors.
Sounds good
@varungup90 The original approach, where the client carries a token encoding the target pod address (e.g., base64(IP:Port)), works reliably across replicas and avoids this issue.
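A token of the kind mentioned above could be encoded and decoded along these lines. This is a sketch that assumes the token is simply the URL-safe base64 encoding of `IP:Port`; the PR's actual encoding may differ, and `encodeSessionID` and `decodeSessionID` are hypothetical names.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net"
)

// encodeSessionID turns a pod address such as "10.0.0.1:8000" into an
// opaque, header-safe token (assumed format: URL-safe base64).
func encodeSessionID(addr string) string {
	return base64.URLEncoding.EncodeToString([]byte(addr))
}

// decodeSessionID reverses encodeSessionID and checks that the payload
// has host:port form, so malformed tokens are rejected.
func decodeSessionID(token string) (string, error) {
	raw, err := base64.URLEncoding.DecodeString(token)
	if err != nil {
		return "", fmt.Errorf("invalid session id: %w", err)
	}
	addr := string(raw)
	host, port, err := net.SplitHostPort(addr)
	if err != nil || host == "" || port == "" {
		return "", fmt.Errorf("session id does not contain host:port")
	}
	return addr, nil
}

func main() {
	token := encodeSessionID("10.0.0.1:8000")
	addr, err := decodeSessionID(token)
	fmt.Println(addr, err) // 10.0.0.1:8000 <nil>
}
```

Because the token carries the address itself, any gateway replica can decode it without shared state. The trade-off noted earlier in the thread is that it exposes backend topology to the client.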
6fa2832 to
e4f1082
Compare
d294a41 to
a098f46
Compare
@varungup90 @googs1025 What's the status of this PR? Ready to go?
a098f46 to
b6953ed
Compare
pods := readyPodList.All()
rand.Shuffle(len(pods), func(i, j int) { pods[i], pods[j] = pods[j], pods[i] })

for _, selected := range pods {
The pods passed here are in the ready state and will have a valid IP and port. Since podList is an array, the for loop will always select the same pod. Can you use rand.Intn-based selection?
Thanks for the suggestion! The current approach uses rand.Shuffle to randomize the order of ready pods and then picks the first one with a valid IP and port. This ensures we avoid invalid pods while maintaining randomness.
Overall LGTM. One nit: randomize the fallback route. You can also add documentation with a sample.
I agree that adding documentation with a sample would be helpful. I'd prefer to address this in a follow-up PR where I can update the docs consistently.
b6953ed to
0ddd4e4
Compare
Signed-off-by: CYJiang <[email protected]>
0ddd4e4 to
b13e454
Compare
Related Issues
Resolves: #1728