feature: add simple session affinity plugins in gateway plugin #1751

googs1025 · 2025-11-13T11:15:03Z

Pull Request Description

[Please provide a clear and concise description of your changes here]

Related Issues

Resolves: #1728

gemini-code-assist · 2025-11-13T11:15:20Z

Summary of Changes

Hello @googs1025, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the gateway plugin by adding a new session affinity routing algorithm. This feature allows the system to maintain sticky sessions with specific backend pods, improving consistency for client requests by attempting to route them to the same pod using a session ID. The implementation includes a robust fallback mechanism for scenarios where session affinity cannot be maintained and updates the gateway's response header processing to correctly propagate session-related headers back to the client. This change aims to provide a more stable and predictable routing experience for users.

Highlights

New Session Affinity Routing Algorithm: Introduced a new routing algorithm named 'session-affinity' for the gateway plugin. This algorithm aims to route subsequent requests from the same client to the same backend pod.
Session ID Based Routing: The session affinity is achieved by using a session ID, x-session-id, stored in the request header. This ID encodes the target pod's IP:Port using base64 encoding.
Fallback Mechanism: If a session ID is missing, invalid, or the target pod specified by the session ID is not ready, the router gracefully falls back to selecting a random ready pod.
Response Header Propagation: The gateway's response header handling has been updated to propagate router-generated response headers (such as the x-session-id for sticky sessions) back to the client, while correctly skipping HTTP/2 pseudo-headers.
Comprehensive Testing: New unit tests have been added for both the sessionAffinityRouter and the updated HandleResponseHeaders logic to ensure correctness and robustness.
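
As a rough illustration of the response header propagation described above, here is a minimal sketch. It is not the PR's `HandleResponseHeaders` code: the `header` type and `mergeRouterHeaders` function are hypothetical, and it assumes pseudo-headers are recognized by a leading `:` and dropped from the router-supplied set.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// header is a simplified, hypothetical stand-in for the gateway's header type.
type header struct {
	Key   string
	Value string
}

// mergeRouterHeaders appends router-generated headers to the upstream response
// headers, skipping HTTP/2 pseudo-headers (names that start with ':').
func mergeRouterHeaders(upstream []header, routerHeaders map[string]string) []header {
	out := make([]header, 0, len(upstream)+len(routerHeaders))
	out = append(out, upstream...)

	// Sort the keys so the output order is deterministic.
	keys := make([]string, 0, len(routerHeaders))
	for k := range routerHeaders {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		if strings.HasPrefix(k, ":") {
			continue // pseudo-header such as :status, never forwarded as a regular header
		}
		out = append(out, header{Key: k, Value: routerHeaders[k]})
	}
	return out
}

func main() {
	upstream := []header{{Key: "content-type", Value: "application/json"}}
	router := map[string]string{
		":status":      "200",
		"x-session-id": "MTAuMjQ0LjEuNTo4MDAw",
	}
	for _, h := range mergeRouterHeaders(upstream, router) {
		fmt.Printf("%s: %s\n", h.Key, h.Value)
	}
}
```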

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a session affinity routing plugin, which is a great feature. The implementation is mostly solid, with good test coverage for the new logic. I've identified a couple of areas for improvement. Firstly, there's a potential bug in RoutingContext where the new RespHeaders field isn't reset when the context is reused from a pool, which could lead to stale data. Secondly, the fallback mechanism in the session affinity router could be made more robust to handle cases where a randomly selected pod is misconfigured. My detailed comments and suggestions are below.
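
To make the pooled-context concern concrete, here is a minimal sketch of resetting a per-request field before reuse. The `routingContext` type, `acquire`, and `release` are hypothetical simplifications, not the actual `RoutingContext` code; only the `RespHeaders` field name comes from the review.

```go
package main

import (
	"fmt"
	"sync"
)

// routingContext is a simplified, hypothetical stand-in for RoutingContext.
type routingContext struct {
	RequestID   string
	RespHeaders map[string]string
}

// reset clears every per-request field so nothing leaks between requests.
func (c *routingContext) reset() {
	c.RequestID = ""
	c.RespHeaders = nil
}

var ctxPool = sync.Pool{
	New: func() any { return &routingContext{} },
}

// acquire takes a context from the pool and resets it defensively.
func acquire(requestID string) *routingContext {
	c := ctxPool.Get().(*routingContext)
	c.reset()
	c.RequestID = requestID
	return c
}

// release clears the context and returns it to the pool.
func release(c *routingContext) {
	c.reset()
	ctxPool.Put(c)
}

func main() {
	c := acquire("req-1")
	c.RespHeaders = map[string]string{"x-session-id": "stale"}
	release(c)

	// Whether or not the pool hands back the same object, no headers carry over.
	c2 := acquire("req-2")
	fmt.Println(c2.RequestID, len(c2.RespHeaders))
}
```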

pkg/types/router_context.go

gemini-code-assist · 2025-11-13T11:16:54Z

pkg/plugins/gateway/algorithms/simple_session_affinity.go

+	pods := readyPodList.All()
+
+	selected := pods[rand.Intn(len(pods))]
+	port := utils.GetModelPortForPod(ctx.RequestID, selected)
+	if port == 0 || selected.Status.PodIP == "" {
+		return "", fmt.Errorf("selected pod has no valid network address")
+	}
+	addr := net.JoinHostPort(selected.Status.PodIP, strconv.Itoa(int(port)))
+
+	ctx.SetTargetPod(selected)
+	r.setSessionHeader(ctx, addr)
+	klog.V(5).Infof("Fallback to random pod: %s (%s)", selected.Name, addr)
+
+	return ctx.TargetAddress(), nil


The current implementation of fallbackRoute randomly selects a single pod and fails the entire request if that pod happens to have an invalid network address (e.g., port configured to 0). This is not very robust, as one misconfigured pod could cause random failures even if other healthy pods are available. A better approach would be to iterate through the available pods in a random order until a valid one is found.

pods := readyPodList.All()
rand.Shuffle(len(pods), func(i, j int) { pods[i], pods[j] = pods[j], pods[i] })
for _, selected := range pods {
	port := utils.GetModelPortForPod(ctx.RequestID, selected)
	// A routable pod must have a valid IP and port.
	if port == 0 || selected.Status.PodIP == "" {
		klog.V(4).Infof("Fallback skipping pod %s with invalid network address (IP: %s, Port: %d)", selected.Name, selected.Status.PodIP, port)
		continue
	}
	addr := net.JoinHostPort(selected.Status.PodIP, strconv.Itoa(int(port)))
	ctx.SetTargetPod(selected)
	r.setSessionHeader(ctx, addr)
	klog.V(5).Infof("Fallback to random pod: %s (%s)", selected.Name, addr)
	return ctx.TargetAddress(), nil
}
return "", fmt.Errorf("no fallback pod found with a valid network address")

googs1025 · 2025-11-13T11:17:12Z

+---------------------+
|      Client         |
+----------+----------+
           |
           |  HTTP Request
           |  (with optional x-session-id header)
           v
+-------------------------+
|  Aibrix Gateway Plugin  |
+----------+--------------+
           |
           | Routing Decision
           v
+-----------------------------+
| Session Affinity Router     |  ←───┐
| (sessionAffinityRouter)     |      │
+-----------------------------+      │
           |                         │ Uses
           | 1. Reads x-session-id   │
           | 2. Decodes → IP:Port    │
           | 3. Matches Pod          │
           v                         │
+---------------------+             │
|   Fallback Router   | <───────────┘
+----------+----------+
           |
           | Selected Pod Address
           v
+---------------------+
|   Ready Pod List    |
| (Endpoints / Pods)  |
+----------+----------+
           |
           | Target Pod Info
           v
+---------------------+
|    Backend Pod      |
| (vLLM / LLM Server) |
+----------+----------+
           |
           | HTTP Response
           | (Set-Cookie or x-session-id)
           v
+---------------------+
|      Client         |
+---------------------+

pkg/plugins/gateway/algorithms/simple_session_affinity.go

varungup90 · 2025-11-13T19:00:22Z

Can you describe what the workflow will be?

Does the user directly send "session-id": "1.1.1.1:8000" in the first request, or does the user make a first request, save the target-pod-address, and then reuse it for subsequent requests?

My thinking is that the user should provide a UID as the session-id, and the session affinity router should maintain a small structure that maps session-id to target pod address with a 1-hour TTL.

googs1025 · 2025-11-14T00:06:38Z

Thanks for the great question!

In the current design, I use a lightweight, stateless session affinity ID where the client carries an opaque session token that encodes the target pod’s IP:Port. Here’s how it works:

First request: The client sends a request without the x-session-id header.
The gateway falls back to a standard routing strategy (e.g., random selection) and picks a ready pod.
It then base64-encodes the pod’s IP:Port (e.g., "10.244.1.5:8000" → "MTAuMjQ0LjEuNTo4MDAw") and returns it in the response header (x-session-id).
The client (or frontend SDK) stores this value—ideally in a secure, HttpOnly cookie—for future requests.
Subsequent requests: The client includes the same x-session-id in the request header.
The gateway decodes it to retrieve the intended IP:Port, checks if that pod is still ready, and routes the request accordingly.
If the pod is no longer available (e.g., scaled down, restarted), the gateway automatically falls back to a new ready pod and issues a new session ID in the response.
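
For reference, the encode/decode round trip described above can be sketched as follows. This is an illustration, not the PR's code: the function names are hypothetical, and it assumes standard (not URL-safe) base64, which matches the example value given above.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net"
)

// encodeSessionID turns a pod address such as "10.244.1.5:8000" into a token.
func encodeSessionID(addr string) string {
	return base64.StdEncoding.EncodeToString([]byte(addr))
}

// decodeSessionID reverses encodeSessionID and rejects anything that is not
// valid base64 or does not decode to a host:port pair.
func decodeSessionID(id string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(id)
	if err != nil {
		return "", fmt.Errorf("session ID is not valid base64: %w", err)
	}
	addr := string(raw)
	if _, _, err := net.SplitHostPort(addr); err != nil {
		return "", fmt.Errorf("session ID does not decode to host:port: %w", err)
	}
	return addr, nil
}

func main() {
	id := encodeSessionID("10.244.1.5:8000")
	fmt.Println(id)

	addr, err := decodeSessionID(id)
	fmt.Println(addr, err)

	// A malformed token yields an error, which is the router's cue to fall back.
	_, err = decodeSessionID("!!!")
	fmt.Println(err != nil)
}
```

A decode error, or a decoded address that no longer matches a ready pod, is the point where the router would take the fallback path and issue a new ID.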

googs1025 · 2025-11-14T00:08:45Z

> Can you describe what the workflow will be?
>
> Does the user directly send "session-id": "1.1.1.1:8000" in the first request, or does the user make a first request, save the target-pod-address, and then reuse it for subsequent requests?
>
> My thinking is that the user should provide a UID as the session-id, and the session affinity router should maintain a small structure that maps session-id to target pod address with a 1-hour TTL.

I agree with the UUID approach you suggested 😄. Having the client use an abstract session identifier while the gateway plugin maintains a short-lived mapping from UUID to pod address (e.g., with a 1-hour TTL) is also a strong alternative. It would improve security by hiding backend topology and allow more flexible session management.

googs1025 · 2025-11-14T07:31:39Z

I will use this approach to update the existing implementation.

> Can you describe what the workflow will be?
>
> Does the user directly send "session-id": "1.1.1.1:8000" in the first request, or does the user make a first request, save the target-pod-address, and then reuse it for subsequent requests?
>
> My thinking is that the user should provide a UID as the session-id, and the session affinity router should maintain a small structure that maps session-id to target pod address with a 1-hour TTL.

varungup90 · 2025-11-14T08:39:03Z

To summarize: the user starts with a UUID in the session-id header from the first request (which removes the client's burden of reading the session-id header from the first request's response and applying it to subsequent requests).

The gateway tracks the UUID-to-pod mapping (with a TTL). Internally, if the pod fails or the gateway finds a better pod for that session, it can change the pod associated with the session transparently, without the user knowing.

googs1025 · 2025-11-15T02:21:24Z

> To summarize: the user starts with a UUID in the session-id header from the first request (which removes the client's burden of reading the session-id header from the first request's response and applying it to subsequent requests).
>
> The gateway tracks the UUID-to-pod mapping (with a TTL). Internally, if the pod fails or the gateway finds a better pod for that session, it can change the pod associated with the session transparently, without the user knowing.

Will update the implementation 😄

googs1025 · 2025-11-17T12:03:04Z

> To summarize: the user starts with a UUID in the session-id header from the first request (which removes the client's burden of reading the session-id header from the first request's response and applying it to subsequent requests).
>
> The gateway tracks the UUID-to-pod mapping (with a TTL). Internally, if the pod fails or the gateway finds a better pod for that session, it can change the pod associated with the session transparently, without the user knowing.

Hi @varungup90

Thanks for the great discussion on moving to UUID-based session affinity. I have a quick question about an edge case:

What should we do if the client sends an x-session-id that is not a valid UUID?

My proposed behavior:

Validate the x-session-id using uuid.Parse()
If invalid, ignore it, fall back to selecting a ready pod randomly
Issue a new valid UUID in the response header
Log a warning/info message (with requestID) for observability

This ensures robustness and backward compatibility while avoiding request failures due to client-side errors.
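
The proposed behavior could look roughly like the sketch below. To stay standard-library only, it swaps in a regular expression for `uuid.Parse()` and generates the replacement ID from `crypto/rand`; the regex accepts only the canonical 8-4-4-4-12 form, whereas `uuid.Parse` from `github.com/google/uuid` is more lenient. `resolveSessionID` and the other names are hypothetical.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"log"
	"regexp"
)

// canonicalUUID matches the 8-4-4-4-12 hex form only (a stand-in for uuid.Parse).
var canonicalUUID = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// newUUIDv4 builds a random (version 4) UUID from crypto/rand.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// resolveSessionID keeps a well-formed incoming ID; otherwise it logs the
// problem (with the request ID) and returns a freshly generated ID.
// reused reports whether the incoming ID was kept.
func resolveSessionID(requestID, incoming string) (id string, reused bool, err error) {
	if canonicalUUID.MatchString(incoming) {
		return incoming, true, nil
	}
	if incoming != "" {
		log.Printf("request %s: ignoring invalid x-session-id %q", requestID, incoming)
	}
	id, err = newUUIDv4()
	return id, false, err
}

func main() {
	id, reused, err := resolveSessionID("req-1", "123e4567-e89b-12d3-a456-426614174000")
	fmt.Println(id, reused, err)

	// An invalid value is replaced; the caller would then pick a random ready
	// pod and return the new ID in the response header.
	id, reused, err = resolveSessionID("req-2", "not-a-uuid")
	fmt.Println(len(id), reused, err)
}
```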

varungup90 · 2025-11-17T14:02:30Z

Sounds good

googs1025 · 2025-11-18T01:05:26Z

@varungup90
Coming back to this again:
Session affinity will break in multi-replica deployments because each gateway instance maintains its own in-memory UUID → pod mapping. If requests from the same client hit different replicas, the session ID won’t be recognized, causing unnecessary fallbacks and loss of continuity.

The original approach—where the client carries a token encoding the target pod address (e.g., base64(IP:Port))—works reliably across replicas and avoids this issue.

Jeffwan · 2025-11-27T07:17:38Z

@varungup90 @googs1025 What's the status of this PR? Is it ready to go?

varungup90 · 2025-12-01T20:57:53Z

pkg/plugins/gateway/algorithms/simple_session_affinity.go

+	pods := readyPodList.All()
+	rand.Shuffle(len(pods), func(i, j int) { pods[i], pods[j] = pods[j], pods[i] })
+
+	for _, selected := range pods {


The pods passed here are in the ready state and will have a valid IP and port. Since podList is an array, the for loop will always select the same pod. Can you use rand.Intn-based selection?

Thanks for the suggestion! The current approach uses rand.Shuffle to randomize the order of ready pods and then picks the first one with a valid IP and port. This ensures we avoid invalid pods while maintaining randomness.

varungup90 · 2025-12-01T21:01:06Z

Overall LGTM. One nit: randomize the fallback route. You can also add documentation with a sample.

googs1025 · 2025-12-02T02:11:48Z

> you can add documentation with a sample.

I agree that adding documentation with a sample would be helpful. I'd prefer to address this in a follow-up PR, where I can update the docs consistently.

Signed-off-by: CYJiang <[email protected]>

gemini-code-assist bot reviewed Nov 13, 2025

googs1025 force-pushed the session_affinity_plugin branch 2 times, most recently from 014d0b3 to 6fa2832 (November 13, 2025 12:29)

varungup90 reviewed Nov 13, 2025

varungup90 reviewed Nov 13, 2025

googs1025 force-pushed the session_affinity_plugin branch from 6fa2832 to e4f1082 (November 18, 2025 01:10)

googs1025 requested a review from varungup90 November 20, 2025 15:09

googs1025 force-pushed the session_affinity_plugin branch from d294a41 to a098f46 (November 25, 2025 03:51)

googs1025 mentioned this pull request Nov 26, 2025

[Feature]: Add external-filter in Header for advanced routing #1804

Merged

googs1025 force-pushed the session_affinity_plugin branch from a098f46 to b6953ed (November 28, 2025 06:52)

varungup90 reviewed Dec 1, 2025

varungup90 approved these changes Dec 1, 2025

googs1025 force-pushed the session_affinity_plugin branch from b6953ed to 0ddd4e4 (December 2, 2025 02:12)

googs1025 requested a review from varungup90 December 3, 2025 08:51

googs1025 added 2 commits December 3, 2025 16:51

feature: add simple session affinity plugins in gateway plugin

818ffed

Signed-off-by: CYJiang <[email protected]>

fix review comment

b13e454

Signed-off-by: CYJiang <[email protected]>

googs1025 force-pushed the session_affinity_plugin branch from 0ddd4e4 to b13e454 (December 3, 2025 08:51)
