Skip to content

refactor(routers): use worker.http_client() instead of shared AppContext.client#1016

Open
CatherineSue wants to merge 7 commits intomainfrom
feat/routers-use-worker-http-client
Open

refactor(routers): use worker.http_client() instead of shared AppContext.client#1016
CatherineSue wants to merge 7 commits intomainfrom
feat/routers-use-worker-http-client

Conversation

@CatherineSue
Copy link
Copy Markdown
Member

@CatherineSue CatherineSue commented Apr 1, 2026

Description

Part of the per-worker resilience refactor series.

Problem

All routers share a single reqwest::Client from AppContext.client. This means all workers share the same connection pool — one slow worker's connections affect others.

Solution

Routers now use worker.http_client() for request routing. Each worker has its own isolated connection pool (configured via HttpPoolConfig at registration time).

The shared client is removed from WorkerSelector (model refresh now uses worker.http_client() too) and from Gemini SharedComponents. HTTP and HTTP PD routers still have self.client field but it's unused for request routing.

Changes (one commit per router)

  1. worker_selection.rs: Remove client from WorkerSelector, refresh_worker_models uses worker.http_client()
  2. OpenAI router: chat.rs, non_streaming.rs, streaming.rs use worker.http_client() instead of ctx.components.client()
  3. HTTP router: proxy_get_request, route_simple_request, send_typed_request use worker.http_client(). Changed select_first_worker to return Arc<dyn Worker> instead of String.
  4. Gemini router: non_stream_execution.rs uses worker.http_client(). Removed client field from Gemini SharedComponents.
  5. HTTP PD router: proxy_to_worker, build_post_with_headers, health_generate use per-worker clients. Changed proxy_to_worker to take &dyn Worker instead of URL string.

Test Plan

  • cargo test -p smg --lib — all 450 tests pass
  • Pre-commit hooks pass (rustfmt, clippy, codespell, DCO)
Checklist
  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes

Summary by CodeRabbit

  • Refactor
    • Switched from a shared HTTP client to per-worker HTTP clients across routers and routing flows, improving request isolation and reliability.
  • Tests
    • Updated unit tests to align with worker-based selection and per-worker HTTP client routing changes.

Summary by CodeRabbit

WorkerSelector no longer needs a shared reqwest::Client. The
refresh_worker_models function now uses worker.http_client() to call
/v1/models on each external worker, using the worker's own isolated
connection pool.

- Remove client field from WorkerSelector
- Remove client parameter from WorkerSelector::new()
- Update all 5 callers to drop the client argument

Signed-off-by: Chang Su <chang.s.su@oracle.com>
Switch OpenAI Chat, Responses non-streaming, and Responses streaming
handlers from ctx.components.client() (shared) to worker.http_client()
(per-worker isolated connection pool).

SharedComponents.client is no longer used for request routing in the
OpenAI router — only for model refresh via WorkerSelector (already
migrated to worker.http_client() in the previous commit).

Signed-off-by: Chang Su <chang.s.su@oracle.com>
Switch HTTP router from self.client (shared) to worker.http_client()
(per-worker) for all request routing paths:

- proxy_get_request: change select_first_worker to return Arc<dyn Worker>
  instead of String, use worker.http_client()
- route_simple_request (fan-out): use worker.http_client().clone()
- send_typed_request: use worker.http_client()
- Update tests for new select_first_worker return type

Signed-off-by: Chang Su <chang.s.su@oracle.com>
Switch Gemini non-stream execution from shared client to
worker.http_client(). Remove client field from Gemini
SharedComponents since it's no longer used (worker_selection
already migrated, and this was the last usage).

Signed-off-by: Chang Su <chang.s.su@oracle.com>
Switch HTTP PD router from self.client (shared) to worker.http_client()
(per-worker) for all request paths:

- proxy_to_worker: takes &dyn Worker instead of URL string
- build_post_with_headers: use prefill/decode worker clients
- health_generate: use prefill/decode worker clients

Signed-off-by: Chang Su <chang.s.su@oracle.com>
@github-actions github-actions bot added model-gateway Model gateway crate changes openai OpenAI router changes anthropic Anthropic router changes gemini Gemini router changes labels Apr 1, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 1, 2026

📝 Walkthrough

Walkthrough

Removed router-level shared reqwest::Client usage and moved HTTP request execution to per-worker HTTP clients; updated WorkerSelector to accept only &WorkerRegistry and adjusted call sites across Anthropic, Gemini, OpenAI, and HTTP routers accordingly.

Changes

Cohort / File(s) Summary
Worker Selection Core
model_gateway/src/routers/worker_selection.rs
WorkerSelector::new now takes only &WorkerRegistry; refresh_worker_models signature updated and uses worker.http_client().get(...) instead of a shared client.
OpenAI Router & Context
model_gateway/src/routers/openai/context.rs, model_gateway/src/routers/openai/router.rs, model_gateway/src/routers/openai/realtime/rest.rs
Removed shared reqwest::Client from SharedComponents; SharedComponents is empty; ComponentRefs::client() removed; WorkerSelector::new calls updated; realtime forward function no longer accepts external client and derives HTTP client from selected worker.
OpenAI Responses (streaming & non-streaming)
model_gateway/src/routers/openai/responses/route.rs, model_gateway/src/routers/openai/responses/non_streaming.rs, model_gateway/src/routers/openai/responses/streaming.rs
Outbound HTTP calls and worker selection updated to use worker.http_client() instead of router/shared client; call-sites adjusted accordingly.
OpenAI Chat
model_gateway/src/routers/openai/chat.rs
WorkerSelector::new now constructed with deps.worker_registry; upstream requests use worker.http_client() rather than shared client.
Gemini Router & Context
model_gateway/src/routers/gemini/context.rs, model_gateway/src/routers/gemini/router.rs
Removed client: reqwest::Client from SharedComponents; GeminiRouter::new no longer initializes a shared client.
Gemini Steps
model_gateway/src/routers/gemini/steps/worker_selection.rs, model_gateway/src/routers/gemini/steps/non_stream_execution.rs
WorkerSelector constructed with &ctx.components.worker_registry; upstream requests now use worker.http_client() in place of ctx.components.client.
Anthropic Router
model_gateway/src/routers/anthropic/router.rs
WorkerSelector::new call simplified to pass only &self.router_ctx.worker_registry; selection flow unchanged.
HTTP Router & PD Router
model_gateway/src/routers/http/router.rs, model_gateway/src/routers/http/pd_router.rs
Router-wide reqwest::Client removed from structs; select_first_worker now returns Arc<dyn Worker>; all proxy/health requests now build URLs from worker.url() and use worker.http_client() for requests; tests updated.
Misc call-site updates
multiple files across routers (.../route_messages, route_responses, non_stream_execution, etc.)
Various call-sites updated to construct WorkerSelector without a shared client and to use worker.http_client() for outbound HTTP calls.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

realtime-api, tests

Suggested reviewers

  • key4ng
  • slin1237

Poem

🐰 Hopping lines of Rust and cheer,
Each Worker holds its client near.
No shared barrel, each sip is fine,
Requests routed down their own spine.
Hooray — per-worker cups align! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main refactoring: migrating from shared AppContext.client to per-worker HTTP clients.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/routers-use-worker-http-client

Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms


Comment @coderabbitai help to get the list of available commands and usage tips.

.map(|worker| {
let url = format!("{}/{}", worker.base_url(), endpoint);
let client = self.client.clone();
let client = worker.http_client().clone();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Nit: With this change, self.client is no longer used for any HTTP requests in Router — the only remaining reference is in the Debug impl (line 59). The client: Client field on the Router struct (line 50) is now dead code and can be removed along with its initialization in new() and the Debug field.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in bbaa9f4. Removed client field, initialization, and Debug field from Router.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b8e82cdbb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

let payload_ref = ctx.payload().expect("Payload not prepared");
let payload_json = Arc::new(payload_ref.json.clone());
let client = ctx.components.client().clone();
let client = worker.http_client().clone();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore configured client for mesh-synced workers

Switching chat routing to worker.http_client() assumes every Worker carries a fully configured HTTP client, but mesh-synced workers are still built via BasicWorkerBuilder::from_spec(...).build() / .new(...).build() in worker_registry.rs without .http_client(...), which falls back to a bare default client in worker_builder.rs (no router TLS identity/custom CA/proxy settings). In mesh deployments that depend on those router-level settings, requests to selected workers will now fail after this line, whereas the previous shared AppContext.client path worked; please either inject configured clients when creating mesh workers or keep a safe fallback for workers with default clients.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid concern. Mesh-synced workers built via on_remote_worker_state use BasicWorkerBuilder without router TLS config, so their http_client() is a bare default (no mTLS/custom CA). The fix requires passing RouterConfig into the mesh sync path. Tracked as a known limitation — pre-existing issue exposed by this migration.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the model gateway to utilize per-worker HTTP clients across multiple routers and the worker selection process. The review feedback identifies a recurring issue where worker.url() is used for request construction, which is incompatible with Data Parallel (DP) aware workers due to rank suffixes. It is recommended to use worker.base_url() instead and to centralize this URL formatting logic to improve maintainability.

Ok(worker) => {
let mut request_builder = worker
.http_client()
.get(format!("{}/{endpoint}", worker.url()));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using worker.url() for the request URL is incorrect for Data Parallel (DP) aware workers. The url() method returns the worker's identity string which includes the @rank suffix (e.g., http://host:port@rank), causing reqwest to misparse the host. Use worker.base_url() instead to get the physical endpoint URL. Since this URL construction pattern is repeated across the router, consider extracting it into a shared helper function.

Suggested change
.get(format!("{}/{endpoint}", worker.url()));
.get(format!("{}/{endpoint}", worker.base_url()));
References
  1. Extract duplicated logic into a shared helper function to improve maintainability and reduce redundancy.
  2. Server-generated IDs used in format! macros for URL paths do not require URL encoding, as this is the standard pattern throughout the codebase for non-user-input IDs.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing — the old code also used w.url().to_string() in proxy_to_first_prefill_worker. Not introduced by this PR.

Switch realtime REST routes (sessions, client_secrets,
transcription_sessions) from shared_components.client to
worker.http_client(). Remove client parameter from
forward_realtime_rest.

Remove client field from OpenAI SharedComponents and
ComponentRefs::client() — no remaining usages.

Signed-off-by: Chang Su <chang.s.su@oracle.com>
Remove self.client from Router and PDRouter structs — no longer used
for request routing after migration to worker.http_client().
Also remove unused reqwest::Client import from HTTP router.

Signed-off-by: Chang Su <chang.s.su@oracle.com>
@github-actions github-actions bot added the realtime-api Realtime API related changes label Apr 1, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbaa9f443c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

) -> bool {
let url = format!("{}/v1/models", worker.url());
let mut backend_req = client.get(&url);
let mut backend_req = worker.http_client().get(&url);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep configured client for external model refresh

Switching refresh_worker_models to worker.http_client() regresses model discovery for mesh-synced workers that are constructed without an injected client (WorkerRegistry::on_remote_worker_state builds via BasicWorkerBuilder::from_spec(...).build() at core/worker_registry.rs:1065, and BasicWorkerBuilder::build falls back to a default reqwest client at core/worker_builder.rs:245-253). In deployments that rely on router-level TLS/proxy/custom-CA settings, /v1/models refresh now fails and select_worker can return model_not_found even though healthy workers exist. Fresh evidence beyond the existing chat-path comment: this now breaks worker selection itself (refresh-on-miss) before any request routing occurs.

Useful? React with 👍 / 👎.

) -> bool {
let url = format!("{}/v1/models", worker.url());
let mut backend_req = client.get(&url);
let mut backend_req = worker.http_client().get(&url);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Important: Same mesh-worker concern flagged on chat.rs by another reviewer, but with a worse failure mode here: refresh_worker_models returns false on failure and only logs a warning (line 237). For mesh-synced workers built via BasicWorkerBuilder::from_spec(spec).build() without .http_client(...), this will silently use a bare default client (no TLS identity, no custom CA, no proxy settings from router config). Unlike the request-routing paths where failure surfaces as an HTTP error to the caller, here the failure is silent — the worker's model list simply won't refresh, making it invisible for model-based routing without any user-facing error.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/http/pd_router.rs`:
- Around line 71-75: The current code calls
WorkerRegistry::get_prefill_workers() and then uses workers.first(), which can
pick an unhealthy worker; change the selection to pick a healthy/available
worker (e.g., call .iter().find(|w| w.is_available()) or .into_iter().filter(|w|
w.is_available()).next()) before calling proxy_to_worker so
get_server_info/get_model_info are proxied only to available backends; update
the branch that currently matches on workers.first() to instead match on the
healthy worker found and pass that worker.as_ref() into proxy_to_worker.

In `@model_gateway/src/routers/http/router.rs`:
- Around line 76-82: select_first_worker currently returns any healthy worker
from worker_registry.get_all(), which can allow PD or gRPC backends to be
chosen; change it to filter for workers that are both WorkerType::Regular and
ConnectionMode::Http before checking health. Specifically, in
select_first_worker, replace the existing healthy_workers computation with an
iterator that first filters by worker.worker_type() == WorkerType::Regular &&
worker.connection_mode() == ConnectionMode::Http, then by is_healthy(), and
return the first matching Arc<dyn Worker> or the same "No workers are available"
error if none match.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b60130ca-e574-4894-9935-3fd920932a6b

📥 Commits

Reviewing files that changed from the base of the PR and between 7b8e82c and bbaa9f4.

📒 Files selected for processing (5)
  • model_gateway/src/routers/http/pd_router.rs
  • model_gateway/src/routers/http/router.rs
  • model_gateway/src/routers/openai/context.rs
  • model_gateway/src/routers/openai/realtime/rest.rs
  • model_gateway/src/routers/openai/router.rs

Comment on lines 71 to +75
let workers = self.worker_registry.get_prefill_workers();
let first_worker_url = workers.first().map(|w| w.url().to_string());

if let Some(worker_url) = first_worker_url {
self.proxy_to_worker(worker_url, endpoint, headers).await
if let Some(worker) = workers.first() {
self.proxy_to_worker(worker.as_ref(), endpoint, headers)
.await
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n -A20 -B5 'fn get_prefill_workers|fn get_decode_workers' model_gateway/src/core/worker_registry.rs

Repository: lightseekorg/smg

Length of output: 1397


Add health filter before proxy in proxy_to_first_prefill_worker.

WorkerRegistry::get_prefill_workers() only filters by worker type; it returns all prefill workers regardless of health status. Calling .first() can select an unhealthy prefill worker, causing get_server_info and get_model_info to proxy requests to a down backend. Add an explicit .filter(|w| w.is_available()) or similar before .first().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/http/pd_router.rs` around lines 71 - 75, The
current code calls WorkerRegistry::get_prefill_workers() and then uses
workers.first(), which can pick an unhealthy worker; change the selection to
pick a healthy/available worker (e.g., call .iter().find(|w| w.is_available())
or .into_iter().filter(|w| w.is_available()).next()) before calling
proxy_to_worker so get_server_info/get_model_info are proxied only to available
backends; update the branch that currently matches on workers.first() to instead
match on the healthy worker found and pass that worker.as_ref() into
proxy_to_worker.

Comment on lines +76 to +82
fn select_first_worker(&self) -> Result<Arc<dyn Worker>, String> {
let workers = self.worker_registry.get_all();
let healthy_workers: Vec<_> = workers.iter().filter(|w| w.is_healthy()).collect();
if healthy_workers.is_empty() {
Err("No workers are available".to_string())
} else {
Ok(healthy_workers[0].url().to_string())
Ok(healthy_workers[0].clone())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Restrict select_first_worker to regular HTTP workers.

This is the only selection helper in the file that bypasses the WorkerType::Regular + ConnectionMode::Http filter. In mixed deployments, Line 89 can forward /health_generate, /get_server_info, or /get_model_info to a PD or gRPC backend.

🎯 Suggested fix
 fn select_first_worker(&self) -> Result<Arc<dyn Worker>, String> {
-    let workers = self.worker_registry.get_all();
-    let healthy_workers: Vec<_> = workers.iter().filter(|w| w.is_healthy()).collect();
-    if healthy_workers.is_empty() {
-        Err("No workers are available".to_string())
-    } else {
-        Ok(healthy_workers[0].clone())
-    }
+    self.worker_registry
+        .get_workers_filtered(
+            None,
+            Some(WorkerType::Regular),
+            Some(ConnectionMode::Http),
+            None,
+            false,
+        )
+        .into_iter()
+        .find(|w| w.is_healthy())
+        .ok_or_else(|| "No workers are available".to_string())
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/http/router.rs` around lines 76 - 82,
select_first_worker currently returns any healthy worker from
worker_registry.get_all(), which can allow PD or gRPC backends to be chosen;
change it to filter for workers that are both WorkerType::Regular and
ConnectionMode::Http before checking health. Specifically, in
select_first_worker, replace the existing healthy_workers computation with an
iterator that first filters by worker.worker_type() == WorkerType::Regular &&
worker.connection_mode() == ConnectionMode::Http, then by is_healthy(), and
return the first matching Arc<dyn Worker> or the same "No workers are available"
error if none match.

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Apr 3, 2026

Hi @CatherineSue, this PR has merge conflicts that must be resolved before it can be merged. Please rebase your branch:

git fetch origin main
git rebase origin/main
# resolve any conflicts, then:
git push --force-with-lease

@mergify mergify bot added the needs-rebase PR has merge conflicts that need to be resolved label Apr 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

anthropic Anthropic router changes gemini Gemini router changes model-gateway Model gateway crate changes needs-rebase PR has merge conflicts that need to be resolved openai OpenAI router changes realtime-api Realtime API related changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant