[Feature][Algo] Add Balancer #42

chickeyton · 2025-11-24T02:12:00Z

Add balancer and the corresponding platform connectors

Add new balancer and connectors from llm-balancer

Balancer [tested]: the high-level balancer interface, providing configuration and SLO management plus request load-balancing functions
Request Routers:
- Random router [tested]
- Round-robin (RR) router [tested]
- Queue Length router [tested]
- Prefill (i.e. TTFT) router [tested]
- Encode router
- Decode router [tested]
- KV-aware router
Dynamic P/D [tested]: dynamic P/D role switching based on the configured SLO and real-time statistics
Connectors:
- vLLM integration for KV cache awareness and instance config
- LMCache integration for KV cache awareness
Batch Routing: routes a collection of tasks in one pass, using a greedy local-search optimization algorithm to maximize the load-balancing effect; supported by all routers listed above
vLLM plugin: a vLLM plugin package required for the vLLM integration's KV cache awareness
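
To make the Batch Routing item more concrete, here is a minimal sketch of a greedy placement followed by a local-search pass. It is an illustration only, not the code in this PR: `Endpoint`, `queue_len`, `route_batch`, and the per-task integer cost are all hypothetical names and assumptions, and the real routers may score endpoints very differently.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical stand-in for a serving instance (not the PR's Endpoint)."""
    name: str
    queue_len: int = 0  # assumed load metric, e.g. pending tokens


def route_batch(costs: list[int], endpoints: list[Endpoint]) -> list[int]:
    """Return, for each task, the index of the endpoint it is assigned to.

    Step 1 (greedy): place tasks, largest first, on the least-loaded endpoint.
    Step 2 (local search): move single tasks off the most-loaded endpoint
    for as long as a move leaves the destination below the old maximum.
    """
    loads = [ep.queue_len for ep in endpoints]
    assign = [0] * len(costs)
    for i in sorted(range(len(costs)), key=lambda i: -costs[i]):
        target = min(range(len(loads)), key=loads.__getitem__)
        assign[i] = target
        loads[target] += costs[i]

    improved = True
    while improved:
        improved = False
        src = max(range(len(loads)), key=loads.__getitem__)
        dst = min(range(len(loads)), key=loads.__getitem__)
        for i, ep_idx in enumerate(assign):
            # Zero-cost tasks are skipped so every move strictly improves balance.
            if ep_idx != src or costs[i] <= 0:
                continue
            if loads[dst] + costs[i] < loads[src]:
                loads[src] -= costs[i]
                loads[dst] += costs[i]
                assign[i] = dst
                improved = True
                break
    return assign
```

The existing `queue_len` of each endpoint is treated as load that cannot be moved, so a busy endpoint simply receives fewer tasks from the batch.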

Usage examples:

How to create and configure a Balancer and start an HTTP proxy server for requests
https://github.com/chickeyton/llm-balancer/blob/workload_router/llm_balancer/api/http/app.py
https://github.com/chickeyton/llm-balancer/tree/workload_router/examples/http/p_d (configs for P/D Disagg)
https://github.com/chickeyton/llm-balancer/tree/workload_router/examples/http/pd (configs for Mixed Mode)
https://github.com/chickeyton/llm-balancer/tree/workload_router/examples/http/dynamic_pd (configs for Dynamic P/D)
https://github.com/chickeyton/llm-balancer/tree/workload_router/examples/http/batched_p_d (configs for Batch Routing)

How to handle chat/completions requests with Balancer in P/D disaggregated mode (with Dynamic P/D and Batch Routing) and in Mixed mode (with Batch Routing)
https://github.com/chickeyton/llm-balancer/blob/workload_router/llm_balancer/api/http/pipeline/p_d.py
https://github.com/chickeyton/llm-balancer/blob/workload_router/llm_balancer/api/http/pipeline/pd.py
https://github.com/chickeyton/llm-balancer/blob/workload_router/llm_balancer/api/http/pipeline/utils.py
https://github.com/chickeyton/llm-balancer/blob/workload_router/llm_balancer/api/http/pipeline/pipeline.py
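
As a rough illustration of the Dynamic P/D idea (switching instance roles based on the configured SLO and real-time statistics), the decision step might look like the sketch below. All names here (`Slo`, `Stats`, `plan_role_switch`, the `headroom` factor) are hypothetical, and the rule itself is an assumption rather than the policy implemented in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slo:
    ttft_ms: float  # target time to first token
    tpot_ms: float  # target time per output token


@dataclass
class Stats:
    ttft_ms: float  # observed value, e.g. a recent percentile
    tpot_ms: float


def plan_role_switch(slo: Slo, stats: Stats, n_prefill: int, n_decode: int,
                     headroom: float = 0.8) -> Optional[str]:
    """Return "to_prefill", "to_decode", or None (assumed policy).

    A role is switched only when one phase violates its SLO while the other
    phase is comfortably inside its own, and never below one instance per role.
    """
    ttft_violated = stats.ttft_ms > slo.ttft_ms
    tpot_violated = stats.tpot_ms > slo.tpot_ms
    ttft_slack = stats.ttft_ms < headroom * slo.ttft_ms
    tpot_slack = stats.tpot_ms < headroom * slo.tpot_ms
    if ttft_violated and tpot_slack and n_decode > 1:
        return "to_prefill"  # move one decode instance to prefill
    if tpot_violated and ttft_slack and n_prefill > 1:
        return "to_decode"   # move one prefill instance to decode
    return None
```

The `headroom` factor is there to avoid flapping: an instance is only taken from a phase that has a clear margin against its own target.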

How to implement custom LLM service discovery
https://github.com/chickeyton/llm-balancer/blob/workload_router/examples/balancer/basic_usage.py (RedisEndpointTracker)

How to implement custom KV cache awareness
https://github.com/chickeyton/llm-balancer/blob/workload_router/llm_balancer/connectors/vllm/kv_connector.py
https://github.com/chickeyton/llm-balancer/blob/workload_router/llm_balancer/connectors/vllm/kv_cache_tracker.py
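
For readers unfamiliar with KV cache awareness, the following sketch shows one common way to do it: hash the prompt in fixed-size token blocks, track which block hashes each endpoint holds, and prefer the endpoint with the longest cached prefix. This is not the API of the linked `kv_connector.py` or `kv_cache_tracker.py`; `block_hashes`, `PrefixCacheIndex`, and its methods are hypothetical names, and the block size of 16 is an arbitrary assumption.

```python
import hashlib


def block_hashes(token_ids: list[int], block_size: int = 16) -> list[str]:
    """Chained hashes of full token blocks; hash k covers the first k+1 blocks."""
    hashes, prev = [], ""
    for start in range(0, len(token_ids) - block_size + 1, block_size):
        block = token_ids[start:start + block_size]
        payload = prev + "," + ",".join(map(str, block))
        prev = hashlib.sha256(payload.encode()).hexdigest()
        hashes.append(prev)
    return hashes


class PrefixCacheIndex:
    """Hypothetical record of which block hashes each endpoint has cached."""

    def __init__(self) -> None:
        self._cached: dict[str, set[str]] = {}

    def on_blocks_stored(self, endpoint: str, hashes: list[str]) -> None:
        self._cached.setdefault(endpoint, set()).update(hashes)

    def on_blocks_evicted(self, endpoint: str, hashes: list[str]) -> None:
        self._cached.get(endpoint, set()).difference_update(hashes)

    def matched_blocks(self, endpoint: str, hashes: list[str]) -> int:
        """Number of leading blocks of the request already cached on endpoint."""
        cached = self._cached.get(endpoint, set())
        count = 0
        for h in hashes:
            if h not in cached:
                break
            count += 1
        return count

    def pick(self, endpoints: list[str], token_ids: list[int],
             block_size: int = 16) -> str:
        """Choose the endpoint with the longest cached prefix (first wins ties)."""
        hashes = block_hashes(token_ids, block_size)
        return max(endpoints, key=lambda ep: self.matched_blocks(ep, hashes))
```

A production router would normally combine the prefix match with a load signal, since the endpoint holding the cache may also be the busiest one.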

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Performance improvement
Code refactoring
Test improvements
CI/CD improvements

Related Issues

Changes Made

Testing

Existing tests pass
New tests added (if applicable)
Manual testing performed

Test Coverage

Documentation

Documentation updated (if needed)
Code comments added/updated
API documentation updated (if applicable)

Checklist

Screenshots/Output

Additional Notes

Reviewer Checklist

Signed-off-by: chickeyton <[email protected]>

lm_service/connectors/vllm/kv_cache_tracker.py

+            self.is_endpoint_up = True
+
+    def __init__(self, tracker: EndpointTracker, preserve_down_records: bool = False):
+        super(Thread, self).__init__()


lm_service/connectors/vllm/endpoint.py

@@ -0,0 +1,26 @@
+from dataclasses import dataclass
+
+from ...balancer import EndpointConfig, Endpoint
+from openai import AsyncOpenAI, OpenAI



github-actions · 2025-12-27T12:46:47Z

This pull request has been automatically marked as stale because it has not had recent activity.
It will be closed if no further activity occurs. Thank you for your contributions.

add balancer

3e11185

Signed-off-by: chickeyton <[email protected]>

github-advanced-security bot found potential problems Nov 24, 2025


bugfix

78189fc

Signed-off-by: chickeyton <[email protected]>

chickeyton changed the title ~~add balancer~~ [Feature][Algo] Add Balancer Nov 25, 2025

chickeyton added 2 commits November 27, 2025 15:23

enhancement

66ee24b

Signed-off-by: chickeyton <[email protected]>

remove .idea

745f932

Signed-off-by: chickeyton <[email protected]>

github-actions bot added the stale label Dec 27, 2025

chickeyton closed this Dec 31, 2025
