feat: add vLLM scale-out deployment with nginx load balancing #109

Draft
maryamtahhan wants to merge 2 commits into redhat-et:main from maryamtahhan:multi-instance-test

@maryamtahhan
Collaborator

Adds turnkey Ansible automation for deploying multiple vLLM instances on a single DUT with configurable nginx load balancing. Enables performance testing at scale with flexible configuration of instance count (1-10), cores per instance (8/16/32), SMT, prefix caching, and load balancing policies (round-robin/least-conn/ip-hash). Includes comprehensive documentation, example inventory, and integration with existing GuideLLM benchmark playbooks.
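
As a rough illustration of how the instance count and load balancing policy could translate into nginx configuration, here is a minimal sketch. It is not taken from the PR's playbooks or templates: the function name `render_upstream`, the upstream label `vllm_backend`, the loopback address, and the base port 8000 are all assumptions made for the example. The only nginx facts it relies on are that round-robin is the default when no balancing directive is given, and that `least_conn` and `ip_hash` are the upstream directives for the other two policies.

```python
# Hypothetical helper: names, address, and ports are illustrative only,
# not the values used by the PR's Ansible templates.
POLICY_DIRECTIVES = {
    "round-robin": None,         # nginx default, no directive required
    "least-conn": "least_conn",
    "ip-hash": "ip_hash",
}


def render_upstream(instances, policy="round-robin", base_port=8000,
                    name="vllm_backend"):
    """Return an nginx `upstream` block with one server line per instance."""
    if not 1 <= instances <= 10:
        raise ValueError("instance count must be between 1 and 10")
    if policy not in POLICY_DIRECTIVES:
        raise ValueError(f"unknown policy: {policy}")

    lines = [f"upstream {name} {{"]
    directive = POLICY_DIRECTIVES[policy]
    if directive:
        lines.append(f"    {directive};")
    # Assumes each instance listens on its own consecutive local port.
    for i in range(instances):
        lines.append(f"    server 127.0.0.1:{base_port + i};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_upstream(3, policy="least-conn"))
```

In the actual change this rendering would more likely live in a Jinja2 template driven by inventory variables; the sketch only shows the mapping from policy name to directive.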

@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Fix incorrect default container image for vllm_bench that caused
"Exec format error" on x86_64 servers.

Problem:
- Default was ARM image: quay.io/mtahhan/vllm:arm-base-cpu
- Embedding tests failed on x86_64 EC2 with "Exec format error"
- Error: "image platform (linux/arm64/v8) does not match (linux/amd64)"

Root Cause:
- vllm_bench config (line 98) used ARM-only image as default
- Introduced in commit 5a494c8 during inventory restructure
- Controller architecture (Mac ARM) is irrelevant - containers run on
  remote targets (EC2 x86_64)
- LLM tests unaffected - they use guidellm with multi-arch images

Fix:
- Change default to match vLLM server image (x86_64 compatible)
- Old: quay.io/mtahhan/vllm:arm-base-cpu
- New: docker.io/vllm/vllm-openai-cpu:v0.18.0
- Same image as DUT ensures version consistency

Impact:
- Embedding tests now work on x86_64 servers (AWS EC2)
- Users can still override with VLLM_BENCH_CONTAINER_IMAGE env var
- No impact on LLM tests (different benchmark tool)

Tested:
- EC2 x86_64 instances (DUT + Load Generator)
- Baseline and latency tests execute successfully

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
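
The fix described in the commit message above comes down to two ideas: an x86_64-compatible default image with an environment override, and comparing the image architecture against the remote target rather than the controller. The sketch below illustrates both under stated assumptions. The default image and the `VLLM_BENCH_CONTAINER_IMAGE` variable name come from the commit message; the function names, the mapping table, and the way the target's machine string is obtained (for example from `uname -m` on the remote host) are hypothetical and not the project's actual code.

```python
import os

# Default image and env var name are taken from the commit message.
DEFAULT_IMAGE = "docker.io/vllm/vllm-openai-cpu:v0.18.0"

# Map `uname -m` style machine names to OCI platform architectures.
MACHINE_TO_OCI_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def resolve_bench_image(env=None):
    """Return the override image if set and non-empty, else the default."""
    env = os.environ if env is None else env
    return env.get("VLLM_BENCH_CONTAINER_IMAGE") or DEFAULT_IMAGE


def arch_matches(image_arch, target_machine):
    """Check an image architecture against the *target* host's machine name.

    `target_machine` must describe the remote host that runs the container;
    the controller's architecture is irrelevant.
    """
    return MACHINE_TO_OCI_ARCH.get(target_machine.lower()) == image_arch


if __name__ == "__main__":
    print(resolve_bench_image({}))
    # An arm64 image on an x86_64 target reproduces the reported mismatch.
    print(arch_matches("arm64", "x86_64"))
```

A pre-flight check along these lines would fail early with a clear message instead of surfacing as "Exec format error" at container start.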
@maryamtahhan
Collaborator Author

@maryamtahhan make sure to also update the MLflow integration and the multi-instance results conversion (to CSV) to reflect this new feature.

Adds turnkey Ansible automation for deploying multiple vLLM instances on a single DUT with configurable nginx load balancing. Enables performance testing at scale with flexible configuration of instance count (1-10), cores per instance (8/16/32), SMT, prefix caching, and load balancing policies (round-robin/least-conn/ip-hash). Includes comprehensive documentation, example inventory, and integration with existing GuideLLM benchmark playbooks.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
@maryamtahhan maryamtahhan force-pushed the multi-instance-test branch from 44c0408 to 4d50b81 Compare May 27, 2026 10:59