fix(frontend): Fix smg serve reject vLLM OpenAI args (#832)
slin1237 merged 6 commits into lightseekorg:main
Conversation
📝 Walkthrough
Dynamically load vLLM engine arg classes, add optional vLLM OpenAI frontend CLI arg registration, and inject those frontend args into serve argument parsing when the vLLM backend runs in HTTP connection mode.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7e796a884d
Code Review
This pull request updates the vLLM argument parsing in smg serve to correctly handle OpenAI-compatible server flags and adds corresponding regression tests. The changes to support additional vLLM arguments are well-implemented. However, I've found a critical bug in the argument parsing logic that could lead to a NameError or incorrect behavior. I've provided a specific comment with a suggested fix to address this issue.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindings/python/src/smg/serve.py`:
- Around lines 526-534: The control flow around backend argument parsing is wrong. Change the second `if` to `elif` so that the `args` set by `parser.parse_known_args` for `backend == "trtllm"` is not immediately overwritten, and ensure `args` is always defined before it is referenced. Specifically, update the block that calls `parser.parse_known_args` and `parser.parse_args` so that the condition using `getattr(args, "connection_mode", "grpc")` and the `logger.warning` call run only inside the `elif backend == "vllm"` branch, preserving the intended behavior for both `trtllm` and `vllm`.
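The suggested `elif` fix can be sketched as follows. This is a hypothetical reduction of `parse_serve_args` for illustration only; the real function registers many more flags:

```python
import argparse
import logging

logger = logging.getLogger(__name__)

def parse_backend_args(backend: str, argv: list[str]) -> argparse.Namespace:
    """Hypothetical reduction of parse_serve_args illustrating the fix.

    The bug was two independent `if` blocks: the second one re-parsed and
    overwrote the `args` produced for backend == "trtllm". Using `elif`
    keeps each branch's result and guarantees `args` is defined before
    the connection-mode check runs.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--connection-mode", default="grpc")
    if backend == "trtllm":
        # Unknown backend-specific flags are tolerated here.
        args, _extra = parser.parse_known_args(argv)
    elif backend == "vllm":
        args, _extra = parser.parse_known_args(argv)
        # The gRPC warning belongs only in the vllm branch.
        if getattr(args, "connection_mode", "grpc") == "grpc":
            logger.warning("vLLM frontend args are ignored in gRPC mode")
    else:
        args = parser.parse_args(argv)
    return args
```

With the original `if`/`if` shape, the `trtllm` result was clobbered and `args` could be read before assignment; the `elif` makes each path self-contained.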
In `@bindings/python/tests/test_serve.py`:
- Around lines 289-320: `parse_serve_args` in serve.py has a control-flow bug that leads to a `NameError` because `parser`/`backend_args` are referenced before being set when handling backend-specific args (e.g. `"vllm"`). Fix: create the `ArgumentParser` instance before any backend-specific call, invoke `_import_backend_args(backend, parser)` in the branch that handles backend-specific flags (or unconditionally after parser creation), then call `parser.parse_known_args()` (or otherwise collect `backend_args`) and assign `backend_args` on all code paths, so that `parser`, `args`, and `backend_args` are always defined before they are returned.
- Around lines 322-351: The test fails with a `NameError` due to the same control-flow bug in serve.py's `parse_serve_args`: backend-specific CLI args (via `_import_backend_args` / parser population) are not always registered before being referenced on the vllm + `connection_mode == "grpc"` path. Update `parse_serve_args` so it always constructs and populates the argparse parser (calling `_import_backend_args` or the vllm branch) before any logic that reads backend-specific arguments or calls `logger.warning`, and ensure any variables referenced in the gRPC-warning branch (the parser, args, or backend-specific flags) are initialized on every code path.
💡 Codex Review (reviewed commit: 3cc48129fe)
💡 Codex Review (reviewed commit: 5797d231be)
Codex usage limits have been reached for code reviews.
CatherineSue left a comment:
LGTM. Will merge after CI passes.
Signed-off-by: Arthur Cheng <arthur.cheng@oracle.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindings/python/src/smg/serve.py`:
- Around lines 380-393: The vLLM frontend args are registered wholesale in `_add_vllm_frontend_args`, which allows undesired transport flags (`--host`, `--port`, `--uds`) that break SMG HTTP worker routing. Update `_add_vllm_frontend_args` to import `FrontendArgs` (if present) but either (a) call `FrontendArgs.add_cli_args` against a filtered `argparse.ArgumentParser` with the `--host`, `--port`, and `--uds` options removed before registration, or (b) register only the safe subset of flags (e.g. `enable-prompt-tokens-details`, `enable-auto-tool-choice`) instead of calling `add_cli_args`. Ensure `VllmWorkerLauncher.build_command` still injects the SMG `--host`/`--port`, and add a regression test that starts a vLLM worker in HTTP mode with transport-layer flags present (including `--uds`) and verifies the worker binds to the SMG host:port (health checks succeed).
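Option (a) could be sketched roughly as below. The `register` callable stands in for `FrontendArgs.add_cli_args` (named in this review but not shown here), and the action copying relies on argparse internals (`_actions`, `_add_action`), so treat this as illustrative rather than the implementation:

```python
import argparse

# Transport flags that SMG must own: the worker is expected to bind to
# the host/port that SMG injects, never to user-supplied transport flags.
UNSAFE_FLAGS = {"--host", "--port", "--uds"}

def add_filtered_cli_args(parser, register, unsafe=UNSAFE_FLAGS):
    """Register CLI args via `register`, minus the unsafe transport flags.

    `register` would be FrontendArgs.add_cli_args in serve.py; it is a
    parameter here so the sketch stays testable without vLLM installed.
    """
    scratch = argparse.ArgumentParser(add_help=False)
    register(scratch)
    for action in scratch._actions:
        if any(opt in unsafe for opt in action.option_strings):
            continue  # drop --host / --port / --uds style options
        # argparse-internal API; works for simple option actions.
        parser._add_action(action)
```

A caller would do `add_filtered_cli_args(parser, FrontendArgs.add_cli_args)`; the filtered parser then rejects `--host` while still accepting the safe frontend flags.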
💡 Codex Review (reviewed commit: 1ea8c7b303)
Commit history pulled in by the rebase:
- …seekorg#1041) Signed-off-by: Ziwen Zhao <zzw.mose@gmail.com>
- feat(metrics-ws): [3/4] add event-driven and polled collectors (lightseekorg#1027) Signed-off-by: key4ng <rukeyang@gmail.com>
- fix(responses): stop defaulting top_p for omitted requests (lightseekorg#1043) Signed-off-by: Ziwen Zhao <zzw.mose@gmail.com>
- fix(frontend): Fix smg serve reject vLLM OpenAI args (lightseekorg#832) Signed-off-by: Arthur Cheng <arthur.cheng@oracle.com>
- feat(realtime-api): WebRTC relay bridge (lightseekorg#733) Signed-off-by: yifeliu <yifengliu9@gmail.com>
- feat: align image-generation MCP behavior and response formatting
- fix(image): only treat wrapped payload with result as image payload
- refactor(image): remove unused fallback candidate in wrapped extractor
- fix(image): preserve image tool arguments during sanitization
- refactor(image): only inject default model in builtin sanitizer
- fix(image): force override builtin image model in sanitizer
- fix(image): force png output_compression to 100
- revert(image): restore minimal sanitizer and document fallback
- refactor(responses): update image tool serde attr and preserve raw non-image output strings
- refactor(image): rename generate_image to image_generation
- revert(image): restore generate_image builtin naming
- fix(image-context): retain image failure status and error in compact output
- Fix
(image and responses commits Signed-off-by: TingtingZhou7 <zhoutt96@gmail.com>)
Description

Problem

Fixes #824

`smg serve` only registered vLLM engine args during pass-2 parsing. That meant valid vLLM OpenAI/frontend flags such as `--enable-auto-tool-choice` and `--tool-call-parser` were rejected before worker launch, even though they are valid for the HTTP/OpenAI vLLM entrypoint.

The existing parser shape also blurred the mode-specific CLI surface for vLLM: HTTP workers need frontend args, while gRPC workers do not.

Solution

Split vLLM arg registration into two parts:
- `_add_vllm_args()` now loads only the vLLM engine args
- a new `_add_vllm_frontend_args()` helper loads the OpenAI/frontend flags

Then, in `parse_serve_args()`, only add the frontend args when `--backend vllm` and `--connection-mode http` are selected. This keeps the parser aligned with the actual worker entrypoint.
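Assuming the helper names from this description, the resulting parser wiring might be sketched like this (the vLLM import is guarded, so the sketch degrades gracefully when vLLM is not installed; `build_serve_parser` is a hypothetical stand-in for the real pass-2 setup):

```python
import argparse

def _add_vllm_frontend_args(parser: argparse.ArgumentParser) -> None:
    # Load vLLM's OpenAI frontend flags only when the package is present.
    try:
        from vllm.entrypoints.openai.cli_args import FrontendArgs
    except ImportError:
        return
    FrontendArgs.add_cli_args(parser)

def build_serve_parser(backend: str, connection_mode: str) -> argparse.ArgumentParser:
    """Sketch of the pass-2 parser: frontend flags exist only for vLLM + HTTP."""
    parser = argparse.ArgumentParser(prog="smg serve")
    parser.add_argument("--backend", default=backend)
    parser.add_argument("--connection-mode", default=connection_mode)
    # gRPC workers keep the engine-only CLI surface; HTTP workers gain
    # the OpenAI frontend flags (--enable-auto-tool-choice, ...).
    if backend == "vllm" and connection_mode == "http":
        _add_vllm_frontend_args(parser)
    return parser
```

The design point is that the conditional lives in the parser builder, not in post-parse validation, so `--enable-auto-tool-choice` is accepted exactly when the HTTP/OpenAI entrypoint would accept it.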
Changes
- `bindings/python/src/smg/serve.py`:
  - `_add_vllm_args()` now loads only the vLLM engine CLI args
  - new `_add_vllm_frontend_args()` loads `vllm.entrypoints.openai.cli_args.FrontendArgs` when available
  - `parse_serve_args()` registers the frontend args conditionally on `serve_router_args.connection_mode == "http"`
- `bindings/python/tests/test_serve.py`: regression tests for `--enable-auto-tool-choice --tool-call-parser minimax_m2`

Test Plan

Before: