feat: Add security flag to MM flow in vllm #4556
Conversation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- components/src/dynamo/vllm/args.py (4 hunks)
- docs/backends/vllm/multimodal.md (1 hunk)
- examples/backends/vllm/launch/agg_multimodal_epd.sh (1 hunk)
- examples/backends/vllm/launch/agg_multimodal_llama.sh (1 hunk)
- examples/backends/vllm/launch/disagg_multimodal_epd.sh (2 hunks)
- examples/backends/vllm/launch/disagg_multimodal_llama.sh (1 hunk)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: ptarasiewiczNV
Repo: ai-dynamo/dynamo PR: 2027
File: container/deps/vllm/install_vllm.sh:0-0
Timestamp: 2025-07-22T10:22:28.972Z
Learning: The `--torch-backend=auto` flag works with vLLM installations via uv pip install, even though it's not a standard pip option. This flag is processed by vLLM's build system during installation to automatically match PyTorch distribution with container CUDA versions.
📚 Learning: 2025-10-28T05:48:37.621Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_utils/model.py:39-42
Timestamp: 2025-10-28T05:48:37.621Z
Learning: In components/src/dynamo/vllm/multimodal_utils/model.py, the AutoModel.from_pretrained call with trust_remote_code=True in the load_vision_model function is intentional and expected for the vLLM multimodal implementation.
Applied to files:
examples/backends/vllm/launch/agg_multimodal_llama.sh
📚 Learning: 2025-07-03T10:14:30.570Z
Learnt from: fsaady
Repo: ai-dynamo/dynamo PR: 1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:230-244
Timestamp: 2025-07-03T10:14:30.570Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, background processes (like nats-server, etcd) are intentionally left running even if later processes fail. This design choice allows users to manually connect to nodes and debug issues without having to restart the entire SLURM job from scratch, providing operational flexibility for troubleshooting in cluster environments.
Applied to files:
examples/backends/vllm/launch/disagg_multimodal_llama.sh
🪛 LanguageTool
docs/backends/vllm/multimodal.md
[grammar] ~27-~27: Ensure spelling is correct
Context: ...out --enable-multimodal. This flag is analogus to --enable-mm-embeds in vllm serve. ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 Ruff (0.14.5)
components/src/dynamo/vllm/args.py
234-234: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: operator (amd64)
- GitHub Check: trtllm (arm64)
- GitHub Check: trtllm (amd64)
- GitHub Check: sglang (amd64)
- GitHub Check: vllm (amd64)
- GitHub Check: vllm (arm64)
- GitHub Check: sglang (arm64)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (8)
examples/backends/vllm/launch/agg_multimodal_llama.sh (1)
14-17: LGTM! Security flag correctly added. The `--enable-multimodal` flag is properly added to both the processor and encode-prefill-worker invocations, aligning with the security requirement to prevent unintended multimodal data processing.
examples/backends/vllm/launch/disagg_multimodal_llama.sh (1)
51-59: LGTM! Consistent security flag application. The `--enable-multimodal` flag is correctly added to all three multimodal components (processor, encode-prefill-worker, and decode-worker) in the disaggregated setup, ensuring comprehensive security coverage.
examples/backends/vllm/launch/agg_multimodal_epd.sh (1)
78-82: LGTM! Security flag properly applied. The `--enable-multimodal` flag is correctly added to all three multimodal components (processor, encode-worker, and multimodal-worker), ensuring the security requirement is enforced across the aggregated EPD setup.
components/src/dynamo/vllm/args.py (4)
63-63: LGTM! Good security-by-default design. The `enable_multimodal` flag defaults to `False`, following the security principle of requiring explicit opt-in for multimodal processing.
163-167: LGTM! Clear CLI argument definition. The `--enable-multimodal` CLI argument is properly defined with `store_true` action and clear help text explaining its purpose.
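For readers who haven't opened args.py, the pattern described above can be sketched with a minimal argparse parser. This is a hypothetical reconstruction based solely on the review comments: the component flag names (`--multimodal-processor`, `--multimodal-encode-worker`) are assumptions, not verified against the actual file; only `--enable-multimodal` and its `store_true` behavior are taken from the review.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Sketch of the additions the review describes; real flag names in
    # dynamo's args.py may differ except for --enable-multimodal itself.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--enable-multimodal",
        action="store_true",  # absent from argv -> False: multimodal is opt-in
        help="Explicitly enable multimodal processing for this worker.",
    )
    # Assumed component flags, for illustration only:
    parser.add_argument("--multimodal-processor", action="store_true")
    parser.add_argument("--multimodal-encode-worker", action="store_true")
    return parser


args = build_parser().parse_args(["--multimodal-processor", "--enable-multimodal"])
print(args.enable_multimodal)  # True
```

With `store_true`, omitting the flag yields `False`, which is what gives the design its secure-by-default property.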
233-234: LGTM! Effective validation logic. The validation correctly enforces that any multimodal component flag requires `--enable-multimodal` to be set, preventing unintended multimodal data processing. The error message is clear and actionable. Note: the Ruff TRY003 hint is a style preference suggesting shorter messages or custom exception classes, but the current implementation is perfectly acceptable for this use case.
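The validation rule described here ("any multimodal component flag requires `--enable-multimodal`") might look roughly like the following sketch. The helper name, the attribute names, and the exact error wording are assumptions for illustration; only the rule itself comes from the review.

```python
from types import SimpleNamespace


def validate_multimodal_args(args) -> None:
    # Hypothetical version of the check at args.py lines 233-234: if any
    # multimodal component is requested, the explicit opt-in must be present.
    component_flags = [
        getattr(args, name, False)
        for name in ("multimodal_processor", "multimodal_encode_worker", "multimodal_worker")
    ]
    if any(component_flags) and not getattr(args, "enable_multimodal", False):
        raise ValueError(
            "Multimodal components require --enable-multimodal to be set explicitly."
        )


# A multimodal processor launched without the opt-in flag fails fast:
try:
    validate_multimodal_args(
        SimpleNamespace(multimodal_processor=True, enable_multimodal=False)
    )
except ValueError as exc:
    print(exc)
```

Text-only workers (no component flags set) pass the check unchanged, so existing non-multimodal deployments are unaffected.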
274-274: LGTM! Proper flag propagation. The `enable_multimodal` flag is correctly propagated from CLI args to the config object, consistent with the pattern used for other configuration flags.
examples/backends/vllm/launch/disagg_multimodal_epd.sh (1)
79-99: LGTM! Comprehensive security flag coverage. The `--enable-multimodal` flag is correctly added to all four multimodal components (processor, encode-worker, prefill-worker, and decode-worker) in the disaggregated EPD setup, ensuring complete security coverage across the entire multimodal pipeline.
New changes look good to me 👍. Great Work 💯
I believe the ARM container builds in CI are experiencing network problems, so you may have to re-run failed jobs a few times before you get a successful run.
/ok to test 6144eb1
Also, with the most recent changes, I believe the script runs for test cases on our GitLab vllm_gpu_2 job.
Thanks for spotting this!! :)
dmitry-tokarev-nv left a comment
LGTM. Let's test it well.
Updating branch to pick up CI fixes from #4561.
Co-authored-by: Indrajit Bhosale <[email protected]>
Co-authored-by: KrishnanPrash <[email protected]>
Overview:
Adds a required `--enable-multimodal` security flag to explicitly enable multimodal processing in vLLM workers, preventing unintended processing of multimodal data from untrusted sources. Similar to https://github.com/vllm-project/vllm/pull/27204/files
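In practice, the new requirement shows up at launch time: every multimodal component invocation must now carry the opt-in flag, as the updated launch scripts do. The sketch below is illustrative only; the component flag and model name are placeholders modeled on this PR's launch scripts, not a verified command line.

```shell
# Hypothetical launch sketch: a multimodal component must pass the opt-in
# flag explicitly. Omitting --enable-multimodal now causes the worker to
# exit with a validation error instead of silently accepting multimodal data.
python -m dynamo.vllm \
  --multimodal-processor \
  --enable-multimodal \
  --model llava-hf/llava-1.5-7b-hf
```

Text-only workers need no change, since the flag is only required when a multimodal component flag is present.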
Details:
Where should the reviewer start?
Summary by CodeRabbit
New Features
New `--enable-multimodal` flag for explicit multimodal processing control. This flag is now required when using multimodal components.
Documentation
Chores