Skip to content

Conversation

@milesial
Copy link
Contributor

@milesial milesial commented Oct 29, 2025

Overview:

See #3630 for context, this MR is a subset of it, focusing on HTTP fetch and b64 decode.
E2E plumbing will be done in a followup MR, the current behavior of URL passthrough stays the default.

image 506888486-0ff8a654-e710-4554-a9c3-22ee04d93fe5

Summary by CodeRabbit

  • New Features

    • Added multimodal media loading support enabling images and videos to be fetched from data URLs and HTTP/HTTPS sources.
    • Media processing now runs asynchronously for improved responsiveness.
  • Chores

    • Refactored media handling to support asynchronous operations.

@milesial milesial requested a review from a team as a code owner October 29, 2025 20:51
@github-actions github-actions bot added the feat label Oct 29, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 29, 2025

Walkthrough

Dependencies were added to support media fetching (reqwest, base64). The preprocessor's core functions were converted to async to enable multimodal media loading. New MediaLoader and EncodedMediaData types were introduced to handle media retrieval from URLs. All call sites were updated to use await syntax.

Changes

Cohort / File(s) Change Summary
Dependency Updates
lib/llm/Cargo.toml
Added reqwest (workspace) and base64 (0.22) for media operations; added mockito (1.7.0) as dev dependency
Preprocessor Core Async Conversion
lib/llm/src/preprocessor.rs
Converted preprocess_request and gather_multi_modal_data to async functions; added media_loader: Option<MediaLoader> field; integrated async media fetching workflow with conditional loader execution
Media Module Implementation
lib/llm/src/preprocessor/media/mod.rs, lib/llm/src/preprocessor/media/common.rs, lib/llm/src/preprocessor/media/loader.rs
Created new media submodule with EncodedMediaData struct for handling data/http(s) URLs, including base64 decoding; introduced MediaLoader struct with async fetch_media_part method for fetching image/video content
Test Updates
lib/llm/tests/preprocessor.rs
Updated preprocessor test to await async preprocess_request call

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Async/await conversion affects multiple call sites and function signatures
  • New media handling module introduces network I/O and URL parsing logic
  • Error handling chains (anyhow results) and encoding/decoding operations require careful verification
  • Particular attention needed on:
    • Correctness of async propagation through preprocessor call chain
    • EncodedMediaData URL scheme handling (data:, http, https) and base64 decoding logic
    • MediaLoader's content part type matching and error propagation
    • Test assertions align with async behavior

Poem

🐰 Hops through URLs, both high and low,
Base64 decoded, media flows,
Async await, the loader's song,
Multimodal dreams, oh so strong!

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The provided description is incomplete compared to the repository's PR description template. While an Overview section is present and provides context (referencing issue #3630 and noting that E2E plumbing will follow), the description is missing the Details section, the "Where should the reviewer start?" section, and a properly formatted Related Issues section using action keywords. The Overview alone is insufficient to meet the template's requirements, as it lacks substantive information about the specific changes made and guidance for reviewers. The author should add the missing sections: expand the Details section to describe what was changed and why (async function signatures, new media module structure, dependency additions), add a "Where should the reviewer start?" section highlighting key files like preprocessor.rs and the new media module files, and include a Related Issues section using a proper action keyword (e.g., "Relates to #3630") to link to the related issue.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "feat: Media HTTP fetching and b64 decoding" clearly and concisely summarizes the main changes introduced in this PR. The changes focus on adding HTTP media fetching capabilities and base64 decoding support, which aligns directly with the implementation of the MediaLoader, EncodedMediaData, and related async infrastructure for media handling. The title is specific enough to convey the primary purpose without being overly verbose.
Docstring Coverage ✅ Passed Docstring coverage is 82.35% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fe5653 and 0860585.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • lib/llm/Cargo.toml (2 hunks)
  • lib/llm/src/preprocessor.rs (10 hunks)
  • lib/llm/src/preprocessor/media/common.rs (1 hunks)
  • lib/llm/src/preprocessor/media/loader.rs (1 hunks)
  • lib/llm/src/preprocessor/media/mod.rs (1 hunks)
  • lib/llm/tests/preprocessor.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
PR: ai-dynamo/dynamo#1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

Applied to files:

  • lib/llm/tests/preprocessor.rs
  • lib/llm/src/preprocessor.rs
📚 Learning: 2025-09-16T19:47:30.312Z
Learnt from: KrishnanPrash
PR: ai-dynamo/dynamo#3067
File: lib/llm/src/preprocessor/prompt/template/oai.rs:87-134
Timestamp: 2025-09-16T19:47:30.312Z
Learning: In Dynamo, multimodal requests (containing image_url or other non-text content) are processed through a completely different workflow than text-only requests, so the may_be_fix_msg_content function in lib/llm/src/preprocessor/prompt/template/oai.rs will only encounter text-only content arrays.

Applied to files:

  • lib/llm/tests/preprocessor.rs
  • lib/llm/src/preprocessor.rs
📚 Learning: 2025-08-18T20:51:51.324Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#2465
File: lib/runtime/src/pipeline/network/egress/push_router.rs:0-0
Timestamp: 2025-08-18T20:51:51.324Z
Learning: The runtime crate cannot depend on the llm crate due to architectural dependency constraints, preventing imports from lib/llm into lib/runtime.

Applied to files:

  • lib/llm/Cargo.toml
📚 Learning: 2025-08-25T22:04:45.205Z
Learnt from: nachiketb-nvidia
PR: ai-dynamo/dynamo#2700
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:19-28
Timestamp: 2025-08-25T22:04:45.205Z
Learning: The response_generator() method exists on multiple request types in the codebase: NvCreateChatCompletionRequest (for chat completions) and NvCreateCompletionRequest (for text completions). When making signature changes, it's important to distinguish between these different object types as they have separate implementations and call sites.

Applied to files:

  • lib/llm/src/preprocessor.rs
🧬 Code graph analysis (3)
lib/llm/src/preprocessor/media/loader.rs (1)
lib/llm/src/preprocessor/media/common.rs (1)
  • from_url (17-43)
lib/llm/src/preprocessor/media/common.rs (1)
lib/llm/src/preprocessor/media/loader.rs (1)
  • new (19-25)
lib/llm/src/preprocessor.rs (2)
lib/llm/src/protocols/common/preprocessor.rs (2)
  • builder (95-97)
  • builder (132-134)
lib/llm/src/preprocessor/prompt.rs (1)
  • messages (51-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: vllm (amd64)
  • GitHub Check: operator (arm64)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang (arm64)
  • GitHub Check: operator (amd64)
  • GitHub Check: sglang (amd64)
  • GitHub Check: clippy (.)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: tests (.)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (15)
lib/llm/Cargo.toml (2)

174-174: LGTM! Appropriate test dependency for HTTP mocking.

The mockito 1.7.0 dev dependency enables HTTP endpoint mocking in tests, as demonstrated in lib/llm/src/preprocessor/media/common.rs.


143-145: Dependencies correctly configured for media fetching.

Verification confirms reqwest (v0.12.22) is properly defined in workspace.dependencies with appropriate features, and lib/llm/Cargo.toml correctly references it via { workspace = true }. The base64 v0.22 dependency is appropriately specified for encoding support.

lib/llm/tests/preprocessor.rs (1)

554-554: LGTM! Correct async usage.

The test properly awaits the now-async preprocess_request method, consistent with the preprocessor's async conversion for media loading support.

lib/llm/src/preprocessor/media/mod.rs (1)

1-8: LGTM! Clean module structure.

The module properly declares submodules and exposes a minimal public API surface through EncodedMediaData and MediaLoader.

lib/llm/src/preprocessor/media/loader.rs (2)

19-25: LGTM! Simple and correct HTTP client initialization.

The constructor properly initializes the reqwest client with a user agent and propagates errors via Result<Self>.


27-50: LGTM! Clear media type handling with appropriate TODOs.

The method correctly:

  • Delegates URL fetching to EncodedMediaData::from_url
  • Supports ImageUrl and VideoUrl
  • Explicitly rejects AudioUrl with a clear error message
  • Documents future work (decode, NIXL registration) via TODOs

The TODOs align with the PR objectives stating that E2E plumbing will be added in a follow-up MR.

lib/llm/src/preprocessor/media/common.rs (4)

8-12: LGTM! Simple and appropriate data structure.

The struct cleanly separates raw bytes from the encoding state, enabling deferred base64 decoding.


17-43: LGTM! Correct URL handling with appropriate scheme support.

The implementation properly:

  • Handles data: URLs by extracting base64 content after the comma
  • Fetches http/https URLs via reqwest with error propagation
  • Validates non-empty responses
  • Rejects unsupported schemes with clear error messages

The deferred base64 decoding (line 26 stores as-is) aligns with the comment on line 16 about avoiding expensive operations in the tokio runtime.


46-52: LGTM! Correct conditional base64 decoding.

The method properly decodes base64-encoded bytes when the flag is set, using the standard base64 engine.


55-147: LGTM! Comprehensive test coverage.

The tests thoroughly cover:

  • Valid base64 data URLs (decode verification)
  • Empty base64 data (error case)
  • Invalid data URL format (error case)
  • HTTP 200 success with mockito
  • HTTP 404 error handling
  • Unsupported URL schemes (FTP)

This provides strong confidence in the implementation's correctness.

lib/llm/src/preprocessor.rs (5)

14-14: LGTM! Clean module addition.

The media module is properly exposed and MediaLoader is imported for use in the preprocessor.

Also applies to: 30-30


117-117: LGTM! Infrastructure prepared for future media loading.

The media_loader field is properly initialized to None, preserving current behavior (URL passthrough). The TODO on line 146 correctly notes that enabling this requires decoder configuration from the MDC, planned for a follow-up MR per the PR objectives.

Also applies to: 146-146, 154-154


274-345: LGTM! Correct async multimodal data gathering with appropriate TODOs.

The implementation correctly:

  • Converts the method to async (line 274)
  • Queues fetch tasks when media_loader is present (lines 315-324)
  • Falls back to URL passthrough when no loader is configured (lines 318-323)
  • Executes fetch tasks concurrently via join_all (lines 330-338)

Note: Fetched results (line 330 _results) are intentionally unused, as indicated by the TODO on line 337. This aligns with the PR objectives stating that E2E plumbing will be added in a follow-up MR. Since media_loader is always None (line 146), the fetch path is currently never executed, preserving existing behavior.


862-862: LGTM! Call sites properly updated to await async methods.

Both preprocess_request (line 862) and gather_multi_modal_data (line 989) are correctly awaited in their respective generate implementations.

Also applies to: 989-989


168-168: Verification confirmed: All call sites properly await the async function.

The script output shows both discovered call sites of preprocess_request are correctly using .await:

  • lib/llm/tests/preprocessor.rs:554 uses .await.unwrap()
  • lib/llm/src/preprocessor.rs:862 uses .await?

No changes are needed.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: Alexandre Milesi <[email protected]>
Signed-off-by: Alexandre Milesi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants