Skip to content

Conversation

@Monokaix
Copy link

@Monokaix Monokaix commented Nov 24, 2025

Overview:

make hostnames more descriptive and simplify dns check command

Details:

Changes:

  1. The hostnames var in mpirun -H command both include leader/worker pods, so it's more suitable to use all hostnames instead of only worker, or it will cause some misunderstanding
  2. The dns ready check should exclude leader pod itself and only check worker pods is enough, which is simpler and more efficient

Where should the reviewer start?

based on: #4477

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • Refactor

    • Improved internal hostname handling in the deployment backend to enhance multinode configuration support and deployment orchestration reliability.
  • Tests

    • Updated test suite to validate the refactored hostname generation logic.

✏️ Tip: You can customize this high-level summary in your review settings.

@Monokaix Monokaix requested a review from a team as a code owner November 24, 2025 06:27
@copy-pr-bot
Copy link

copy-pr-bot bot commented Nov 24, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Copy link

👋 Hi Monokaix! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors.Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the external-contribution Pull request is from an external contributor label Nov 24, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 24, 2025

Walkthrough

The PR refactors hostname generation in the TRTLLM backend, replacing a comma-separated string return type with a string slice return type. The function generateWorkerHostnames is renamed to hostNamesList, with callers updated to join the slice when needed for MPI configuration and DNS-wait logic in multinode deployments.

Changes

Cohort / File(s) Summary
Hostname generation refactoring
deploy/cloud/operator/internal/dynamo/backend_trtllm.go
Function generateWorkerHostnames replaced with hostNamesList returning []string instead of string. MPI host string now constructed via strings.Join(hostNamesList, ","). DNS-wait logic updated to use hostNamesList[1:] for worker list construction.
Test updates
deploy/cloud/operator/internal/dynamo/backend_trtllm_test.go
Test renamed to TestTRTLLMBackend_hostNamesList. Verification switched from substring containment checks to exact list equality using reflect.DeepEqual on the returned slice.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify that all call sites of hostNamesList properly handle the slice return type and join as needed
  • Confirm hostNamesList[1:] slicing logic in DNS-wait path is correct for multinode scenarios
  • Ensure test expectations align with refactored function behavior

Poem

A rabbit once tangled with strings all askew,
But slices of hostnames made jobs shiny new,
With join and with slice, the addresses unite,
MPI marches onward, the deployment's quite right! 🐰

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description check ✅ Passed The description covers all required template sections with specific details about the rationale and changes, though the related issue section remains as a placeholder.
Title check ✅ Passed The PR title accurately describes the main refactoring changes: making hostnames more descriptive (using all hostnames instead of just worker hostnames) and simplifying the DNS check command (excluding leader and checking only workers).

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Monokaix Monokaix changed the title make hostnames more descriptive and simplify dns check command refactor: make hostnames more descriptive and simplify dns check command Nov 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

external-contribution Pull request is from an external contributor refactor size/M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant