Honor executor srun_args for Ray COMMAND srun #440

Merged
hemildesai merged 4 commits into NVIDIA-NeMo:main from hemildesai:codex/honor-executor-srun-args-ray-command
Mar 3, 2026

Conversation

@hemildesai
Contributor

Summary

  • remove hardcoded permanent args from the Ray COMMAND launch srun path in ray.sub.j2
  • pass command_srun_args from SlurmRayRequest so launch args come from executor config (executor.srun_args)
  • for heterogeneous grouped runs, use resource_group[0].srun_args for the head/COMMAND launch path
  • add regression coverage to verify the generated COMMAND srun honors executor srun_args
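A rough sketch of the selection rule described in the bullets above, for readers skimming the PR. This is not the code from this change: `ExecutorConfig`, `select_command_srun_args`, and `build_command_srun` are hypothetical names, and the real template in ray.sub.j2 may assemble the command differently. It only illustrates the idea that the COMMAND srun takes its flags from executor config rather than from hardcoded values, and that a heterogeneous grouped run uses the first resource group's args.

```python
import shlex
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class ExecutorConfig:
    """Hypothetical stand-in for an executor; only srun_args matters here."""

    srun_args: List[str] = field(default_factory=list)


def select_command_srun_args(
    executor: ExecutorConfig,
    resource_group: Optional[Sequence[ExecutorConfig]] = None,
) -> List[str]:
    """Pick the srun args for the head/COMMAND launch.

    Assumption: for heterogeneous grouped runs the first resource group
    supplies the args; otherwise the executor's own srun_args are used.
    """
    if resource_group:
        return list(resource_group[0].srun_args)
    return list(executor.srun_args)


def build_command_srun(command: str, srun_args: Sequence[str]) -> str:
    """Assemble an srun line with no hardcoded extra flags (illustrative only)."""
    parts = ["srun", *srun_args, "bash", "-c", command]
    return " ".join(shlex.quote(part) for part in parts)


# Homogeneous run: args come straight from the executor.
executor = ExecutorConfig(srun_args=["--mpi=pmix"])
homogeneous_cmd = build_command_srun(
    "python train.py", select_command_srun_args(executor)
)

# Heterogeneous grouped run: the first group's args are used for COMMAND.
groups = [
    ExecutorConfig(srun_args=["--exclusive"]),
    ExecutorConfig(srun_args=["--overlap"]),
]
grouped_args = select_command_srun_args(executor, groups)
```

A regression check in this style would assert that the configured flag appears in the generated command and that nothing outside the configured args was injected.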

Testing

  • uv run pytest test/run/ray/test_slurm_ray_request.py -q

athitten previously approved these changes Mar 3, 2026

LGTM, thank you @hemildesai!

Signed-off-by: Hemil Desai <hemild@nvidia.com>
@hemildesai hemildesai merged commit 7640137 into NVIDIA-NeMo:main Mar 3, 2026
22 of 24 checks passed

2 participants