Add sbatch network arg #230

Merged
hemildesai merged 1 commit into NVIDIA-NeMo:main from youngeunkwon0405:add_sbatch_network_arg
May 9, 2025

Conversation

@youngeunkwon0405
Contributor

Example of setting #SBATCH --network=sharp through the new network argument:

    import os

    import nemo_run as run

    # account, partition, log_dir, nodes, etc. are assumed to be defined by the calling script.
    executor = run.SlurmExecutor(
        account=account,
        partition=partition,
        tunnel=run.LocalTunnel(
            job_dir=os.path.join(log_dir, "experiments"),
        ),
        nodes=nodes,
        ntasks_per_node=num_gpus_per_node,
        container_image=container_image,
        container_mounts=mounts,
        env_vars=env_vars,
        srun_args=srun_args,
        time=time_limit,
        mem="0",
        exclusive=True,
        packager=run.GitArchivePackager(),
        segment=segment,
        network="sharp",  # emitted as "#SBATCH --network=sharp"
    )
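
To make the mapping from the keyword argument to the batch-script header concrete, here is a minimal, self-contained sketch of how executor options could be rendered as `#SBATCH` lines. It is not the code from this PR: `render_sbatch_header` and the option dictionary are hypothetical names used only for illustration, and the real executor may build its header differently.

```python
from typing import Dict, List, Optional


def render_sbatch_header(options: Dict[str, Optional[object]]) -> List[str]:
    """Turn keyword options into `#SBATCH` directive lines (illustrative only)."""
    lines = []
    for key, value in options.items():
        if value is None or value is False:
            continue  # unset or disabled options produce no directive
        flag = key.replace("_", "-")  # ntasks_per_node -> ntasks-per-node
        if value is True:
            lines.append(f"#SBATCH --{flag}")  # boolean switch, e.g. --exclusive
        else:
            lines.append(f"#SBATCH --{flag}={value}")
    return lines


# Hypothetical option values mirroring the example above.
header = render_sbatch_header(
    {
        "nodes": 2,
        "ntasks_per_node": 8,
        "mem": "0",
        "exclusive": True,
        "segment": None,
        "network": "sharp",
    }
)
print("\n".join(header))
```

With these assumed values, the last printed line is `#SBATCH --network=sharp`, and the unset `segment` option produces no directive.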

Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>
@erhoo82

erhoo82 commented May 9, 2025

Can you add the interface to set this in the perf script as well?

@youngeunkwon0405
Contributor Author

> Can you add the interface to set this in the perf script as well?

Sure, I can. I will create a NeMo PR for that.

@youngeunkwon0405
Contributor Author

> Can you add the interface to set this in the perf script as well?

Created a NeMo PR (13521). That PR also includes an interface to enable UBR.
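
For readers wondering what such an interface might look like, the sketch below shows one possible shape, assuming an argparse-based launch script. It is not taken from the NeMo PR: `parse_cli_args`, `slurm_executor_kwargs`, and the `--network` flag name are hypothetical and only illustrate forwarding an optional command-line value to the executor's `network` argument.

```python
import argparse


def parse_cli_args(argv=None):
    """Parse a hypothetical perf-script command line."""
    parser = argparse.ArgumentParser(description="Hypothetical perf-script options")
    parser.add_argument(
        "--network",
        type=str,
        default=None,
        help="Value for '#SBATCH --network', e.g. 'sharp'. Omitted if not set.",
    )
    return parser.parse_args(argv)


def slurm_executor_kwargs(args):
    """Collect only the optional executor keyword arguments the user actually set."""
    kwargs = {}
    if args.network is not None:
        kwargs["network"] = args.network
    return kwargs


# Example: the flag is given, so the value is forwarded as network="sharp".
args = parse_cli_args(["--network", "sharp"])
print(slurm_executor_kwargs(args))
```

Leaving the default at `None` keeps existing launch commands unchanged: when the flag is absent, no `network` keyword is passed on.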

@erhoo82

erhoo82 commented May 9, 2025

@ko3n1g We don't need CI for this change.

@hemildesai merged commit f3c3ac2 into NVIDIA-NeMo:main on May 9, 2025
18 of 20 checks passed

4 participants