Base nvbandwidth on cuda image #26

Merged (1 commit) on Mar 18, 2025


guptaNswati
Contributor

For OSRB compliance, the nvbandwidth image needs to be based on a CUDA base image.

Single-node bandwidth test with the updated base image:

docker run -it -e NVIDIA_IMEX_CHANNELS=0 -e NVIDIA_VISIBLE_DEVICES=all nvbandwidth:0.1 --allow-run-as-root  -n 2 nvbandwidth -p multinode_device_to_device_memcpy_read_ce
nvbandwidth Version: v0.7
Built from Git version: v0.7

MPI version: Open MPI v4.1.2, package: Debian OpenMPI, ident: 4.1.2, repo rev: v4.1.2, Nov 24, 2021
CUDA Runtime Version: 12080
CUDA Driver Version: 12080
Driver Version: 570.86.15

Process 0 (c893d00c6756): device 0: NVIDIA GH200 96GB HBM3 (00000009:01:00)
Process 1 (c893d00c6756): device 1: NVIDIA GH200 96GB HBM3 (00000019:01:00)

Running multinode_device_to_device_memcpy_read_ce.
memcpy CE GPU(row) -> GPU(column) bandwidth (GB/s)
           0         1
 0       N/A    391.93
 1    391.88       N/A

SUM multinode_device_to_device_memcpy_read_ce 783.81

NOTE: The reported results may not reflect the full capabilities of the platform.
Performance can vary with software drivers, hardware clocks, and system topology.
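
As a quick sanity check on the output above, the reported SUM should equal the sum of the off-diagonal matrix entries (391.93 + 391.88 = 783.81). The following is a minimal sketch of that check, not part of this PR or of nvbandwidth itself; the helper names `parse_matrix` and `parse_sum` are hypothetical, and the parsing assumes the report layout matches the text pasted above.

```python
import re

# Matrix and SUM lines copied from the test output above.
SAMPLE = """\
memcpy CE GPU(row) -> GPU(column) bandwidth (GB/s)
           0         1
 0       N/A    391.93
 1    391.88       N/A

SUM multinode_device_to_device_memcpy_read_ce 783.81
"""


def parse_matrix(report):
    """Return {(row, col): GB/s} for every numeric cell of the matrix.

    Hypothetical helper: assumes the header row contains only column
    indices and each data row starts with its row index.
    """
    cells = {}
    cols = None
    for line in report.splitlines():
        tokens = line.split()
        if cols is None:
            # The header row is the first line made up only of integers.
            if tokens and all(t.isdigit() for t in tokens):
                cols = [int(t) for t in tokens]
            continue
        # Any line that is not "<row> <v0> <v1> ..." ends the matrix.
        if len(tokens) != len(cols) + 1 or not tokens[0].isdigit():
            break
        row = int(tokens[0])
        for col, value in zip(cols, tokens[1:]):
            if value != "N/A":  # diagonal entries are not measured
                cells[(row, col)] = float(value)
    return cells


def parse_sum(report):
    """Return the value on the 'SUM <testcase> <value>' line, or None."""
    match = re.search(r"^SUM\s+\S+\s+([0-9.]+)\s*$", report, re.MULTILINE)
    return float(match.group(1)) if match else None


cells = parse_matrix(SAMPLE)
total = sum(cells.values())
reported = parse_sum(SAMPLE)
# Values are printed with two decimals, so allow for rounding.
consistent = reported is not None and abs(total - reported) < 0.01
print(f"computed={total:.2f} reported={reported} consistent={consistent}")
```

The tolerance is there because the report rounds each cell to two decimals, so the printed cells and the printed SUM may differ slightly on other runs.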


copy-pr-bot bot commented Mar 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@guptaNswati
Contributor Author

/ok-to-test


Copilot wasn't able to review any files in this pull request.

Files not reviewed (1)
  • deployments/container/nvbandwidth/Dockerfile: Language not supported
@ArangoGutierrez ArangoGutierrez merged commit 82668fb into NVIDIA:main Mar 18, 2025
9 checks passed