This project is neither sponsored nor supported by NVIDIA.
Use of NVIDIA NVSHMEM is governed by the terms of the NVSHMEM Software License Agreement.
- GDRCopy (v2.4 and above recommended) is a low-latency GPU memory copy library based on NVIDIA GPUDirect RDMA technology; it requires kernel module installation with root privileges.
## Hardware requirements
- GPUDirect RDMA capable devices, see GPUDirect RDMA Documentation
- InfiniBand GPUDirect Async (IBGDA) support, see IBGDA Overview
- For more detailed requirements, see NVSHMEM Hardware Specifications
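A quick, non-destructive way to sanity-check these requirements on the host (the module and sysfs names below are common Linux defaults, not guaranteed on every driver version):

```shell
# Rough pre-flight check for the hardware requirements above.
# Paths and module names are assumptions based on common setups.
rdma_ok=no
if [ -d /sys/class/infiniband ] && [ -n "$(ls -A /sys/class/infiniband 2>/dev/null)" ]; then
    rdma_ok=yes
fi
echo "RDMA-capable device present: ${rdma_ok}"

# On recent drivers, GPUDirect RDMA is provided by the nvidia-peermem
# kernel module (older setups used nv_peer_mem).
if grep -qE '^(nvidia_peermem|nv_peer_mem) ' /proc/modules 2>/dev/null; then
    echo "GPUDirect RDMA kernel module loaded"
else
    echo "GPUDirect RDMA kernel module not loaded"
fi
```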
GDRCopy requires kernel module installation on the host system. Complete these steps on the bare-metal host before container deployment:
```shell
wget https://github.com/NVIDIA/gdrcopy/archive/refs/tags/v2.4.4.tar.gz
tar xzf v2.4.4.tar.gz
cd gdrcopy-2.4.4/
make -j$(nproc)
sudo make prefix=/opt/gdrcopy install
```
After building, install the packages appropriate to your Linux distribution. For example, on Ubuntu 22.04 with CUDA 12.3:
```shell
cd packages
CUDA=/path/to/cuda ./build-deb-packages.sh
sudo dpkg -i gdrdrv-dkms_2.4.4_amd64.Ubuntu22_04.deb \
            libgdrapi_2.4.4_amd64.Ubuntu22_04.deb \
            gdrcopy-tests_2.4.4_amd64.Ubuntu22_04+cuda12.3.deb \
            gdrcopy_2.4.4_amd64.Ubuntu22_04.deb
sudo ./insmod.sh  # Load the kernel module on the bare-metal system
```
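To confirm the module actually loaded, a small non-fatal check (a sketch; `/dev/gdrdrv` is the device node GDRCopy's library opens at runtime):

```shell
# Check whether the gdrdrv kernel module is loaded, without failing.
gdrdrv_loaded=no
grep -q '^gdrdrv ' /proc/modules 2>/dev/null && gdrdrv_loaded=yes
echo "gdrdrv loaded: ${gdrdrv_loaded}"

# The user-space library talks to the module through this character device:
if [ -c /dev/gdrdrv ]; then
    echo "/dev/gdrdrv present"
else
    echo "note: /dev/gdrdrv not present; re-run insmod.sh on the host"
fi
```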
For containerized environments:
- Host: keep the kernel module (`gdrdrv`) loaded
- Container: install the DEB packages without rebuilding the module:

```shell
sudo dpkg -i gdrcopy_2.4.4_amd64.Ubuntu22_04.deb \
            libgdrapi_2.4.4_amd64.Ubuntu22_04.deb \
            gdrcopy-tests_2.4.4_amd64.Ubuntu22_04+cuda12.3.deb
```
Verify with the bundled bandwidth test:

```shell
gdrcopy_copybw  # Should print bandwidth test results
```
Download NVSHMEM v3.1.7 from the NVIDIA NVSHMEM Archive.
Navigate to your NVSHMEM source directory and apply the provided patch (you can dry-run it first with `git apply --check`):

```shell
git apply /path/to/deep_ep/dir/third-party/nvshmem.patch
```
Enable IBGDA by adding the following line to `/etc/modprobe.d/nvidia.conf`:

```
options nvidia NVreg_EnableStreamMemOPs=1 NVreg_RegistryDwords="PeerMappingOverride=1;"
```
Regenerate the initramfs so the module options take effect at boot, then reboot:

```shell
sudo update-initramfs -u
sudo reboot
```
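After the reboot, you can check whether the driver picked up the registry options (a sketch; `/proc/driver/nvidia/params` only exists when the NVIDIA driver is loaded):

```shell
# Verify the NVreg options from /etc/modprobe.d/nvidia.conf took effect.
nvidia_params=""
if [ -r /proc/driver/nvidia/params ]; then
    nvidia_params=$(grep -E 'EnableStreamMemOPs|RegistryDwords' /proc/driver/nvidia/params || true)
fi
echo "${nvidia_params:-NVIDIA driver not loaded on this machine; nothing to check}"
```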
For more detailed configurations, please refer to the NVSHMEM Installation Guide.
The following example demonstrates building NVSHMEM with IBGDA support:
```shell
CUDA_HOME=/path/to/cuda \
GDRCOPY_HOME=/path/to/gdrcopy \
NVSHMEM_SHMEM_SUPPORT=0 \
NVSHMEM_UCX_SUPPORT=0 \
NVSHMEM_USE_NCCL=0 \
NVSHMEM_IBGDA_SUPPORT=1 \
NVSHMEM_PMIX_SUPPORT=0 \
NVSHMEM_TIMEOUT_DEVICE_POLLING=0 \
NVSHMEM_USE_GDRCOPY=1 \
cmake -S . -B build/ -DCMAKE_INSTALL_PREFIX=/path/to/your/dir/to/install
cd build
make -j$(nproc)
make install
```

Note that the environment variables must all prefix the single `cmake` invocation (joined by line continuations, not `&&`), so that CMake sees them.
Set environment variables in your shell configuration:
```shell
export NVSHMEM_DIR=/path/to/your/dir/to/install  # Used later for the DeepEP installation
export LD_LIBRARY_PATH="${NVSHMEM_DIR}/lib:$LD_LIBRARY_PATH"
export PATH="${NVSHMEM_DIR}/bin:$PATH"
```

Verify the installation:

```shell
nvshmem-info -a  # Should display details of the NVSHMEM build
```
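Before building DeepEP against this prefix, it can also help to sanity-check the install tree. This sketch assumes a typical NVSHMEM 3.x layout (`lib/libnvshmem_host.so`, `bin/nvshmem-info`); file names may differ across versions:

```shell
# Hypothetical sanity check of the NVSHMEM install prefix.
NVSHMEM_DIR="${NVSHMEM_DIR:-/path/to/your/dir/to/install}"
missing=0
for f in lib/libnvshmem_host.so bin/nvshmem-info; do
    if [ ! -e "${NVSHMEM_DIR}/${f}" ]; then
        echo "missing: ${NVSHMEM_DIR}/${f}"
        missing=1
    fi
done
if [ "$missing" -eq 0 ]; then
    echo "NVSHMEM install looks complete"
else
    echo "NVSHMEM install incomplete; re-check the build and install steps"
fi
```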