diff --git a/gpu-operator/gpu-operator-rdma.rst b/gpu-operator/gpu-operator-rdma.rst
index 07974d315..9d8582615 100644
--- a/gpu-operator/gpu-operator-rdma.rst
+++ b/gpu-operator/gpu-operator-rdma.rst
@@ -35,7 +35,7 @@ new kernel module ``nvidia-peermem`` is included in the standard NVIDIA driver i
 kernel module provides Mellanox Infiniband-based HCAs direct peer-to-peer read and write access to the GPU's memory.
 Starting with v23.9.1 of the Operator, the Operator uses GDS driver version 2.17.5 or newer.
-This version and higher is only supported with the NVIDIA open kernel driver.
+This version and higher is only supported with the NVIDIA Open GPU Kernel module driver.
 The sample commands for installing the Operator include the ``--set useOpenKernelModules=true`` command-line argument for Helm.
@@ -386,7 +386,7 @@ The following section is applicable to the following configurations and describe
 * Kubernetes on bare metal and on vSphere VMs with GPU passthrough and vGPU.
 Starting with v22.9.1, the GPU Operator provides an option to load the ``nvidia-fs`` kernel module during the bootstrap of the NVIDIA driver daemonset.
-Starting with v23.9.1, the GPU Operator deploys a version of GDS that requires using the NVIDIA open kernel driver.
+Starting with v23.9.1, the GPU Operator deploys a version of GDS that requires using the NVIDIA Open GPU Kernel module driver.
 The following sample command applies to clusters that use the Network Operator to install the MLNX_OFED drivers.
diff --git a/gpu-operator/life-cycle-policy.rst b/gpu-operator/life-cycle-policy.rst
index 2d752a9c7..adb72afe9 100644
--- a/gpu-operator/life-cycle-policy.rst
+++ b/gpu-operator/life-cycle-policy.rst
@@ -159,7 +159,7 @@ Refer to :ref:`Upgrading the NVIDIA GPU Operator` for more information.
 .. _gds-open-kernel:
 :sup:`1`
- This release of the GDS driver requires that you use the NVIDIA open kernel driver for the GPUs.
+ This release of the GDS driver requires that you use the NVIDIA Open GPU Kernel module driver for the GPUs.
 Refer to :doc:`gpu-operator-rdma` for more information.
 .. note::
diff --git a/gpu-operator/platform-support.rst b/gpu-operator/platform-support.rst
index 28029b536..b811a8e31 100644
--- a/gpu-operator/platform-support.rst
+++ b/gpu-operator/platform-support.rst
@@ -41,19 +41,31 @@ Supported NVIDIA Data Center GPUs and Systems
 The following NVIDIA data center GPUs are supported on x86 based platforms:
+.. _open-kern-module: #requires-open-kernel-module
+.. |open-kern-module| replace:: :sup:`1`
+
 .. tab-set::
 .. tab-item:: GH-series Products
+
 .. list-table::
 :header-rows: 1
 * - Product
 - Architecture
- * - NVIDIA GH200
+ * - NVIDIA GH200 |open-kern-module|_
 - NVIDIA Grace Hopper
+ .. _requires-open-kernel-module:
+
+ :sup:`1`
+ NVIDIA GH200 systems require the NVIDIA Open GPU Kernel module driver.
+ You can install the open kernel modules by specifying the ``driver.useOpenKernelModules=true``
+ argument to the ``helm`` command.
+ Refer to :ref:`chart customization options` for more information.
+
 .. tab-item:: A, H and L-series Products
 :selected:
@@ -466,7 +478,7 @@ Supported operating systems and NVIDIA GPU Drivers with GPUDirect Storage.
 .. note::
 Version v2.17.5 and higher of the NVIDIA GPUDirect Storage kernel driver, ``nvidia-fs``,
- requires the NVIDIA open kernel modules.
+ requires the NVIDIA Open GPU Kernel module driver.
 You can install the open kernel modules by specifying the ``driver.useOpenKernelModules=true``
 argument to the ``helm`` command.
 Refer to :ref:`chart customization options` for more information.
diff --git a/gpu-operator/release-notes.rst b/gpu-operator/release-notes.rst
index feac27e6a..e257c0c09 100644
--- a/gpu-operator/release-notes.rst
+++ b/gpu-operator/release-notes.rst
@@ -50,8 +50,9 @@ New Features
 - Run Ubuntu 22.04 and an NVIDIA Linux kernel, such as one provided with a ``linux-nvidia-`` package.
 - Add ``init_on_alloc=0`` and ``memhp_default_state=online_movable`` as Linux kernel boot parameters.
+ - Run the NVIDIA Open GPU Kernel module driver.
-* Added support for configuring the driver container to use the NVIDIA open kernel modules.
+* Added support for configuring the driver container to use the NVIDIA Open GPU Kernel module driver.
 Support is limited to installation using the runfile installer.
 Support for precompiled driver containers with open kernel modules is not available.
@@ -59,6 +60,8 @@ New Features
 the NVIDIA GPUDirect Storage kernel driver version v2.17.5, are only supported with the open kernel modules.
+ NVIDIA GH200 Grace Hopper Superchip systems are only supported with the open kernel modules.
+
 - Refer to :ref:`gpu-operator-helm-chart-options` for information about setting ``useOpenKernelModules`` if you manage the driver containers with the NVIDIA cluster policy custom resource definition.
 - Refer to :doc:`gpu-driver-configuration` for information about setting ``spec.useOpenKernelModules``