From 08c331cbe8bf5524571e7421345d5169a96b845b Mon Sep 17 00:00:00 2001
From: dagil-nvidia
Date: Mon, 24 Nov 2025 13:14:14 -0800
Subject: [PATCH 1/2] docs: update support matrix for Dynamo 0.7.0 (#4550)

Signed-off-by: Dan Gil
---
 docs/reference/support-matrix.md | 54 +++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/docs/reference/support-matrix.md b/docs/reference/support-matrix.md
index 27e554d1ef..68e7baeaf7 100644
--- a/docs/reference/support-matrix.md
+++ b/docs/reference/support-matrix.md
@@ -13,11 +13,8 @@ This document provides the support matrix for Dynamo, including hardware, softwa
 | **CPU Architecture** | **Status** |
 | :------------------- | :----------- |
 | **x86_64** | Supported |
-| **ARM64** | Experimental |
+| **ARM64** | Supported |
 
-> [!Warning]
-> While **x86_64** architecture is supported on systems with a minimum of 32 GB RAM and at least 4 CPU cores,
-> the **ARM64** support is experimental and may have limitations.
 
 ### GPU Compatibility
 
@@ -38,11 +35,10 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 | :------------------- | :---------- | :--------------- | :----------- |
 | **Ubuntu** | 22.04 | x86_64 | Supported |
 | **Ubuntu** | 24.04 | x86_64 | Supported |
-| **Ubuntu** | 24.04 | ARM64 | Experimental |
+| **Ubuntu** | 24.04 | ARM64 | Supported |
 | **CentOS Stream** | 9 | x86_64 | Experimental |
 
 > [!Note]
-> For **Linux**, the **ARM64** support is experimental and may have limitations.
 > Wheels are built using a manylinux_2_28-compatible environment and they have been validated on CentOS 9 and Ubuntu (22.04, 24.04).
 >
 > Compatibility with other Linux distributions is expected but has not been officially verified yet.
@@ -62,16 +58,22 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 ### Build Dependency
 
-| **Build Dependency** | **Version** |
+| **Build Dependency** | **Version as of Dynamo v0.7.0** |
 | :------------------- | :------------------------------------------------------------------------------- |
-| **TensorRT-LLM** | 1.1.0rc5 |
+| **SGLang** | 0.5.3.post4 |
+| **TensorRT-LLM** | 1.2.0rc2 |
+| **vLLM** | 0.11.0 |
 | **NIXL** | 0.7.1 |
-| **vLLM** | 0.10.1.1 |
-| **SGLang** | 0.5.3rc0 |
+
 
 > [!Important]
 > Specific versions of TensorRT-LLM supported by Dynamo are subject to change. Currently TensorRT-LLM does not support Python 3.11 so installation of the ai-dynamo[trtllm] will fail.
 
+### CUDA Support by Framework
+| **Dynamo Version** | **SGLang** | **TensorRT-LLM** | **vLLM** |
+| :------------------- | :-----------------------| :-----------------------| :-----------------------|
+| **Dynamo 0.7.0** | CUDA 12.8 | CUDA 13.0 | CUDA 12.8 |
+
 ## Cloud Service Provider Compatibility
 
 ### AWS
 
@@ -81,21 +83,37 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 | **Amazon Linux** | 2023 | x86_64 | Supported¹ |
 
 > [!Caution]
-> ¹ There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with more precise networking configuration by mapping only the necessary ports (e.g., 4222 for nats, 2379/2380 for etcd, 8000 for frontend).
+> There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with more precise networking configuration by mapping only the necessary ports (e.g., 4222 for nats, 2379/2380 for etcd, 8000 for frontend).
 
 ## Build Support
 
 **Dynamo** currently provides build support in the following ways:
 
-- **Wheels**: Pre-built Python wheels are only available for **x86_64 Linux**.
-  No wheels are available for other platforms at this time.
+- **Wheels**: We distribute Python wheels of Dynamo and KV Block Manager:
+  - [ai-dynamo](https://pypi.org/project/ai-dynamo/)
+  - [ai-dynamo-runtime](https://pypi.org/project/ai-dynamo-runtime/)
+  - **New as of Dynamo v0.7.0:** [kvbm](https://pypi.org/project/kvbm/) as a standalone implementation.
+
+- **Dynamo Runtime Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Runtime for each of the LLM inference frameworks on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
+  - [SGLang](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime)
+  - [TensorRT-LLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime)
+  - [vLLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)
+
+- **Dynamo Kubernetes Operator Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Operator on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
+  - [kubernetes-operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator) to simplify deployments of Dynamo Graphs.
 
-- **Runtime Container Images**: We distribute only **AMD64** images of the runtime target on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) for [TensorRT-LLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime), [vLLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime), and [SGLang](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime).
-  Users must build the container image from source if they require an **ARM64** image.
+- **Dynamo Frontend Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Frontend on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
+  - **New as of Dynamo v0.7.0:** [dynamo-frontend](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend) as a standalone implementation.
 
-- **Deployment-supportive Images**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the [Dynamo kubernetes-operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator) to simplify deployments of Dynamo Graphs.
-  It is currently provided as an **AMD64** image only.
+- **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo:
+  - [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds)
+  - [Dynamo Platform](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-platform)
+  - [Dynamo Graph](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-graph)
 
-- **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo. [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds), [Dynamo Platform](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-platform), and [Dynamo Graph](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-graph) are available.
+- **Rust Crates**:
+  - [dynamo-runtime](https://crates.io/crates/dynamo-runtime/)
+  - [dynamo-async-openai](https://crates.io/crates/dynamo-async-openai/)
+  - [dynamo-parsers](https://crates.io/crates/dynamo-parsers/)
+  - [dynamo-llm](https://crates.io/crates/dynamo-llm/)
 
 Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the instructions in the [Quick Start Guide](https://github.com/ai-dynamo/dynamo/blob/main/README.md#installation).

From f87c6a257e1054b63be277c84ae45837e752944b Mon Sep 17 00:00:00 2001
From: dagil-nvidia
Date: Mon, 24 Nov 2025 19:22:17 -0600
Subject: [PATCH 2/2] Remove Dynamo Frontend images from support matrix

Removed information about Dynamo Frontend images from the support matrix.

Signed-off-by: dagil-nvidia
---
 docs/reference/support-matrix.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/docs/reference/support-matrix.md b/docs/reference/support-matrix.md
index 68e7baeaf7..2efb446874 100644
--- a/docs/reference/support-matrix.md
+++ b/docs/reference/support-matrix.md
@@ -102,9 +102,6 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 - **Dynamo Kubernetes Operator Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Operator on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
   - [kubernetes-operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator) to simplify deployments of Dynamo Graphs.
 
-- **Dynamo Frontend Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Frontend on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
-  - **New as of Dynamo v0.7.0:** [dynamo-frontend](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend) as a standalone implementation.
-
 - **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo:
   - [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds)
   - [Dynamo Platform](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-platform)