docs/guides/dynamo_run.md (+1 -1)
@@ -342,7 +342,7 @@ See instructions [here](/examples/tensorrt_llm/README.md#run-container) to run t
Execute the following to load the TensorRT-LLM model specified in the configuration.
```
- dynamo run out=pystr:/workspace/examples/tensorrt_llm/engines/agg_engine.py -- --engine_args /workspace/examples/tensorrt_llm/configs/llm_api_config.yaml
+ dynamo run out=pystr:/workspace/examples/tensorrt_llm/engines/trtllm_engine.py -- --engine_args /workspace/examples/tensorrt_llm/configs/llm_api_config.yaml
examples/tensorrt_llm/README.md (+42 -9)
@@ -25,6 +25,14 @@ This directory contains examples and reference implementations for deploying Lar
See [deployment architectures](../llm/README.md#deployment-architectures) to learn about the general idea of the architecture.
Note that this TensorRT-LLM version does not support all the options yet.

+ Note: TensorRT-LLM disaggregation does not yet support conditional disaggregation. You can only configure the deployment to always use aggregated or disaggregated serving.
+
+ ## Getting Started
+
+ 1. Choose a deployment architecture based on your requirements
+ 2. Configure the components as needed
+ 3. Deploy using the provided scripts
+
### Prerequisites
Start required services (etcd and NATS) using [Docker Compose](../../deploy/docker-compose.yml)
@@ -68,6 +76,29 @@ This build script internally points to the base container image built with step
```
## Run Deployment

+ This figure shows an overview of the major components to deploy: