Commit 58d6cf1

Update quick-start-recipe-for-gpt-oss-on-trtllm.md
Signed-off-by: dongfengy <[email protected]>
1 parent 3d4018e commit 58d6cf1

1 file changed: +1 -1 lines changed

docs/source/deployment-guide/quick-start-recipe-for-gpt-oss-on-trtllm.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ There are multiple MOE backends inside TensorRT LLM. Here are the support matrix
 
 | Device | Activation Type | MoE Weights Type | MoE Backend | Use Case |
 |---------------------- |-----------------|------------------|-------------|--------------------------------|
-| B200/GB200/B300/GB300 | MXFP8 | MXFP4 | TRTLLM | Low Latency and Max Throughput |
+| B200/GB200/B300/GB300 | MXFP8 | MXFP4 | TRTLLM | Low Latency and Max Throughput |
 
 The default moe backend is `CUTLASS`, so for the best possible perf, one must set the `moe_config.backend` explicitly to run the model.
 `CUTLASS` was better for max throughput at first but now we have optimized `TRTLLM` moe to be universally faster.
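
The context lines above say that `moe_config.backend` must be set explicitly because the default is `CUTLASS`. A minimal sketch of what that setting could look like is shown below. It is not part of this commit. It assumes the option is supplied as a nested `moe_config: backend:` key in a YAML options file; the helper names `render_moe_config` and `write_extra_options`, and the idea of generating the file from Python, are hypothetical and only illustrate the shape of the setting.

```python
from pathlib import Path

# Backends named in the changed document. Treating these two as the only
# accepted values is an assumption made for this sketch.
KNOWN_BACKENDS = {"CUTLASS", "TRTLLM"}


def render_moe_config(backend: str = "TRTLLM") -> str:
    """Return a YAML fragment that sets moe_config.backend explicitly.

    Hypothetical helper: the key layout mirrors the dotted name
    `moe_config.backend` used in the text.
    """
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown MoE backend: {backend!r}")
    return f"moe_config:\n  backend: {backend}\n"


def write_extra_options(path: str, backend: str = "TRTLLM") -> Path:
    """Write the fragment to `path` and return the resulting Path."""
    target = Path(path)
    target.write_text(render_moe_config(backend), encoding="utf-8")
    return target
```

Calling `write_extra_options("extra_llm_api_options.yaml")` would produce a two-line file that selects `TRTLLM` instead of relying on the `CUTLASS` default. The file name and the way the file is handed to the server are assumptions here; the deployment guide itself is the authority on both.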
