Commit 9da4664

Update quick-start-recipe-for-gpt-oss-on-trtllm.md
Signed-off-by: dongfengy <[email protected]>
1 parent ff7a29f commit 9da4664

File tree

1 file changed (+3, -3 lines)


docs/source/deployment-guide/quick-start-recipe-for-gpt-oss-on-trtllm.md

Lines changed: 3 additions & 3 deletions
@@ -23,9 +23,9 @@ The guide is intended for developers and practitioners seeking high-throughput o
 
 There are multiple MoE backends inside TensorRT LLM. Here is the support matrix of the MoE backends.
 
-| Device | Activation Type | MoE Weights Type | MoE Backend | Use Case |
-|------------|------------------|------------------|-------------|----------------|
-| B200/GB200/B300/GB300 | MXFP8 | MXFP4 | TRTLLM | Low Latency and Max Throughput |
+| Device                | Activation Type | MoE Weights Type | MoE Backend | Use Case                       |
+|---------------------- |-----------------|------------------|-------------|--------------------------------|
+| B200/GB200/B300/GB300 | MXFP8           | MXFP4            | TRTLLM      | Low Latency and Max Throughput |
 
 The default MoE backend is `CUTLASS`, so for the best possible performance you must set `moe_config.backend` explicitly when running the model.
 `CUTLASS` was initially faster for max-throughput workloads, but the `TRTLLM` MoE backend has since been optimized to be universally faster.
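
The `moe_config.backend` override mentioned in the changed text is typically supplied through an LLM API options file passed to the server. A minimal sketch, assuming the `trtllm-serve --extra_llm_api_options` flow used elsewhere in TensorRT LLM's deployment guides (the filename is hypothetical):

```yaml
# extra-llm-api-config.yml  (hypothetical filename)
# Override the default CUTLASS MoE backend with the faster TRTLLM backend
# on B200/GB200/B300/GB300 with MXFP8 activations and MXFP4 MoE weights.
moe_config:
  backend: TRTLLM
```

This would be passed at serve time, e.g. `trtllm-serve <model> --extra_llm_api_options extra-llm-api-config.yml`; check the version of TensorRT LLM you are running for the exact flag and accepted backend names.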

0 commit comments
