Add Spec dec Bench example #474

Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main     #474      +/-  ##
==========================================
+ Coverage   73.46%   73.52%   +0.06%
==========================================
  Files         180      181       +1
  Lines       18161    18207      +46
==========================================
+ Hits        13342    13387      +45
- Misses       4819     4820       +1
```
force-pushed from de9df92 to 7cbcc0c
Need to mention this new example in https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst under the 0.40 release section (new one).
Missing requirements.txt and README.md in
force-pushed from 7cbcc0c to 8402f45
force-pushed from 8402f45 to 4111724
I added a section in the README on how to install. Creating a requirements.txt would get nasty, since it technically supports vLLM, SGLang, and TRT-LLM. It is simpler to say: run in an environment that already has one of these installed.
```python
self.out["TTFT Time"] = compute_statistics(ttft_time)
if tpot_time:
    self.out["Generation Step Time"] = compute_statistics(tpot_time)
    self.out["Generation Tokens Per Second"] = compute_statistics(gen_tp_time)
```
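The excerpt above reduces each list of timing samples to summary statistics. As a rough illustration, a helper of the shape `compute_statistics` might look like the following (a hypothetical sketch, not the PR's actual implementation):

```python
import statistics

def compute_statistics(samples: list[float]) -> dict[str, float]:
    """Summarize a list of timing samples (hypothetical helper).

    Returns mean, median, and p99 so downstream reports can compare
    TTFT and per-step generation latency distributions.
    """
    ordered = sorted(samples)
    # Nearest-rank style p99 index, clamped to the last element.
    p99_index = min(len(ordered) - 1, round(0.99 * (len(ordered) - 1)))
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[p99_index],
    }
```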
Questions: Is there a way to get per_user_tps and per_gpu_tps using specbench?
The Generation TPS is technically per_user_tps. I can rename these to add a Request prefix.
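To make the per-user versus per-GPU distinction concrete, here is a hedged sketch (function names and signatures are illustrative, not the benchmark's actual API):

```python
def request_per_user_tps(gen_tokens: int, gen_seconds: float) -> float:
    # Throughput experienced by a single request (one "user"):
    # tokens that request generated divided by its own generation time.
    return gen_tokens / gen_seconds

def per_gpu_tps(total_tokens: int, wall_seconds: float, num_gpus: int) -> float:
    # Aggregate throughput across all concurrent requests in the run,
    # normalized by GPU count.
    return total_tokens / (wall_seconds * num_gpus)

# e.g. one request generating 256 tokens in 2 s -> 128 tok/s per user,
# while 10,000 total tokens in 10 s on 4 GPUs -> 250 tok/s per GPU.
```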
Updated
A high-level question: trtllm-bench seems to have the same metrics available, and vLLM also has a similar vllm bench: link.
Is there any difference between the results from specbench and trtllm-bench, or shall we reuse existing functionality from trtllm-bench?
Great question! Both of those benchmarks try to do the same thing, but they are not unified. With this one you are guaranteed to send exactly the same tokens to the engine in all cases, and to respect the same chat template and tokenizer. trtllm-bench tends to ignore_eos (which is bad for specdec) and requires the input to already be tokenized (a big source of error depending on how you actually do it).
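To illustrate why pre-tokenizing inputs yourself is error-prone, here is a self-contained toy: a greedy longest-match tokenizer (purely illustrative, with a made-up vocabulary, unrelated to any real BPE) where tokenizing two text pieces separately does not match tokenizing the joined string:

```python
# Made-up vocabulary for demonstration only.
VOCAB = {"hello", " world", "hello wor", "ld", "h", "e", "l", "o", " ", "w", "r", "d"}

def encode(text: str) -> list[str]:
    """Greedy longest-match tokenization over VOCAB."""
    tokens, i = [], 0
    while i < len(text):
        # Take the longest vocab entry starting at position i.
        for end in range(len(text), i, -1):
            if text[i:end] in VOCAB:
                tokens.append(text[i:end])
                i = end
                break
    return tokens

# Encoding pieces separately vs. encoding the joined string diverges:
separate = encode("hello") + encode(" world")  # ['hello', ' world']
joined = encode("hello world")                 # ['hello wor', 'ld']
assert separate != joined
```

Real subword tokenizers exhibit the same boundary effects, which is why having the harness drive tokenization (and the chat template) for every backend keeps the comparison apples-to-apples.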
force-pushed from 172f919 to 0d55bb0
```python
# limitations under the License.

try:
    import tiktoken
```
Let's add a non-deployment-related requirements.txt for dependencies not part of modelopt (setup.py), like tiktoken and maybe others.
Fine, since all three base Docker images already have everything installed.
Signed-off-by: Izzy Putterman <[email protected]>
force-pushed from 0d55bb0 to 81c6e15
Signed-off-by: Keval Morabia <[email protected]>
## What does this PR do?

**Type of change:** New feature

**Overview:** Specdec bench example

## Usage

```python
# Add a code snippet demonstrating how to use this
```

## Testing

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: Yes
- **Did you update [Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes

## Additional Information

---

Signed-off-by: Izzy Putterman <[email protected]>
Signed-off-by: Keval Morabia <[email protected]>
Co-authored-by: Keval Morabia <[email protected]>
Signed-off-by: mxin <[email protected]>


What does this PR do?
Type of change: New feature
Overview: Specdec bench example
Usage
# Add a code snippet demonstrating how to use thisTesting
Before your PR is "Ready for review"
Additional Information