NVIDIA / TensorRT-Model-Optimizer Public

Pull requests: NVIDIA/TensorRT-Model-Optimizer

40 Open 297 Closed

Pull requests list

MLA eagle for K2

#615 opened Nov 27, 2025 by h-guo18 • Draft

draft: Add per block MSE for NVFP4 and INT4

#613 opened Nov 27, 2025 by Fridah-nv • Draft

[5680954,5620660@2][ONNX][Autocast] Update value info in converted graph

#611 opened Nov 26, 2025 by gcunhase

Add checkpoint save/load to ForwardHook + add IterativeChannelContributionHook

#610 opened Nov 26, 2025 by danielkorzekwa

Support attention quantization for diffusers >= 0.35.0

#608 opened Nov 25, 2025 by shengliangxu • Draft

Add pruning checkpoints for the compress algorithm

#607 opened Nov 25, 2025 by danielkorzekwa

Fix extra args and --component-dtype default value

#605 opened Nov 24, 2025 by shengliangxu

Convert compressed-tensor int4 format to GPTQ int4 format

#590 opened Nov 20, 2025 by Edwardf0t1

make eagle embedding optional

#589 opened Nov 20, 2025 by yeyu-nvidia

Yeyu/remove embedding from eagle

#585 opened Nov 20, 2025 by yeyu-nvidia • Draft

Product Rename: TensorRT Model Optimizer to Model Optimizer

#583 opened Nov 20, 2025 by kevalmorabia97

1 of 2 tasks

support for newer checkpoints

#582 opened Nov 20, 2025 by binghanc • Draft

Feat: SGL backend for online SD training

#564 opened Nov 14, 2025 by h-guo18

Fix hf_quant_config with kv cache type

#557 opened Nov 14, 2025 by jenchen13

GPTQ Lite implementation

#555 opened Nov 13, 2025 by sugunav14

1 of 2 tasks

[OMNIML-2850] [3/n] Adds sparse attention calibration

#538 opened Nov 11, 2025 by kaix-nv

[OMNIML-2852] [2/n] Add Core Sparse Attention Infrastructure

#527 opened Nov 7, 2025 by kaix-nv

parallel eagle draft

#523 opened Nov 6, 2025 by yeyu-nvidia • Draft

[Bug #193] fix fp8 blockwise real quantization

#522 opened Nov 6, 2025 by meenchen

Support AWQ fake quant for vLLM MoE models

#521 opened Nov 6, 2025 by meenchen • Draft

[Draft] [5526696] Add kv cache quantization support for onnx quantization

#486 opened Oct 31, 2025 by zhanghaoc

Yeyu/set block

#480 opened Oct 28, 2025 by yeyu-nvidia • Draft

Preserve original rope scaling type in export due to transformers library AutoConfig issue

#452 opened Oct 17, 2025 by Edwardf0t1

[1/2] Registry interface for custom quantization functional backend

#449 opened Oct 17, 2025 by realAsma

[OMNIML-2673]Create an example for running diffusion models using auto deploy

#443 opened Oct 16, 2025 by ajrasane • Draft
