Pull requests: NVIDIA/TensorRT-Model-Optimizer

Pull requests list

Add option to real quantize the model
#473 opened Oct 28, 2025 by ajrasane
[Autocast] Optimize _add_cast runtime
#469 opened Oct 27, 2025 by aboubezari
[5597780] Add support for FP16-only custom ops
#460 opened Oct 23, 2025 by gcunhase
Support MOE Export for Nemotron H
#447 opened Oct 17, 2025 by jenchen13
2 tasks done
Fix ONNX FP8 scaling
#446 opened Oct 17, 2025 by Darth-Kronos
[OMNIML-2857] Support the DeepSeek V3.2 model
#435 opened Oct 14, 2025 by cjluo-nv
Yeyu/debug paralllel draft
#429 opened Oct 13, 2025 by yeyu-nvidia
[New feature] Add Support For Sparse Attention
#408 opened Oct 7, 2025 by kaix-nv
Explicitly register real quant gemms
#402 opened Oct 6, 2025 by cjluo-nv