Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[None][feat] Implement support for Falcon-H1 models
#8135 opened Oct 3, 2025 by dbari Draft
4 tasks
[None][chore] AutoDeploy: clean up accuracy test configs
#8134 opened Oct 3, 2025 by lucaslie
1 task done
[None][feat] AutoDeploy: Nemotron-H accuracy test
#8133 opened Oct 3, 2025 by lucaslie
1 task done
draft: all reduce
#8124 opened Oct 2, 2025 by NVShreyas Draft
1 task
[TRTLLM-6342][feat] Factory TP sharding of quantized models (label: AutoDeploy)
#8123 opened Oct 2, 2025 by greg-kwasniewski1
1 task done
[TRTLLM-8413][chore] resolve sampling defaults in OpenAI API backend
#8121 opened Oct 2, 2025 by ixlmar
1 task done
[None][test] Add accuracy test for Qwen3Next model
#8111 opened Oct 1, 2025 by Funatiq
1 task
[doc] Add Qwen3 Next Guide to Core README (label: Community want to contribute)
#8101 opened Sep 30, 2025 by faradawn
1 task
[None][fix] Avoid unnecessary concat in attn_output_gate case.
#8094 opened Sep 30, 2025 by yuxianq
1 task done
[None][fix] Disable DeepGEMM for Qwen3 MoE Attention layers
#8087 opened Sep 30, 2025 by achartier
1 task done
[None][feat] add RocketKV support (experimental)
#8086 opened Sep 30, 2025 by lfr-0531
1 task done
[None][fix] Add Lock to protect mReqeustToSession
#8085 opened Sep 30, 2025 by chuangz0
1 task done