NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

140 Open 552 Closed

Pull requests list

infra: add pre-commit check to github actions

#3129 opened Mar 27, 2025 by tburt-nv

feat: Cohere2ForCausalLM support (Command-A, Command-R7B)

#3128 opened Mar 27, 2025 by aikitoria

feat: FP8 Rowwise quantization support for Cohere models

#3127 opened Mar 27, 2025 by aikitoria

[draft]chore: update libcutlass library with FP4 quantize linear layout change

#3126 opened Mar 27, 2025 by nv-guomingz

infra: update concurrency control

#3120 opened Mar 27, 2025 by niukuo

chore: Stabilize ABI boundary for internal kernel library

#3117 opened Mar 27, 2025 by tongyuantongyu • Draft

fix: Reverse graph size order

#3116 opened Mar 27, 2025 by jiahanc

refactor:[AutoDeploy] Enhance RoPE support

#3115 opened Mar 26, 2025 by Fridah-nv • Draft

1 of 3 tasks

chore: add sqlite to rocky container

#3114 opened Mar 26, 2025 by tburt-nv • Draft

fix: Early exit cmake if find_library() does not find any lib

#3113 opened Mar 26, 2025 by WilliamTambellini

citest

#3110 opened Mar 26, 2025 by tensorrt-cicd

infra: [TRTLLM-4308] Add Bot help

#3107 opened Mar 26, 2025 by ZhanruiSunCh

fix: Fix C++ decoder synchronization in PyTorch

#3106 opened Mar 26, 2025 by dcampora

feat: Optionally split MoE inputs into chunks to reduce GPU memory usage

#3104 opened Mar 26, 2025 by jinyangyuan-nvidia • Draft

refactor: Simplify disableLookahead and improve numDecodingEngineTokens handling

#3103 opened Mar 26, 2025 by Funatiq

perf: Enhance LLM API perf

#3102 opened Mar 26, 2025 by kaiyux • Draft

chore: Ucx ip port remove mpi depend

#3101 opened Mar 26, 2025 by chuangz0

infra: Support get file change for github PR

#3098 opened Mar 26, 2025 by ZhanruiSunCh

infra: Add test list name check

#3097 opened Mar 26, 2025 by EmmaQiaoCh

bug: Fix hang bug when context server doesn't have enough capacity for KV Cache

#3095 opened Mar 26, 2025 by Tabrizian

test: tests[CI]: remove closed bugs

#3094 opened Mar 26, 2025 by xinhe-nv

perf: Add optimizations for deepseek in min latency mode

#3093 opened Mar 26, 2025 by zongfeijing

feat: Run PyExecutor's inference flow to estimate max_num_tokens for kv_cache_manager

#3092 opened Mar 26, 2025 by HuiGao-NV

chore: Blossom debug hook

#3091 opened Mar 26, 2025 by yiqingy0

fix: dist-serving streaming mode returns http 500

#3087 opened Mar 26, 2025 by zhengd-nv
