Pull requests: vllm-project/vllm
[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len
Labels: ready, structured-output
#13691 opened Feb 22, 2025 by WangErXiao

[V1][Minor] Use FakeAttentionMetadata for dummy run
Labels: ready, v1
#13689 opened Feb 22, 2025 by WoosukKwon

[Core][Distributed] Use IPC (domain socket) ZMQ socket for local comms
#13688 opened Feb 21, 2025 by njhill

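For context on the transport named in #13688 above, here is a minimal pyzmq sketch of an ipc:// (Unix domain socket) endpoint, the kind of local-only ZMQ channel the title refers to. The endpoint path and message payload are illustrative assumptions, not vLLM's actual implementation.

```python
# Minimal pyzmq sketch of an ipc:// (Unix domain socket) transport.
# The endpoint path and payload below are hypothetical, for illustration only.
import zmq

IPC_ENDPOINT = "ipc:///tmp/example_local_comms.sock"  # hypothetical path

ctx = zmq.Context()

# Producer side: bind a PUSH socket on the domain-socket endpoint.
producer = ctx.socket(zmq.PUSH)
producer.bind(IPC_ENDPOINT)

# Consumer side (another process on the same host): connect a PULL socket.
consumer = ctx.socket(zmq.PULL)
consumer.connect(IPC_ENDPOINT)

producer.send(b"hello over a unix domain socket")
print(consumer.recv())  # b'hello over a unix domain socket'
```
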
[Bugfix][Model] OLMo 2: split qkv correctly for GQA and MQA
Labels: ready
#13687 opened Feb 21, 2025 by 2015aroras

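As background for #13687 above, a generic PyTorch sketch of why a fused qkv projection must be split by head counts under GQA/MQA rather than into three equal chunks. The shapes and names are assumptions for illustration, not OLMo 2's actual code.

```python
# Generic sketch: splitting a fused qkv projection output for GQA/MQA.
# With grouped-query attention, k and v have fewer heads than q, so the fused
# tensor cannot be split into three equal chunks. Shapes are illustrative.
import torch

num_heads = 32      # query heads
num_kv_heads = 8    # key/value heads (GQA); 1 would be MQA
head_dim = 128

qkv_dim = (num_heads + 2 * num_kv_heads) * head_dim
qkv = torch.randn(2, 16, qkv_dim)  # (batch, seq_len, fused qkv)

q, k, v = qkv.split(
    [num_heads * head_dim, num_kv_heads * head_dim, num_kv_heads * head_dim],
    dim=-1,
)
assert q.shape[-1] == num_heads * head_dim
assert k.shape[-1] == v.shape[-1] == num_kv_heads * head_dim
```
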
enable users to select triton fa for MLA backend
Labels: needs-rebase, rocm
#13685 opened Feb 21, 2025 by qli88

[Model] GPTBigCodeForEmbedding supporting token span classification
#13684 opened Feb 21, 2025 by michaelrglass

[Bugfix][API Server] Fix invalid usage of 'ge' and 'le' in port valid…
Labels: frontend, ready
#13672 opened Feb 21, 2025 by WangErXiao

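The truncated title of #13672 above refers to 'ge' and 'le' bounds in port validation. Below is a generic pydantic sketch of range-constrained validation; the model name and bounds are assumed for illustration and are not taken from vLLM's actual server config.

```python
# Generic pydantic sketch of range-validating a port number with ge/le.
# The ServerArgs model and the bounds are hypothetical, for illustration only.
from pydantic import BaseModel, Field, ValidationError

class ServerArgs(BaseModel):
    # ge/le attach numeric bounds; values outside [1, 65535] are rejected.
    port: int = Field(default=8000, ge=1, le=65535)

print(ServerArgs(port=8080).port)  # 8080

try:
    ServerArgs(port=70000)
except ValidationError as e:
    print("rejected:", e.errors()[0]["type"])
```
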
[Misc] Capture and log the time of loading weights
Labels: ready, v1
#13666 opened Feb 21, 2025 by waltforme

Correction to TP logic for Mamba Mixer 2 when Num Groups not divisible by TP Size
Labels: ready
#13660 opened Feb 21, 2025 by fabianlim

[model][refactor] remove cuda hard code in models and layers
Labels: speculative-decoding
#13658 opened Feb 21, 2025 by MengqingCao

[ROCM] fix native attention function call
Labels: ready
#13650 opened Feb 21, 2025 by gongdao123

docs: Add a note on full CI run in contributing guide
Labels: documentation
#13646 opened Feb 21, 2025 by terrytangyuan

[Model][Speculative Decoding] Expand DeepSeek MTP code to support k > n_predict
Labels: speculative-decoding
#13626 opened Feb 20, 2025 by benchislett

[Bugfix] Flush TunableOp results before worker processes are destroyed.
Labels: rocm
#13623 opened Feb 20, 2025 by naromero77amd

[Frontend] [Minor] Fix tqdm progress bar for n > 1
Labels: frontend
#13621 opened Feb 20, 2025 by franzscherr

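As a rough illustration of the progress-bar issue named in #13621 above, here is a generic tqdm sketch where the bar total counts prompts rather than the n sequences sampled per prompt. The generate_n_outputs helper is hypothetical, standing in for real decoding.

```python
# Generic tqdm sketch: tracking progress per request when each prompt
# yields n sampled outputs, so the bar total counts prompts, not sequences.
from tqdm import tqdm

def generate_n_outputs(prompt: str, n: int) -> list[str]:
    # Hypothetical stand-in for real decoding of n samples per prompt.
    return [f"{prompt} -> sample {i}" for i in range(n)]

prompts = ["a", "b", "c"]
n = 4

with tqdm(total=len(prompts), desc="Processed prompts") as pbar:
    for prompt in prompts:
        outputs = generate_n_outputs(prompt, n)  # n sequences per request
        pbar.update(1)  # advance once per finished request, not per sequence
```
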
[Misc] Bump compressed-tensors
Labels: ci/build, ready
#13619 opened Feb 20, 2025 by dsikka