Pull requests: vllm-project/llm-compressor
Fix deprecated torch_dtype usage in transformers loading
#2109 opened Dec 11, 2025 by jangel97
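For context on #2109: newer transformers releases accept `dtype=` in `from_pretrained` and deprecate the older `torch_dtype=` keyword. A minimal sketch of a compatibility shim, assuming loaders with explicit keyword signatures (the helper `dtype_kwarg` and the dummy loaders below are hypothetical and not from the PR; the real `from_pretrained` forwards `**kwargs`, so production code would more likely key on `transformers.__version__`):

```python
import inspect

def dtype_kwarg(load_fn, dtype):
    """Return the dtype keyword argument that load_fn accepts.

    Prefers the newer ``dtype=`` spelling and falls back to the
    deprecated ``torch_dtype=``. Illustrative helper only.
    """
    params = inspect.signature(load_fn).parameters
    if "dtype" in params:
        return {"dtype": dtype}
    return {"torch_dtype": dtype}

# Dummy stand-ins for the new and old loader signatures:
def new_loader(name, dtype=None):
    return dtype

def old_loader(name, torch_dtype=None):
    return torch_dtype

assert dtype_kwarg(new_loader, "bfloat16") == {"dtype": "bfloat16"}
assert dtype_kwarg(old_loader, "bfloat16") == {"torch_dtype": "bfloat16"}
```

The same dictionary can then be splatted into either loader, e.g. `new_loader("model-id", **dtype_kwarg(new_loader, "bfloat16"))`.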
add kv quant example [autoround: for any PR / issue related to autoround support]
#2100 opened Dec 5, 2025 by mengniwang95
Linearize gpt_oss model and add separate example to quantize it to w4a8
#2091 opened Dec 3, 2025 by isharif168
feat: add importance-aware mixed-precision quantization
#2083 opened Dec 2, 2025 by wangwenmingaa
[Performance] Batched calibration [ready: when a PR is ready for review]
#2054 opened Nov 20, 2025 by kylesayrs
[Misc] Remove is_moe_model [ready]
#2053 opened Nov 20, 2025 by kylesayrs
Modernize transformers module with type hints and generic types
#2034 opened Nov 14, 2025 by sugatmahanti
[Sequential Onloading] Support onloading and offloading frozen dataclasses
#2016 opened Nov 10, 2025 by kylesayrs
Implement propagate_error argument [ready]
#2008 opened Nov 10, 2025 by kylesayrs
[AWQ] Allow users to disable quantization during AWQ
#1973 (draft) opened Oct 28, 2025 by brian-dellabetta
[AWQ] Generalize AWQ quantization [ready]
#1961 opened Oct 22, 2025 by kylesayrs (4 tasks done)
[Oneshot] Add validation for empty dataset and enhance oneshot function parameters
#1957 opened Oct 21, 2025 by ArkaSanka
[Attention] Support FP4 attention quantization [nvfp4: for any PR / issue related to NVFP4 support]
#1924 opened Oct 14, 2025 by kylesayrs