Pull requests: vllm-project/llm-compressor

Pull requests list

refactor: modernize recipe module with Python 3.10+ type hints
Labels: Refactor (Code cleanup and/or improvements to existing features), two-reviews (When a PR requires two reviews)
#2719 opened May 18, 2026 by AsadShahid04
build SmoothQuant mapping dynamically for Qwen3.5 MoE hybrid attention
Labels: enhancement (New feature or request), qwen (For any PR / issue related to Qwen support), smoothquant (For any issue / PR related to SmoothQuant support), transforms (Related to transforms-based modifiers like SpinQuant and Quip), two-reviews
#2718 opened May 18, 2026 by zhangxin81
Remove IMatrixGatherer
Labels: Refactor, transforms, two-reviews
#2716 opened May 18, 2026 by dshane1903
enhance AutoRoundModifier performance by skipping useless model forward in calibration stage.
Labels: autoround (For any PR / issue related to autoround support), enhancement, two-reviews
#2713 opened May 15, 2026 by xin3he Contributor
Update AutoRound examples to suit general usage
Labels: autoround, documentation (Improvements or additions to documentation), fp8 (For any issue / PR related to FP8 support), llama (For any PR / issue related to Llama herd support), qwen, two-reviews
#2712 opened May 15, 2026 by xin3he Contributor
[LM Eval] Update lm eval tests to spin-up vLLM
Labels: awq (For any issue / PR related to AWQ support), fp8, gptq (For any PR / issue related to GPTQ support), moe, nvfp4 (For any PR / issue related to NVFP4 support), ready (When a PR is ready for review), w4a16
#2710 opened May 13, 2026 by dsikka Collaborator
[DRAFT] Add AutoAWQ conversion smoke test (Depends on CT #701)
Labels: awq, good first issue (A good first issue for users wanting to contribute), two-reviews
#2709 opened May 13, 2026 by orestis-z Collaborator
refactor: modernize modifiers module with Python 3.10+ type hints
Labels: ready, Refactor, two-reviews
#2704 opened May 12, 2026 by dshane1903
Implement REAP pruning modifier for MoE expert pruning
Labels: enhancement, moe, two-reviews
#2703 opened May 12, 2026 by eldarkurtic Collaborator
refactor: modernize type hints in core/session_functions.py (part of #1927)
Labels: Refactor, two-reviews
#2702 opened May 12, 2026 by MohibUllahKhanSherwani
[Docs] Update W8A16 docs
Labels: documentation, w4a16
#2700 opened May 12, 2026 by kylesayrs Collaborator
[Tracing] Support tracing cache
Labels: ready, Refactor, tracing (Issues related to model tracing), two-reviews
#2686 opened May 5, 2026 by kylesayrs Collaborator
fix: add enable_thinking flag and reasoning dataset example for Qwen3-Next AWQ
Labels: awq, bug (Something isn't working), documentation, enhancement, qwen, two-reviews, w4a16
#2681 opened May 2, 2026 by jayakumarpujar Contributor
3 tasks
Update observer and modifier docs for refactored observer API
Labels: awq, documentation, gptq, Refactor, two-reviews
#2671 opened Apr 30, 2026 by HDCharles Collaborator
2 tasks
feat: concurrent KLD evaluation without enforce_eager (closes #2667, refs #2646)
Labels: enhancement, ready, two-reviews
#2668 opened Apr 29, 2026 by jayakumarpujar Contributor
[Refactor] Consolidate Intermediate Offloading
Labels: awq, gptq, needs-rebase, ready, Refactor, two-reviews
#2664 opened Apr 28, 2026 by menogrey Contributor
Feat/issue 2646
Labels: enhancement, ready, two-reviews
#2663 opened Apr 28, 2026 by rpathade
[Examples] Kimi K2.6
Labels: enhancement, fp8, model_free_ptq (For any PR/issue related to the `model_free_ptq` pathway), nvfp4, ready
#2662 opened Apr 27, 2026 by brian-dellabetta Collaborator
1 task done
Add MixFP4A16 quantization recipe support
Labels: enhancement, fp8, Refactor, two-reviews
#2657 opened Apr 27, 2026 by revollllt
[Model] DeepSeekV4
Labels: quality-failed
#2655 opened Apr 26, 2026 by kylesayrs Collaborator Draft
Transformers v5
Labels: needs-rebase
#2647 opened Apr 24, 2026 by kylesayrs Collaborator Draft
[do not land] GPTQ actorder regression test suite
Labels: awq, fp8, gptq, llama, qwen, w4a16
#2643 opened Apr 22, 2026 by HDCharles Collaborator Draft
3 tasks
add example of w8a8fp8 for qwen3.5
Labels: documentation, enhancement, fp8, qwen, two-reviews
#2631 opened Apr 20, 2026 by zhangxin81
Adding test_group to lm-eval configs
Labels: enhancement, fp8, nvfp4, two-reviews, w4a16
#2623 opened Apr 16, 2026 by debroy-rh