
Conversation

@22dimensions (Collaborator) commented Sep 19, 2025

What this PR does / why we need it?

Some custom models in vllm-ascend define their own packed_modules_mapping, which prevents reusing the same model classes as the vLLM community. This PR moves these custom packed_modules_mapping definitions into the quantization utils.py. After this PR, some custom model files can be removed.
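For illustration, a minimal sketch of what the centralized mapping in vllm_ascend/quantization/utils.py could look like; the model key and sub-module lists below are placeholders, not the actual entries from the diff:

```python
# vllm_ascend/quantization/utils.py (illustrative sketch only)
# Maps a HuggingFace model_type to the packed_modules_mapping that the
# corresponding custom model class previously declared on itself.
packed_modules_model_mapping = {
    "some_model_type": {  # placeholder key; real keys include e.g. "qwen3_next"
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    },
    # ... one entry per custom model that used to define its own mapping
}
```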

Does this PR introduce any user-facing change?

tested by CI

How was this patch tested?

tested by CI


👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request refactors the quantization logic by centralizing the packed_modules_mapping from individual model files into a single dictionary in vllm_ascend/quantization/utils.py. This is a good improvement for maintainability. However, I've found a critical issue where the mapping for the qwen2_5_vl model was removed but not added to the new centralized map, which will likely break quantization for that model. I've also suggested a cleanup in get_quant_method to remove a now-unused parameter, improving code clarity.
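For context, a rough sketch of how such a centralized lookup could be consumed from get_quant_method; the helper name resolve_packed_modules_mapping and the import of the dict from vllm_ascend.quantization.utils are assumptions for illustration, while the config lookup itself mirrors the lines shown in the diff further below:

```python
from typing import Optional

from vllm.config import get_current_vllm_config

# Assumption: the centralized dict is importable from the new utils module.
from vllm_ascend.quantization.utils import packed_modules_model_mapping


def resolve_packed_modules_mapping() -> Optional[dict]:
    """Return the packed_modules_mapping for the currently configured model,
    or None if no centralized mapping is registered for its model_type."""
    vllm_config = get_current_vllm_config()
    model_type = vllm_config.model_config.hf_config.model_type
    if model_type in packed_modules_model_mapping:
        return packed_modules_model_mapping[model_type]
    return None
```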

@22dimensions force-pushed the remove_packed_module branch 4 times, most recently from 09b8b4e to c693211, on September 19, 2025 03:42
@wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels on Sep 19, 2025
"experts":
["experts.0.gate_proj", "experts.0.up_proj", "experts.0.down_proj"]
},
"qwen3_next": {
A collaborator commented:

Once the model files for qwen3-next and qwen2.5-vl are removed from vllm-ascend, this mapping can be removed as well.

cc @wxsIcey, please clean up the qwen3-next entry as well.

Diff context:

```python
prefix: str) -> Optional["QuantizeMethodBase"]:
    vllm_config = get_current_vllm_config()
    model_type = vllm_config.model_config.hf_config.model_type
    if model_type in packed_modules_model_mapping:
```
A collaborator commented:

Suggested change:

```diff
-if model_type in packed_modules_model_mapping:
+if model_type in packed_modules_model_mapping.keys():
```

Maybe it should be this?
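For reference, `key in mapping` and `key in mapping.keys()` are equivalent for Python dicts, since membership on a dict tests its keys, so the original form is already idiomatic. A toy check (placeholder dict contents):

```python
packed_modules_model_mapping = {"qwen3_next": {}}  # toy example

# Both expressions are equivalent; `in` on a dict checks the keys.
assert "qwen3_next" in packed_modules_model_mapping
assert "qwen3_next" in packed_modules_model_mapping.keys()
```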

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
@wangxiyuan merged commit 0942d9a into vllm-project:main on Sep 19, 2025
17 checks passed
Labels: module:quantization, module:tests, ready (read for review), ready-for-test (start test by label for PR)