
Conversation

22dimensions (Collaborator)

Remove the old quantization model; new models will be added to the test cases later.

Yikun (Collaborator) commented May 29, 2025

@22dimensions Could we add the new test first (vllm-ascend/Qwen2.5-0.5B-Instruct-w8a8-new) and then remove the old one?

22dimensions (Collaborator, Author) commented May 29, 2025

> @22dimensions Could we add the new test first (vllm-ascend/Qwen2.5-0.5B-Instruct-w8a8-new) and then remove the old one?

The new quantization model's name is vllm-ascend/Qwen2.5-0.5B-Instruct-W8A8, but I can't create this model on ModelScope because model names there are case-insensitive, so I need to delete the old model first. The new PR will be merged today.

22dimensions force-pushed the remove_model branch 4 times, most recently from ded4251 to 85bb352 on May 29, 2025 at 12:12
wangxiyuan mentioned this pull request on Jun 4, 2025
Yikun (Collaborator) left a comment


Could you estimate when this test will be added? LGTM if CI passes.

22dimensions force-pushed the remove_model branch 13 times, most recently from 088f8d6 to 6450cfb on June 8, 2025 at 18:19

QUANTIZATION_MODELS = [
    "vllm-ascend/Qwen2.5-0.5B-Instruct-W8A8-new",
    "vllm-ascend/DeepSeek-V2-Lite-W8A8",
]
Collaborator

Can DeepSeek-V2-Lite-W8A8 work with only one card?

22dimensions (Collaborator, Author)

It works on a local machine with 32 GB of device memory, but it doesn't work on the CI machine, so I removed it temporarily. I will add this model back later.
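
One way to keep a memory-hungry model in the list while the CI card lacks the capacity is a skip guard. A minimal sketch, assuming a pytest harness and an environment variable as the CI signal; the marker name and the CI check are illustrative assumptions, not project code:

import os

import pytest

# Hypothetical guard: DeepSeek-V2-Lite-W8A8 fits on a 32 GB card but
# not on the smaller CI device, so skip it when running in CI.
requires_large_card = pytest.mark.skipif(
    os.getenv("CI") == "true",
    reason="DeepSeek-V2-Lite-W8A8 needs ~32 GB of device memory, "
    "which the CI card does not have",
)

@requires_large_card
def test_deepseek_v2_lite_w8a8():
    ...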

22dimensions force-pushed the remove_model branch 2 times, most recently from 8f06555 to 9082250 on June 9, 2025 at 10:49
wangxiyuan changed the title from "remove old quantization model" to "[CI] remove old quantization model" on Jun 10, 2025
wangxiyuan merged commit 5cd5d64 into vllm-project:main on Jun 10, 2025
14 checks passed
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request on Jun 17, 2025
remove old quantization model, and new models will be added to testcase later.

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request on Jun 17, 2025
remove old quantization model, and new models will be added to testcase later.

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
Signed-off-by: wangxiaoxin (A) <wangxiaoxin7@huawei.com>