[CI] remove old quantization model #1003
2c7318e
to
e79210c
Compare
@22dimensions Could we add new test first ( |
The new quantization model's name is vllm-ascend/Qwen2.5-0.5B-Instruct-W8A8, but I can't create this model in ModelScope because model names there are case-insensitive, so I need to delete the old model first. The new PR will be merged today.
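To illustrate the naming conflict described above, here is a minimal sketch of a case-insensitive collision check. The helper `find_case_collisions` and the lower-case "old" repository name are hypothetical and used only for illustration; the actual old model name is not given in this thread, and this is not a ModelScope API.

```python
def find_case_collisions(existing_names, new_name):
    """Return existing names that differ from new_name only by letter case."""
    key = new_name.casefold()
    return [
        name
        for name in existing_names
        if name.casefold() == key and name != new_name
    ]


# Hypothetical old repository name (assumption, not taken from the PR).
existing = ["vllm-ascend/qwen2.5-0.5b-instruct-w8a8"]
new_name = "vllm-ascend/Qwen2.5-0.5B-Instruct-W8A8"

collisions = find_case_collisions(existing, new_name)
if collisions:
    print(f"Cannot create {new_name!r}: collides with {collisions}")
```

On a hub that treats names case-insensitively, any non-empty result means the old repository has to be deleted or renamed before the new one can be created.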
ded4251
to
85bb352
Compare
85bb352
to
61c0d8a
Compare
Could you estimate when this test will be added? LGTM if CI passes.
088f8d6
to
6450cfb
Compare
QUANTIZATION_MODELS = [
    "vllm-ascend/Qwen2.5-0.5B-Instruct-W8A8-new",
    "vllm-ascend/DeepSeek-V2-Lite-W8A8"
Can DeepSeek-V2-Lite-W8A8 work with only one card?
It works on my local machine with 32GB of device memory, but it doesn't work on the CI machine, so I have removed it temporarily. I will add this model back later.
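As an alternative to deleting the entry outright, a model that only runs on larger devices could be kept in the test file but gated behind an opt-in switch. The sketch below assumes a hypothetical environment variable `RUN_LARGE_QUANT_MODELS` and a hypothetical helper `select_quantization_models`; neither exists in the PR, which simply removes the model from the list.

```python
import os

# Models expected to fit on the CI device.
BASE_QUANTIZATION_MODELS = ["vllm-ascend/Qwen2.5-0.5B-Instruct-W8A8-new"]

# Models reported to work locally (32GB device memory) but not on CI.
LARGE_QUANTIZATION_MODELS = ["vllm-ascend/DeepSeek-V2-Lite-W8A8"]


def select_quantization_models(environ=None):
    """Return the models to test; large ones only when explicitly enabled."""
    env = os.environ if environ is None else environ
    models = list(BASE_QUANTIZATION_MODELS)
    # RUN_LARGE_QUANT_MODELS is a hypothetical opt-in flag (assumption).
    if env.get("RUN_LARGE_QUANT_MODELS") == "1":
        models.extend(LARGE_QUANTIZATION_MODELS)
    return models


QUANTIZATION_MODELS = select_quantization_models()
```

With this layout CI keeps running the small model by default, while a developer with a larger device can set the flag locally to include DeepSeek-V2-Lite-W8A8.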
8f06555
to
9082250
Compare
Signed-off-by: 22dimensions <waitingwind@foxmail.com>
9082250
to
a6443bf
Compare
remove old quantization model, and new models will be added to testcase later. Signed-off-by: 22dimensions <waitingwind@foxmail.com>
remove old quantization model, and new models will be added to testcase later. Signed-off-by: 22dimensions <waitingwind@foxmail.com> Signed-off-by: wangxiaoxin (A) <wangxiaoxin7@huawei.com>