[bugfix][torchair] fix wasted NPU memory buffer allocation for quantized deepseek with unquantized MTP layer #4029
image_a3_ubuntu.yml
on: pull_request
vllm-ascend image build
15m 41s
Artifacts
Produced during runtime
Name | Size | Digest
---|---|---
vllm-project~vllm-ascend~WVH2YI.dockerbuild | 104 KB | sha256:55914fac11052e248e69bf069282a23c278a15c34bfd8c14978198d9a95e07cc