[bugfix][torchair] fix wasted NPU memory buffer allocation for quantized deepseek with unquantized MTP layer #4030
image_a3_ubuntu.yml
on: pull_request
vllm-ascend image build
15m 28s
Annotations
1 error
image / Ubuntu / a3
Canceling since a higher priority waiting request for image / Ubuntu / a3-refs/pull/3068/merge exists
Artifacts
Produced during runtime
Name | Size | Digest
---|---|---
vllm-project~vllm-ascend~CYQXFM.dockerbuild | 104 KB | sha256:7938b26d382592ba14b0a7c40e61a1d3536a30a1f0a82ef4a3a598c3432b8179