[bugfix][torchair] fix wasted NPU memory buffer allocation for quantized deepseek with unquantized MTP layer #4037
image_a3_ubuntu.yml
on: pull_request
vllm-ascend image build
13m 14s
Artifacts
Produced during runtime
Name | Size | Digest
---|---|---
vllm-project~vllm-ascend~MC0OQG.dockerbuild | 107 KB | sha256:6faf15bf918b7de381d623cce15344af9986caf8ac338696cb023779a6d54c4d