[bugfix][torchair] fix wasted NPU memory buffer allocation for quantized deepseek with unquantized MTP layer #616

Triggered via pull request September 20, 2025 14:38
Status Cancelled
Total duration 9s

vllm_ascend_test_full.yaml

on: pull_request
changes (5s)
Matrix: multicard e2e test - full
Matrix: singlecard e2e test - full
Annotations

1 error
test-full
Canceling since a higher priority waiting request for test-full-refs/pull/3068/merge exists