[bugfix][torchair] fix wasted NPU memory buffer allocation for quantized deepseek with unquantized MTP layer #5995

Triggered via pull request September 20, 2025 14:12
@linfeng-yuan
synchronize #3068
Status Success
Total duration 14s

format_pr_body.yaml

on: pull_request_target
update vLLM version
10s

Annotations

1 warning
update vLLM version
The `python-version` input is not set. The version of Python currently in `PATH` will be used.
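
The warning text matches the message that the `actions/setup-python` action emits when it is called without a `python-version` input. Assuming the `update vLLM version` job uses that action (the workflow file itself is not shown here), a minimal sketch of a step that pins the interpreter and silences the warning could look like the following. The action tag `v5` and the version `"3.11"` are illustrative assumptions, not values taken from `format_pr_body.yaml`.

```yaml
# Hypothetical step; the real step in format_pr_body.yaml may differ.
steps:
  - name: Set up Python
    uses: actions/setup-python@v5
    with:
      # Assumed version. Quote it so YAML does not parse it as a float
      # (an unquoted 3.10 would be read as 3.1).
      python-version: "3.11"
```

Pinning the version makes the job independent of whichever Python happens to be first in `PATH` on the runner image.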