[bugfix][torchair] fix wasted NPU memory buffer allocation for quantized deepseek with unquantized MTP layer #5995
Triggered via pull request on September 20, 2025 14:12 by linfeng-yuan (synchronize, #3068)
Status: Success
Total duration: 14s
Artifacts: –
format_pr_body.yaml
on: pull_request_target
update vLLM version: 10s
Annotations
1 warning
update vLLM version
The `python-version` input is not set. The version of Python currently in `PATH` will be used.
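This warning is the one `actions/setup-python` typically emits when it runs without a `python-version` input. A minimal sketch of how a step could pin the interpreter, assuming the job uses `actions/setup-python`; the contents of format_pr_body.yaml are not shown in this run summary, so the step name, the action version, and the Python version below are illustrative assumptions rather than the project's actual configuration:

```yaml
# Hypothetical step: the real workflow file is not shown in this run summary.
steps:
  - name: Set up Python
    uses: actions/setup-python@v5
    with:
      # Assumed version, chosen only for illustration.
      python-version: "3.11"
```

Pinning the version makes the job independent of whichever Python happens to be on the runner's `PATH`, which is what the warning is pointing out.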