[Perf] Add new npu_fused_infer_attention_score op to improve performance in splitfuse cases and resolve long-seq mask problems #8129

Triggered via pull request September 16, 2025 12:31
@qyqc731
synchronize #2962
qyqc731:main
Status Success
Total duration 5m 56s
Artifacts 2

release_whl.yml

on: pull_request
Matrix: build and release wheel

Artifacts

Produced during runtime
Name                                        Size    Digest
vllm-ascend-ubuntu-24.04-arm-py3.11-wheel   526 KB  sha256:62909b2ad2ec6b9dea0320a566013e6651da7a5621b2101b3a6d56f09442cac6
vllm-ascend-ubuntu-24.04-py3.11-wheel       535 KB  sha256:b158ccccbd0b721b133159b53e806c8755def82d2da27f6707e88600781fbba8