
[Perf] Add new npu_fused_infer_attention_score op to improve performance in splitfuse cases and resolve long-seq mask problems #8365

Triggered via pull request on September 20, 2025 04:44
Status: Success
Total duration: 5m 16s
Artifacts: 2

release_whl.yml

on: pull_request
Matrix: build and release wheel

Artifacts

Produced during runtime
Name                                        Size    Digest
vllm-ascend-ubuntu-24.04-arm-py3.11-wheel   573 KB  sha256:dd658bc03b9c584e09ecb9da3380c37670f1b26d6c33382335dfa4f9f29e01e2
vllm-ascend-ubuntu-24.04-py3.11-wheel       582 KB  sha256:4b6fab25c7491ee96abf5850e1d4bbcdb9f7b9ca9d30b83a38dea11f82b0704b
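
A minimal sketch of how a downloaded wheel could be checked against the digest listed above, assuming the artifact has already been downloaded and extracted locally; the local wheel filename used here is hypothetical and should be replaced with the actual file from this run:

```python
import hashlib
from pathlib import Path

# Expected digest copied from the artifact listing above
# (vllm-ascend-ubuntu-24.04-py3.11-wheel).
EXPECTED_SHA256 = "4b6fab25c7491ee96abf5850e1d4bbcdb9f7b9ca9d30b83a38dea11f82b0704b"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical local filename; adjust to the wheel actually downloaded
    # from this run's artifacts.
    wheel = Path("vllm_ascend-cp311-linux_x86_64.whl")
    actual = sha256_of(wheel)
    print("digest matches" if actual == EXPECTED_SHA256 else f"digest mismatch: {actual}")
```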