[Perf] Add new npu_fused_infer_attention_score op to improve performance in splitfuse cases and resolve long-seq mask problems #12887

Triggered via pull request on September 20, 2025 04:16
Status: Cancelled
Total duration: 28m 2s
Artifacts

vllm_ascend_test.yaml

on: pull_request
Matrix: singlecard e2e test - light
Matrix: unit test
Matrix: multicard e2e test - light
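
The run comes from the vllm_ascend_test.yaml workflow, which fans out into the three job matrices listed above on every pull_request. Below is a minimal sketch of how such a matrix layout could be expressed; the runner label linux-aarch64-a2-2 and vllm version v0.10.2 are taken from the annotations later in this page, while the job names, test paths, checkout action, and the unit-test runner are hypothetical placeholders, not the contents of the actual workflow file.

```yaml
# Hypothetical sketch of the matrix layout; not the actual vllm_ascend_test.yaml.
name: vllm_ascend_test

on:
  pull_request:

jobs:
  unit-test:
    strategy:
      matrix:
        vllm_version: [v0.10.2]        # version seen in the run's annotations
    runs-on: ubuntu-latest             # assumed runner for unit tests
    steps:
      - uses: actions/checkout@v4
      - run: pytest tests/ut           # assumed test path

  singlecard-e2e-light:
    strategy:
      matrix:
        vllm_version: [v0.10.2]
    runs-on: linux-aarch64-a2          # assumed single-card self-hosted label
    steps:
      - uses: actions/checkout@v4
      - run: pytest tests/e2e/singlecard   # assumed test path

  multicard-e2e-light:
    strategy:
      matrix:
        runner: [linux-aarch64-a2-2]   # self-hosted label seen in the annotations
        vllm_version: [v0.10.2]
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - run: pytest tests/e2e/multicard    # assumed test path
```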

Annotations

4 errors

unit test (v0.10.2): Process completed with exit code 1.
multicard e2e test - light (linux-aarch64-a2-2, v0.10.2): Executing the custom container implementation failed. Please contact your self hosted runner administrator.
multicard e2e test - light (linux-aarch64-a2-2, v0.10.2): Canceling since a higher priority waiting request for test-refs/pull/2962/merge exists
test: Canceling since a higher priority waiting request for test-refs/pull/2962/merge exists