[Perf] Add new npu_fused_infer_attention_score op to improve performance in splitfuse cases and resolve long-seq mask problems #1282

Triggered via pull request: September 20, 2025 08:30
Status: Cancelled
Total duration: 1s
Artifacts: (none)

vllm_ascend_test_pd.yaml

on: pull_request
Matrix: vLLM Ascend prefilling decoding disaggregation test
Waiting for pending jobs

Annotations

1 error
e2e test / pd-disaggregation: Canceling since a higher priority waiting request for static-8-01-cards exists
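Note: this cancellation message comes from GitHub Actions concurrency control. Only one run may sit queued per concurrency group, so when a newer request for the same group (here, the static-8-01-cards runner pool) arrives, the older pending run is canceled almost immediately, which matches the 1s duration above. As a minimal sketch, assuming vllm_ascend_test_pd.yaml uses a stanza along these lines (the actual file may differ):

    # Hypothetical concurrency stanza; the real vllm_ascend_test_pd.yaml may differ.
    concurrency:
      # Serialize all runs competing for the static-8-01-cards hardware pool.
      group: static-8-01-cards
      # Keep the in-progress run; only queued runs are superseded when a
      # higher-priority request arrives.
      cancel-in-progress: false

With cancel-in-progress set to false the running job completes, but a queued run is replaced by any newer request for the same group, producing exactly the "Canceling since a higher priority waiting request ... exists" annotation shown above.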