[Perf] Add new npu_fused_infer_attention_score op to improve performance in splitfuse cases and resolve long-seq mask problems #4041
image_a3_ubuntu.yml
on: pull_request
vllm-ascend image build
11m 51s
Artifacts
Produced during runtime
Name | Size | Digest | |
---|---|---|---|
vllm-project~vllm-ascend~2AFGYC.dockerbuild | 105 KB | sha256:3245b1fbaecd9e64143fab0b97a51e4a2070f5c314b1eddd362f524dbc9a82ee | |