[Perf] Add new npu_fused_infer_attention_score op to improve performance in splitfuse cases and resolve long-seq mask problems #9583
image_ubuntu.yml
on: pull_request
vllm-ascend image build
10m 51s
Artifacts
Produced during runtime
Name | Size | Digest
---|---|---
vllm-project~vllm-ascend~IJRFL5.dockerbuild | 106 KB | sha256:106e891f2bfd058b8e7636a60b1c44a250f00fef814b430eb80926b01bc091ae