
Commit 507dce5

[BugFix] Fix flashcomm_v1 when engine v0 (#1859)
### What this PR does / why we need it?
Fix the missing `attn_state` attribute bug that occurred in the engine-v0 scenario when Flashcomm_v1 was enabled.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with newly added and existing tests.

Signed-off-by: rjg-lyh <1318825571@qq.com>
1 parent a394155 commit 507dce5

File tree

1 file changed: +2 -0 lines changed


vllm_ascend/models/qwen2.py

Lines changed: 2 additions & 0 deletions
```diff
@@ -3,6 +3,7 @@
 
 import torch
 import torch.nn.functional as F
+import vllm.envs as envs
 from torch import nn
 from transformers import Qwen2Config
 from vllm.compilation.decorators import support_torch_compile
@@ -154,6 +155,7 @@ def forward(
         flashcomm_v1_enabled = False
         attn_metadata = get_forward_context().attn_metadata
         if ascend_envs.VLLM_ASCEND_ENABLE_FLASHCOMM == 1 and \
+            envs.VLLM_USE_V1 and \
             attn_metadata is not None and \
             attn_metadata.attn_state != AscendAttentionState.DecodeOnly:
             flashcomm_v1_enabled = True
```
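Per the commit message, the bug is that v0-engine attention metadata does not carry an `attn_state` attribute, so the old condition raised `AttributeError` as soon as it evaluated `attn_metadata.attn_state`. Because Python's `and` short-circuits left to right, placing the `envs.VLLM_USE_V1` check ahead of the metadata checks guarantees that attribute is never touched on v0. Below is a minimal, self-contained sketch of that guard; the classes `V0AttnMetadata`/`V1AttnMetadata` and the helper `flashcomm_v1_should_enable` are hypothetical stand-ins for illustration, not names from the repository.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AscendAttentionState(Enum):
    PrefillOnly = auto()
    DecodeOnly = auto()


@dataclass
class V1AttnMetadata:
    # Hypothetical v1-style metadata: carries the attention state
    # that the condition inspects.
    attn_state: AscendAttentionState


@dataclass
class V0AttnMetadata:
    # Hypothetical v0-style metadata: no attn_state attribute at all,
    # so touching metadata.attn_state raises AttributeError.
    num_prefill_tokens: int = 0


def flashcomm_v1_should_enable(flashcomm_flag: int, use_v1: bool,
                               metadata) -> bool:
    # `and` evaluates left to right and stops at the first falsy operand,
    # so when use_v1 is False the metadata.attn_state access below is
    # never reached -- the same ordering that makes the patched check
    # safe on engine v0.
    return (flashcomm_flag == 1
            and use_v1
            and metadata is not None
            and metadata.attn_state != AscendAttentionState.DecodeOnly)


# Without the use_v1 guard, the v0 call would raise AttributeError.
assert flashcomm_v1_should_enable(1, False, V0AttnMetadata()) is False
assert flashcomm_v1_should_enable(
    1, True, V1AttnMetadata(AscendAttentionState.PrefillOnly)) is True
assert flashcomm_v1_should_enable(
    1, True, V1AttnMetadata(AscendAttentionState.DecodeOnly)) is False
```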
