
Commit ae2438b

Merge pull request #3 from Delphine-Nic/long_seq_tmp
[bugfix] 128K Long Sequence Freezes in CP&SP Scenario
2 parents 5fe2d7d + 6e9b4d1 commit ae2438b


vllm_ascend/worker/model_runner_v1.py

Lines changed: 1 addition & 1 deletion
@@ -242,7 +242,7 @@ def __init__(self, vllm_config: VllmConfig, device: torch.device):
         self.attn_metadata_builder = self.attn_backend.get_builder_cls()(
             vllm_config, device)
         self.attn_mask_builder = AttentionMaskBuilder(
-            self.model_config.max_model_len, self.dtype)
+            self.model_config.max_model_len, self.dtype) if self.cp_size * self.sp_size == 1 else None

         # Set up speculative decoding.
         self.use_aux_hidden_state_outputs = False
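
For context, a minimal sketch of the guarded construction, assuming cp_size and sp_size are the context-parallel and sequence-parallel group sizes, and using a hypothetical, simplified stand-in for AttentionMaskBuilder that eagerly materializes a max_model_len x max_model_len mask. Under that assumption, skipping the builder whenever CP or SP is enabled avoids the large up-front allocation that a 128K max_model_len would otherwise trigger:

import torch


class AttentionMaskBuilder:
    # Hypothetical, simplified stand-in for vllm_ascend's AttentionMaskBuilder:
    # it eagerly materializes a [max_model_len, max_model_len] causal mask,
    # assumed here to be the allocation the patch avoids under CP/SP.
    def __init__(self, max_model_len: int, dtype: torch.dtype):
        self.mask = torch.triu(
            torch.ones(max_model_len, max_model_len), diagonal=1).to(dtype)


def build_attn_mask_builder(max_model_len, dtype, cp_size, sp_size):
    # Mirror of the patched line: only build the mask when neither context
    # parallelism (cp_size) nor sequence parallelism (sp_size) is in use.
    return (AttentionMaskBuilder(max_model_len, dtype)
            if cp_size * sp_size == 1 else None)


# With CP or SP enabled, no mask is pre-built; without them, it is.
assert build_attn_mask_builder(1024, torch.float16, cp_size=2, sp_size=1) is None
assert build_attn_mask_builder(1024, torch.float16, cp_size=1, sp_size=1) is not None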
