Bugfix_091dev_OOM #2910
base: v0.9.1-dev
Conversation
Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Code Review
This pull request aims to fix an out-of-memory error on decoder nodes by reducing the chunked_prefill_workspace_size. The change correctly identifies the consumer node and applies a smaller limit. However, the implementation for the consumer node is unnecessarily complex. I've provided a suggestion to simplify the logic, which makes the code more readable and maintainable while achieving the same result.
if vllm_config.kv_transfer_config.is_kv_consumer:
    max_chunked_size = scheduler_config.max_num_seqs * self.block_size
else:
    max_chunked_size = 128 * 1024
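The bot's concrete suggestion is not included in this excerpt. As a rough, hypothetical sketch only, the decision above could be factored into a small standalone helper (the function name and signature are assumptions, not part of the PR):

# Hypothetical helper illustrating the same decision as the hunk above;
# the actual change lives inside the attention metadata builder.
def choose_max_chunked_size(is_kv_consumer: bool, max_num_seqs: int,
                            block_size: int) -> int:
    # On a KV consumer (decode) node, bound the chunked prefill workspace
    # by the decode batch size instead of the generic 128 * 1024 token cap.
    return max_num_seqs * block_size if is_kv_consumer else 128 * 1024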
Constant values like this should be defined at the top of the file in uppercase, and don't forget to add a comment explaining them.
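A sketch of what that request could look like (the constant name and comment are illustrative assumptions, not taken from the PR):

# Hypothetical module-level constant, defined near the top of the file.
# Default upper bound (in tokens) for the chunked prefill workspace.
MAX_CHUNKED_PREFILL_WORKSPACE_SIZE = 128 * 1024

The else branch above would then read max_chunked_size = MAX_CHUNKED_PREFILL_WORKSPACE_SIZE instead of repeating the magic number.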
self.chunked_prefill_enabled = scheduler_config.chunked_prefill_enabled
if self.chunked_prefill_enabled:
    if vllm_config.kv_transfer_config.is_kv_consumer:
        max_chunked_size = scheduler_config.max_num_seqs * self.block_size
Is the OOM caused by the batch size being greater than 1024?
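For context on where the 1024 figure could come from, a hypothetical bit of arithmetic (block_size = 128 is an assumption; the actual block size is not stated in this excerpt): max_num_seqs * block_size only stays below the previous 128 * 1024 token cap while max_num_seqs is under 1024.

# Illustrative arithmetic only; block_size = 128 is an assumed value.
block_size = 128
previous_cap = 128 * 1024  # workspace size in tokens before this change
for max_num_seqs in (256, 1024, 2048):
    consumer_size = max_num_seqs * block_size
    print(max_num_seqs, consumer_size, consumer_size < previous_cap)
# 256  -> 32768   True  (smaller workspace than before)
# 1024 -> 131072  False (equal to the old cap)
# 2048 -> 262144  False (would exceed the old cap)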
What this PR does / why we need it?
Resolves the issue of the chunked prefill workspace size being too large, which leads to out-of-memory (OOM) errors on decode (D) nodes.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Tested by CI.