@MengqingCao MengqingCao commented Jun 6, 2025

What this PR does / why we need it?

Add `with_prefill_across_dp` to `AscendMetadata` to fix DP

This PR fixes the bug introduced by #1012, which adds an argument `with_prefill_across_dp` when `dp_size > 1`.
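
To illustrate the kind of mismatch described above, here is a minimal sketch. It assumes the failure was a caller passing `with_prefill_across_dp` to a metadata class that did not declare it; that is one plausible failure mode, not something the PR text confirms. `MetadataBefore`, `MetadataAfter`, `build`, and `num_actual_tokens` are hypothetical names for illustration, not the real `AscendMetadata` definition.

```python
from dataclasses import dataclass


@dataclass
class MetadataBefore:
    # Hypothetical stand-in for the metadata class before the fix:
    # it does not declare with_prefill_across_dp.
    num_actual_tokens: int


@dataclass
class MetadataAfter:
    # Hypothetical stand-in after the fix: the field exists, with a
    # default so callers that never pass it keep working.
    num_actual_tokens: int
    with_prefill_across_dp: bool = False


def build(metadata_cls, num_actual_tokens, dp_size, with_prefill_across_dp):
    # Assumed caller behavior: the extra argument is only forwarded
    # when data parallelism is active (dp_size > 1).
    kwargs = {"num_actual_tokens": num_actual_tokens}
    if dp_size > 1:
        kwargs["with_prefill_across_dp"] = with_prefill_across_dp
    return metadata_cls(**kwargs)
```

Under these assumptions, `build(MetadataBefore, 8, 2, True)` raises `TypeError` because of the unexpected keyword argument, while the same call with `MetadataAfter` succeeds. With `dp_size == 1` both classes work, which would explain why the problem only shows up with DP enabled.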

Signed-off-by: MengqingCao <cmq0113@163.com>
@wangxiyuan wangxiyuan added the ready read for review label Jun 6, 2025
@wangxiyuan wangxiyuan merged commit c466324 into vllm-project:main Jun 6, 2025
26 checks passed
Yikun pushed a commit that referenced this pull request Jun 26, 2025
### What this PR does / why we need it?
After #1094, decode may be executed in non-compiled mode regardless of
`torchair_graph_config.enabled`. This causes multistream MLA to fail,
because it assumes torchair compiled mode for decode whenever
`torchair_graph_config.enabled == True`.
This PR augments that assumption to fix the failure.
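
As a rough sketch of what a stricter check could look like: the config flag alone is no longer treated as proof that decode runs compiled, and the actual run mode is checked as well. All names here (`should_use_multistream_mla`, `graph_enabled`, `decode_is_compiled`, `multistream_requested`) are hypothetical and chosen for illustration; the real condition in the code base may differ.

```python
def should_use_multistream_mla(
    multistream_requested: bool,
    graph_enabled: bool,
    decode_is_compiled: bool,
) -> bool:
    # Hypothetical guard. The old assumption would be
    # `multistream_requested and graph_enabled`; the extra term covers
    # the case where decode falls back to non-compiled mode even though
    # the graph config is enabled.
    return multistream_requested and graph_enabled and decode_is_compiled
```

With this guard, a decode step that runs in non-compiled mode takes the regular MLA path even when the graph config is enabled.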

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested both offline and with the graph-mode MLA e2e test case.

---------

Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
sdmyzlp added a commit to sdmyzlp/vllm-ascend that referenced this pull request Jun 27, 2025
After vllm-project#1094, decode may be executed in non-compiled mode regardless of
`torchair_graph_config.enabled`. This causes multistream MLA to fail,
because it assumes torchair compiled mode for decode whenever
`torchair_graph_config.enabled == True`.
This PR augments that assumption to fix the failure.

Tested-by: weiguihua2 <weiguihua2@huawei.com>
Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
weijinqian0 pushed a commit to weijinqian0/vllm-ascend that referenced this pull request Jun 30, 2025
Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
@MengqingCao MengqingCao deleted the fixdp branch July 8, 2025 02:27