support fused_moe_allgather_ep #1220
Conversation
Pull Request Overview
This PR adds support for a new fused MoE allgather expert-parallel path (fused_moe_allgather_ep) in both the dynamic quantization workflow and the DeepSeek model.
- Introduces a fused_experts_with_allgather operator in w8a8_dynamic.py (see the dispatch sketch after this list)
- Registers and checks a new VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP env var and etp_group
- Updates DeepSeek V2 to conditionally prefer allgather-based MoE routing over all-to-all
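As a rough caller-side illustration of the points above, the dispatch between the new operator and the existing all-to-all path could look roughly like this. This is a minimal sketch, not the PR's actual code: the placeholder bodies, the fused_experts_with_all2all name, and dispatch_fused_experts are assumptions for illustration only.

```python
import torch


def fused_experts_with_allgather(hidden_states: torch.Tensor) -> torch.Tensor:
    # Placeholder for the new allgather-based operator added by this PR.
    return hidden_states


def fused_experts_with_all2all(hidden_states: torch.Tensor) -> torch.Tensor:
    # Placeholder for the existing all-to-all expert-parallel operator.
    return hidden_states


def dispatch_fused_experts(hidden_states: torch.Tensor,
                           allgather_ep_enabled: bool) -> torch.Tensor:
    """Choose the MoE communication path based on the new env flag
    (hypothetical helper; the real selection happens inside the model/quant code)."""
    if allgather_ep_enabled:
        return fused_experts_with_allgather(hidden_states)
    return fused_experts_with_all2all(hidden_states)
```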
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
File | Description
---|---
vllm_ascend/quantization/w8a8_dynamic.py | Added fused_experts_with_allgather, imported get_etp_group
vllm_ascend/models/deepseek_v2.py | Added fused_experts_allgather_ep_enabled flag and logic tweaks
vllm_ascend/envs.py | Defined the VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP environment variable (registration sketch below)
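For context on the envs.py change, such a flag is usually registered as a lazily evaluated lambda keyed by its name. The snippet below is a sketch that assumes vllm_ascend/envs.py follows the env_variables dict pattern used by vLLM's envs.py; the exact layout in this PR may differ.

```python
import os
from typing import Any, Callable, Dict

# Sketch: declare the new flag and parse "0"/"1" into a bool.
env_variables: Dict[str, Callable[[], Any]] = {
    "VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP":
    lambda: bool(int(os.getenv("VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP", "0"))),
}


def __getattr__(name: str) -> Any:
    # Resolve attribute access (e.g. envs.VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP)
    # against the table above.
    if name in env_variables:
        return env_variables[name]()
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```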
Comments suppressed due to low confidence (2)
vllm_ascend/quantization/w8a8_dynamic.py:349
- This new public function lacks a docstring explaining its purpose, parameters, and return value. Please add a clear docstring to improve maintainability and usability.
def fused_experts_with_allgather(hidden_states: torch.Tensor,
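A docstring along these lines would address the comment. This is only a sketch: the parameter list after hidden_states is abbreviated and the wording is illustrative, not taken from the PR.

```python
import torch


def fused_experts_with_allgather(hidden_states: torch.Tensor, *args, **kwargs):
    """Run fused MoE experts using allgather-based expert parallelism.

    Gathers tokens across the expert-parallel group, dispatches them to the
    local experts, and combines the expert outputs, avoiding the all-to-all
    exchange used by the default path.

    Args:
        hidden_states: Input activations of shape (num_tokens, hidden_size).
        *args, **kwargs: Remaining weight/scale/top-k arguments (omitted here;
            see the actual signature in w8a8_dynamic.py).

    Returns:
        torch.Tensor with the same shape as ``hidden_states``.
    """
    ...
```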
vllm_ascend/models/deepseek_v2.py:332
- [nitpick] The flag enable_alltoall_ep is overloaded to control both all-to-all and allgather behaviors. Consider renaming it to something like alltoall_ep_enabled for clarity.
else:
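One way to act on this nitpick is to keep two explicitly named booleans instead of overloading a single flag. The sketch below uses assumed names (ep_size, allgather_ep); the PR itself may resolve the naming differently.

```python
def moe_comm_flags(ep_size: int, allgather_ep: bool) -> dict:
    """Sketch of the suggested rename: two explicitly named booleans rather
    than one overloaded enable_alltoall_ep flag (names are assumptions)."""
    return {
        "alltoall_ep_enabled": ep_size > 1 and not allgather_ep,
        "allgather_ep_enabled": ep_size > 1 and allgather_ep,
    }


# Usage: moe_comm_flags(ep_size=4, allgather_ep=True)
# -> {"alltoall_ep_enabled": False, "allgather_ep_enabled": True}
```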
This PR can be merged once the CI is happy. cc @Yikun @wangxiyuan
I think my PR can be merged first: it is already ready and thoroughly tested, with complete unit tests, and it has been waiting to be merged for several days. The other PR involves significant refactoring and it's unclear when it can be merged, so I don't want my PR to be blocked by it. @wangxiyuan @Yikun @jianzs
This pull request has conflicts, please resolve those before we can evaluate the pull request.
What this PR does / why we need it?
support fused_moe_allgather_ep
Does this PR introduce any user-facing change?
~~
How was this patch tested?
~~