
[Dist][EP] Remove ETP/EP maintained in vllm-ascend #1681


Merged 1 commit into vllm-project:main on Jul 21, 2025

Conversation

MengqingCao
Collaborator

@MengqingCao MengqingCao commented Jul 9, 2025

What this PR does / why we need it?

Remove the ETP/EP implementation maintained in the main branch. We drop it because there are no relevant scenarios that use ETP now; we may subsequently advocate implementing expert tensor parallelism in vLLM to support scenarios where experts need to be sliced.

This is part of the #1422 backport.

Fixes #1396 #1154

Does this PR introduce any user-facing change?

We will no longer maintain ETP/EP in vllm-ascend; the TP/EP implementation in vLLM is used instead.

How was this patch tested?

CI passed with newly added and existing tests.


codecov bot commented Jul 9, 2025

Codecov Report

Attention: Patch coverage is 36.36364% with 7 lines in your changes missing coverage. Please review.

Project coverage is 53.41%. Comparing base (f9dfde0) to head (fa7a1f0).
Report is 13 commits behind head on main.

Files with missing lines | Patch % | Lines
vllm_ascend/ops/fused_moe.py | 12.50% | 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1681      +/-   ##
==========================================
- Coverage   54.18%   53.41%   -0.77%     
==========================================
  Files          74       72       -2     
  Lines        9235     9053     -182     
==========================================
- Hits         5004     4836     -168     
+ Misses       4231     4217      -14     
Flag | Coverage Δ
unittests | 53.41% <36.36%> (-0.77%) ⬇️


@MengqingCao MengqingCao force-pushed the etp branch 2 times, most recently from 0e03620 to 5a32cb8 on July 10, 2025 03:09

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@@ -129,6 +129,7 @@ def _pangu_torchair_test_fixture(
distributed_executor_backend="mp",
enforce_eager=False,
additional_config=additional_config,
enable_expert_parallel=True,
Collaborator Author

We must set enable_expert_parallel to True when running PanGu with the EP implementation in vLLM.
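
For reference, a minimal offline-inference sketch of what this means on the user side, assuming vLLM's standard LLM/EngineArgs keyword arguments; the model path and parallel degree below are placeholders, not taken from this PR.

from vllm import LLM

llm = LLM(
    model="/path/to/pangu-pro-moe",     # placeholder checkpoint path
    tensor_parallel_size=4,             # placeholder TP degree
    distributed_executor_backend="mp",
    enforce_eager=False,
    enable_expert_parallel=True,        # needed until disabling EP is supported for PanGu
)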

Collaborator Author

@MengqingCao MengqingCao Jul 14, 2025

We have to do this until disabling EP is supported for PanGu. Please help review, thanks! cc @Angazenn

Collaborator

So maybe the related doc should be updated as well. For example https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_npu_moge.html

Collaborator Author

So maybe the related doc should be updated as well. For example https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_npu_moge.html

Now the example is updated, thanks!


@MengqingCao
Collaborator Author

@jianzs @ttanzhiqiang could you help review this pr?

Comment on lines +146 to +147
tp_rank = get_tp_group().rank_in_group
tp_size = get_tp_group().world_size
Collaborator

ditto

Collaborator Author

I think it's fine to use the TP group here, since this is the weight loader for the linear layer.
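
To illustrate why the TP group is the natural choice here, a minimal sketch of a tensor-parallel weight loader, assuming vLLM's get_tp_group accessor; the helper name and the choice of shard dimension are illustrative and not the PR's actual linear-layer code.

import torch
from vllm.distributed import get_tp_group

def load_tp_sharded_weight(param: torch.Tensor, full_weight: torch.Tensor,
                           shard_dim: int = 0) -> None:
    # Each tensor-parallel rank keeps only its slice of the full checkpoint weight.
    tp_rank = get_tp_group().rank_in_group
    tp_size = get_tp_group().world_size
    shard_size = full_weight.shape[shard_dim] // tp_size
    shard = full_weight.narrow(shard_dim, tp_rank * shard_size, shard_size)
    param.data.copy_(shard)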

@ttanzhiqiang
Contributor

@MengqingCao

  1. If prefill and decode both use the AllGather or NaiveMulticast solution, this is ETP logic.
  2. If prefill and decode both use the All2All/MC2 solution, this is EP logic.
  3. If prefill uses the AllGatherEP solution (enabled via the VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP switch) and decode uses the MC2 solution, this is AllGatherEP logic (see the sketch after this list).
  4. In the prefill/decode (PD) separation scenario, the strategies used by P and D are separate.
    After this PR, will the get_fused_moe_state function still have ep=1?

@MengqingCao
Collaborator Author

@MengqingCao

  1. If prefill and decode both use the AllGather or NaiveMulticast solution, this is ETP logic.
  2. If prefill and decode both use the All2All/MC2 solution, this is EP logic.
  3. If prefill uses the AllGatherEP solution (enabled via the VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP switch) and decode uses the MC2 solution, this is AllGatherEP logic.

Thanks for this info, but I'm not familiar with the different fused MoE states; maybe I need to read more code to understand points 1, 2 and 3 above.

  4. In the prefill/decode (PD) separation scenario, the strategies used by P and D are separate.
    After this PR, will the get_fused_moe_state function still have ep=1?

I think we still have ep=1 when EP is disabled; you can refer to https://github.com/vllm-project/vllm/blob/235bfd5dfe0975e42b115cfb910e73eff5c670d8/vllm/model_executor/layers/fused_moe/config.py#L274-L281
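
In other words (a hedged paraphrase of the linked vLLM config logic, with illustrative names; see the pinned config.py for the real implementation): when expert parallelism is disabled, the EP size collapses to 1.

def resolve_ep_size(tp_size: int, dp_size: int, enable_expert_parallel: bool) -> int:
    # Mirrors the idea in vLLM's fused-MoE parallel config: EP is only active when
    # there is more than one rank and the user has enabled expert parallelism.
    use_ep = dp_size * tp_size > 1 and enable_expert_parallel
    if not use_ep:
        return 1                    # EP disabled: experts replicated, a single EP group
    return tp_size * dp_size        # EP enabled: experts sharded across TP x DP ranks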


This pull request has conflicts, please resolve those before we can evaluate the pull request.

@MengqingCao
Collaborator Author

@ttanzhiqiang maybe we should remove the best example on A2 in #1101, WDYT?

@ApsarasX
Collaborator

@ttanzhiqiang maybe we should remove the best example on A2 in #1101, WDYT?

I think you can remove these scripts, as we only need to maintain them internally on our side.

@MengqingCao
Collaborator Author

@ttanzhiqiang maybe we should remove the best example on A2 in #1101, WDYT?

I think you can remove these scripts, as we only need to maintain them internally on our side.

OK. I want to remove it mainly because it is a best-practice example that uses ETP, which is removed here. I'll remove it then.

@MengqingCao MengqingCao force-pushed the etp branch 2 times, most recently from ae9f97d to cf0c584 on July 17, 2025 06:29
Signed-off-by: MengqingCao <cmq0113@163.com>
@MengqingCao
Collaborator Author

Hi @jianzs @ttanzhiqiang @ApsarasX, your suggestions are addressed now. Could you take a look again? Thanks!

@wangxiyuan wangxiyuan merged commit 8cfd257 into vllm-project:main Jul 21, 2025
24 checks passed
@wangxiyuan
Collaborator

Let's merge this first. Before the next release, we should do deep testing of TP and EP.

@MengqingCao MengqingCao deleted the etp branch July 21, 2025 01:50
@MingXiangL

Let's merge this first. Before the next release, we should do deep testing of TP and EP.

Do you have a timeline for the next release? Also, are there any temporary solutions for the bug described in #1396?


Successfully merging this pull request may close these issues.

[Bug]: assert self.cpu_group is not None
6 participants