Support pipeline parallel in V1 Engine #1700

weiguihua2 · 2025-07-09T08:11:52Z

What this PR does / why we need it?

This patch adds pipeline parallel support to the V1 Engine.

Does this PR introduce any user-facing change?

Yes, users can now run pipeline parallel (PP) in the V1 Engine.

How was this patch tested?

Manually tested.

vLLM version: v0.9.2
vLLM main: vllm-project/vllm@31d5c17

codecov · 2025-07-09T10:03:41Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 54.42%. Comparing base (c30ddb8) to head (c4f5426).
Report is 112 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1700       +/-   ##
===========================================
+ Coverage   27.39%   54.42%   +27.02%     
===========================================
  Files          56       80       +24     
  Lines        6191    10007     +3816     
===========================================
+ Hits         1696     5446     +3750     
- Misses       4495     4561       +66

Flag	Coverage Δ
unittests	`54.42% <ø> (+27.02%)`	⬆️
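
As a sanity check on the figures above, each coverage percentage should equal hits divided by lines, and hits plus misses should equal lines. A minimal Python sketch using only the numbers in this report (the variable and function names are illustrative, not part of Codecov):

```python
# Numbers copied from the Codecov diff above: hits, misses, and total lines.
base = {"hits": 1696, "misses": 4495, "lines": 6191}
head = {"hits": 5446, "misses": 4561, "lines": 10007}


def coverage_pct(stats: dict) -> float:
    """Return line coverage as a percentage rounded to two decimals."""
    return round(100 * stats["hits"] / stats["lines"], 2)


# Every coverable line is either hit or missed.
assert base["hits"] + base["misses"] == base["lines"]
assert head["hits"] + head["misses"] == head["lines"]

base_pct = coverage_pct(base)
head_pct = coverage_pct(head)
print(f"base {base_pct}%, head {head_pct}%")
```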


MengqingCao · 2025-07-10T06:27:20Z

overall lgtm, please add an e2e test for pp, thanks!

wangxiyuan · 2025-07-10T08:53:09Z

Yes, an e2e test should be added at least. A simple one with qwen3-0.6b + PP=2 in multicard would be welcome.

MengqingCao · 2025-07-11T01:07:27Z

tests/e2e/multicard/test_pipeline_parallel.py

+@pytest.mark.parametrize("pp_size", PIPELINE_PARALLELS)
+def test_models(model: str, tp_size: int, pp_size: int) -> None:
+    # Create an LLM.
+    llm = LLM(


I recommend using tests.conftest.VllmRunner, so we don't need to clean up resources by hand.
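
To illustrate the cleanup point, here is a minimal sketch of the context-manager pattern. `StubRunner` is a hypothetical stand-in written for this example so that it runs without vLLM or NPU hardware; it is not the real `tests.conftest.VllmRunner`, whose constructor and methods may differ. The model id and the `run_case` helper are also assumptions made for illustration.

```python
from contextlib import AbstractContextManager


class StubRunner(AbstractContextManager):
    """Hypothetical stand-in for a test runner that owns engine resources."""

    live_instances = 0  # counts runners whose resources are still held

    def __init__(self, model: str, tensor_parallel_size: int = 1,
                 pipeline_parallel_size: int = 1) -> None:
        self.model = model
        self.tp_size = tensor_parallel_size
        self.pp_size = pipeline_parallel_size
        StubRunner.live_instances += 1

    def generate(self, prompts: list[str]) -> list[str]:
        # A real runner would call the engine; the stub just echoes prompts.
        return [f"{prompt} ..." for prompt in prompts]

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and when the body raises, so the test body
        # needs no explicit teardown.
        StubRunner.live_instances -= 1
        return None


def run_case(model: str, tp_size: int, pp_size: int) -> list[str]:
    # The with-block releases the runner as soon as generation finishes.
    with StubRunner(model,
                    tensor_parallel_size=tp_size,
                    pipeline_parallel_size=pp_size) as runner:
        return runner.generate(["Hello, my name is"])


outputs = run_case("Qwen/Qwen3-0.6B", tp_size=1, pp_size=2)
```

In a real test the same `with` structure would wrap the actual runner, and the parametrized `tp_size` and `pp_size` values would be passed through in the same way.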

Signed-off-by: weiguihua2 <weiguihua2@huawei.com>

weiguihua2 force-pushed the main branch from b735b12 to e547b7a Compare July 10, 2025 01:50

Yikun mentioned this pull request Jul 10, 2025

[0.9.1] Support pipeline parallel in V1 Engine #1557

Open

weiguihua2 force-pushed the main branch from e547b7a to 9b85f1b Compare July 10, 2025 02:59

Yikun changed the title ~~[cherry-pick]engineV1 support pipeline parallel~~ Support pipeline parallel in V1 Engine Jul 10, 2025

weiguihua2 force-pushed the main branch from 9b85f1b to 35c7f37 Compare July 10, 2025 10:53

github-actions bot added the module:tests label Jul 10, 2025

wangxiyuan reviewed Jul 11, 2025

wangxiyuan approved these changes Jul 11, 2025

MengqingCao reviewed Jul 11, 2025

cherry-pick:engineV1 support pipeline parallel

c4f5426

Signed-off-by: weiguihua2 <weiguihua2@huawei.com>

weiguihua2 force-pushed the main branch from 7e737e7 to c4f5426 Compare July 11, 2025 05:30

wangxiyuan merged commit aa4240c into vllm-project:main Jul 11, 2025
22 checks passed

Yikun mentioned this pull request Jul 12, 2025

Significant inference speed difference between vllm-ascend and MindIE #1621

Closed
