[RFC]: P/D Disaggregation Support #841

Open
3 of 12 tasks
MengqingCao opened this issue May 14, 2025 · 0 comments
Labels
RFC Request For Comments

Motivation.

P/D Disaggregation plays an important role in deploying vllm inference services in large-scale clusters. vllm-ascend already has initial P/D Disaggregation support, and we'll continue to develop it with more parallel mechanisms, including tp, ep, and dp, as well as graph mode integration.

The related CI for the 1p1d and xpyd scenarios will be integrated step by step, both with and without parallel mechanisms such as tp, ep, and dp.

Proposed Change.

P/D Disaggregation

CI Machine Preparation

UT Integration

Feature coverage matrix

P/D Disaggregation tp ep dp
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
1p1d/xpyd
  • Basic P/D Disaggregation without parallel mechanisms.
  • Adding the above parallel mechanisms (tp, ep, dp).
  • Adding graph mode.
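
As a rough illustration of the coverage matrix above, the sketch below enumerates candidate CI cases by pairing each deployment shape (1p1d, xpyd) with every subset of the parallel mechanisms (tp, ep, dp), starting from the baseline with none enabled. The function name `coverage_matrix` and the choice to cover every subset are assumptions made for illustration; the RFC does not state which combinations will actually be tested.

```python
from itertools import combinations

# Deployment shapes and parallel mechanisms named in the RFC.
DEPLOYMENTS = ("1p1d", "xpyd")
PARALLEL_MODES = ("tp", "ep", "dp")


def coverage_matrix(deployments=DEPLOYMENTS, modes=PARALLEL_MODES):
    """Return (deployment, enabled_modes) pairs, baseline first.

    Hypothetical helper: it assumes every subset of the parallel
    mechanisms is a candidate CI case, which the RFC does not confirm.
    """
    cases = []
    for size in range(len(modes) + 1):
        for enabled in combinations(modes, size):
            for deployment in deployments:
                cases.append((deployment, enabled))
    return cases


if __name__ == "__main__":
    for deployment, enabled in coverage_matrix():
        label = "+".join(enabled) if enabled else "baseline"
        print(f"{deployment:5s} {label}")
```

Ordering the cases by the number of enabled mechanisms mirrors the step-by-step plan: the baseline runs first, then single mechanisms, then combinations.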
@MengqingCao MengqingCao added the RFC Request For Comments label May 14, 2025
@MengqingCao MengqingCao changed the title [RFC]: Building CI for P/D Disaggregation [RFC]: P/D Disaggregation Support May 14, 2025
wangxiyuan pushed a commit that referenced this issue May 14, 2025
### What this PR does / why we need it?
Add basic CI for PD disaggregation, and enable it on a schedule and when a
pull request is labeled with `module:pd`

- Updated `.github/actionlint.yaml` to add a new self-hosted runner
configuration: `linux-arm64-npu-static-8`.
- Introduced a new GitHub Actions workflow
`.github/workflows/vllm_ascend_test_pd.yaml` for PD disaggregation
testing:
  - Scheduled to run daily at 23:00 UTC and triggered by the pull request
    label `module:pd`.
  - Added steps for basic installation; other steps will be added in a
    follow-up PR
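
The trigger rule described above can be sketched in Python as follows. This is only an illustration of the gating logic, not the workflow itself, which lives in `.github/workflows/vllm_ascend_test_pd.yaml`. The helper `should_run_pd_ci` and the constant `DAILY_CRON` are hypothetical names; `"0 23 * * *"` is simply the cron form of "daily at 23:00", and the actual expression in the workflow file is not shown here.

```python
# Label that opts a pull request into the PD disaggregation CI.
PD_LABEL = "module:pd"

# Assumed cron form of "daily at 23:00 UTC"; the real workflow file
# may express the schedule differently.
DAILY_CRON = "0 23 * * *"


def should_run_pd_ci(event_name, labels=()):
    """Decide whether the PD CI job should run (hypothetical helper).

    event_name: GitHub Actions event name, e.g. "schedule" or "pull_request".
    labels: label names currently attached to the pull request.
    """
    if event_name == "schedule":
        # The nightly run always executes.
        return True
    if event_name == "pull_request":
        # Pull requests run the job only when explicitly labeled.
        return PD_LABEL in labels
    return False
```

For example, `should_run_pd_ci("pull_request", ["module:pd"])` is true, while an unlabeled pull request is skipped, matching the "No trigger by default" behavior noted in the test section.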

Related: #841

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed
- No trigger by default
<img width="847" alt="image"
src="https://github.com/user-attachments/assets/23aa128f-526d-447f-91c8-8ebf6be8400f"
/>
- Trigger only if we tag with pd
<img width="930" alt="image"
src="https://github.com/user-attachments/assets/aef1caca-2029-48e8-a6e6-860136adcd37"
/>

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>