Skip to content

Conversation

jesse996
Copy link
Contributor

@jesse996 jesse996 commented Sep 8, 2025

What this PR does / why we need it?

based on the vllm-project/vllm#23770,
fix Async scheduling and PP compatibility with DP, also fixes issue with finished requests not being processed in async scheduling and PP cases, and possible worker race conditions.

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: jesse <szxfml@gmail.com>
Copy link

github-actions bot commented Sep 8, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to fix issues related to asynchronous scheduling and pipeline parallelism (PP) compatibility with data parallelism (DP). The changes in vllm_ascend/worker/worker_v1.py introduce a check to prevent deadlocks when no tokens are scheduled and refactor the model execution logic for intermediate pipeline ranks. While the refactoring improves clarity, it introduces a critical type error where a function returns a value that violates its type hint, potentially causing runtime failures. My review focuses on correcting this type violation to ensure the fix is robust.

Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Copy link

codecov bot commented Sep 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.25%. Comparing base (1bbb20e) to head (eada40c).
⚠️ Report is 26 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2796      +/-   ##
==========================================
+ Coverage   74.76%   75.25%   +0.48%     
==========================================
  Files         150      154       +4     
  Lines       20891    21413     +522     
==========================================
+ Hits        15620    16115     +495     
- Misses       5271     5298      +27     
Flag Coverage Δ
unittests 75.25% <100.00%> (+0.48%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
@jesse996 jesse996 force-pushed the async-sched-dp branch 3 times, most recently from 3392700 to fcf6a9a Compare September 10, 2025 02:35
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@jesse996 jesse996 force-pushed the async-sched-dp branch 2 times, most recently from c14a251 to 629641c Compare September 10, 2025 06:51
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
Signed-off-by: jesse <szxfml@gmail.com>
@wangxiyuan wangxiyuan added ready read for review ready-for-test start test by label for PR labels Sep 18, 2025
@wangxiyuan wangxiyuan merged commit 833cd1b into vllm-project:main Sep 19, 2025
47 checks passed
@jesse996 jesse996 deleted the async-sched-dp branch September 19, 2025 05:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:tests ready read for review ready-for-test start test by label for PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants