[AscendScheduler][Bugfix] Remove num_draft_tokens while allocating slots #1718

MengqingCao · 2025-07-10T06:20:53Z

What this PR does / why we need it?

The scheduler no longer needs to calculate num_draft_tokens when allocating slots.

This PR follows the changes in vllm: vllm-project/vllm#20701

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

CI passed with the existing tests.

vLLM version: v0.9.2
vLLM main: vllm-project/vllm@cc876d0

wangxiyuan · 2025-07-10T07:03:51Z

vllm_ascend/core/scheduler.py

@@ -282,15 +282,10 @@ def skip_cur_request():
                    req_index += 1
                    continue

-                num_draft_tokens = max(


Please use vllm_version_is so that this works on both v0.9.2 and main.
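A minimal sketch of the version gate requested here. Everything in it is illustrative: `vllm_version_is` below is a local stand-in for the helper named in the review (its real signature is not shown in this thread), `build_allocate_kwargs` is a hypothetical function, and the draft-token formula is an assumption, since the diff above is truncated.

```python
def vllm_version_is(target: str, installed: str) -> bool:
    """Local stand-in for the helper named in the review (assumed behavior:
    exact match against the installed vLLM version)."""
    return installed == target


def build_allocate_kwargs(num_new_tokens: int,
                          num_computed_tokens: int,
                          num_tokens: int,
                          num_lookahead_tokens: int,
                          installed_version: str) -> dict:
    """Hypothetical helper: collect keyword arguments for slot allocation."""
    kwargs = {"num_lookahead_tokens": num_lookahead_tokens}
    if vllm_version_is("0.9.2", installed_version):
        # Assumed formula: tokens scheduled beyond the request's known tokens
        # are treated as draft tokens. Only the older release still needs it.
        kwargs["num_draft_tokens"] = max(
            num_new_tokens + num_computed_tokens - num_tokens, 0)
    return kwargs


old = build_allocate_kwargs(5, 10, 12, 0, installed_version="0.9.2")
new = build_allocate_kwargs(5, 10, 12, 0, installed_version="main")
print(old)
print(new)
```

With this shape, the 0.9.2 branch can be deleted in one place once support for that release is dropped.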

Signed-off-by: MengqingCao <cmq0113@163.com>

codecov · 2025-07-10T07:42:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 54.57%. Comparing base (c30ddb8) to head (5eacc1b).
Report is 107 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1718       +/-   ##
===========================================
+ Coverage   27.39%   54.57%   +27.18%     
===========================================
  Files          56       80       +24     
  Lines        6191     9968     +3777     
===========================================
+ Hits         1696     5440     +3744     
- Misses       4495     4528       +33

Flag	Coverage Δ
unittests	`54.57% <ø> (+27.18%)`	⬆️
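The percentages in the report follow from the Hits and Lines rows of the coverage diff. A quick check, as a sketch (`coverage_pct` is a name introduced here for illustration, not part of Codecov):

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as the percentage of coverable lines that were hit."""
    return 100.0 * hits / lines


base = coverage_pct(1696, 6191)   # main
head = coverage_pct(5440, 9968)   # this PR
print(f"{base:.2f}% -> {head:.2f}% ({head - base:+.2f})")
# prints "27.39% -> 54.57% (+27.18)"
```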


Signed-off-by: MengqingCao <cmq0113@163.com>

wangxiyuan · 2025-07-10T09:21:20Z

tests/e2e/singlecard/test_aclgraph.py

+MODELS = [
+    "Qwen/Qwen2.5-0.5B-Instruct",
+    # TODO: REVERT ME when oom is fixed
+    # "vllm-ascend/Qwen3-30B-A3B-Puring"


I just tested this locally and the test passed. I guess it's a resource release problem on the CI system.

Maybe. I will raise another PR to fix it.

wangxiyuan reviewed Jul 10, 2025


[AscendScheduler][Bugfix] Remove num_draft_tokens while allocating slots

06be93e

Signed-off-by: MengqingCao <cmq0113@163.com>

MengqingCao force-pushed the ascendscheduler branch from d5051ff to 06be93e on July 10, 2025 07:12

github-actions bot added the documentation label Jul 10, 2025

wangxiyuan approved these changes Jul 10, 2025


ApsarasX approved these changes Jul 10, 2025


skip test on vllm-ascend/Qwen3-30B-A3B-Puring with aclgraph

5eacc1b

Signed-off-by: MengqingCao <cmq0113@163.com>

github-actions bot added the module:tests label Jul 10, 2025

wangxiyuan reviewed Jul 10, 2025


wangxiyuan merged commit cc210f4 into vllm-project:main Jul 10, 2025
22 checks passed
