[Scheduler] validate max_num_batched_tokens and max_model_len in AscendSchedulerConfig #2434
Conversation
Code Review
This pull request introduces a validation check in AscendSchedulerConfig to prevent configurations where max_num_batched_tokens is less than max_model_len when chunked prefill is disabled. This is a valuable correctness fix that avoids unexpected rejection of long sequences. The implementation is sound, and the associated test updates and additions are thorough, covering both valid and invalid scenarios. The changes improve the robustness and user-friendliness of the scheduler configuration.
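For reference, a minimal sketch of the kind of check being described (the class and field names follow the PR title and this review; the exact attributes, defaults, and error message in vllm-ascend's AscendSchedulerConfig are assumptions, not the merged code):

```python
from dataclasses import dataclass


@dataclass
class AscendSchedulerConfig:
    max_num_batched_tokens: int
    max_model_len: int
    enable_chunked_prefill: bool = False

    def __post_init__(self) -> None:
        # Without chunked prefill, a prompt must be prefilled in a single
        # scheduler step, so the per-step token budget has to cover the
        # longest sequence the model is allowed to accept.
        if (not self.enable_chunked_prefill
                and self.max_num_batched_tokens < self.max_model_len):
            raise ValueError(
                f"max_num_batched_tokens ({self.max_num_batched_tokens}) "
                f"must be greater than or equal to max_model_len "
                f"({self.max_model_len}) when chunked prefill is disabled.")
```

Failing fast at config time like this turns a confusing runtime symptom (long requests silently rejected) into an immediate, actionable error.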
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.
Force-pushed from 3bc18ce to f697b20

…cendSchedulerConfig
Signed-off-by: linfeng-yuan <1102311262@qq.com>

Force-pushed from f697b20 to 3ed37b2
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #2434      +/-   ##
==========================================
+ Coverage   77.37%   77.39%   +0.01%
==========================================
  Files         128      128
  Lines       16455    16468      +13
==========================================
+ Hits        12732    12745      +13
  Misses       3723     3723
==========================================

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Please update the commit message. Thanks!
What this PR does / why we need it?
Add configuration check logic for the Ascend scheduler: if chunked prefill is disabled, max_num_batched_tokens cannot be less than max_model_len, following vLLM's behavior.
Does this PR introduce any user-facing change?
Users can no longer set max_num_batched_tokens smaller than max_model_len with the Ascend scheduler.
How was this patch tested?
CI and vLLM serving passed.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@f77a080
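For illustration, a configuration that this check now rejects at startup might look like the sketch below. The additional_config key for enabling the Ascend scheduler follows vllm-ascend's documented usage, and the model name and token values are placeholders; treat the exact parameter spelling as an assumption rather than a verified API reference.

```python
from vllm import LLM

# With the Ascend scheduler enabled and chunked prefill disabled, a token
# budget below max_model_len is now rejected when the engine is built,
# instead of surfacing later as rejected long prompts at runtime.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model
    max_model_len=8192,
    max_num_batched_tokens=4096,  # < max_model_len -> raises ValueError
    enable_chunked_prefill=False,
    additional_config={"ascend_scheduler_config": {"enabled": True}},
)
```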