@fems14 fems14 commented Sep 22, 2025

What this PR does / why we need it?

This PR adds a new kv_role, "kv_both", for mixed deployment scenarios. A mixed deployment includes a decode phase, during which with_prefill should be false.

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: fems14 <1804143737@qq.com>

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@github-actions github-actions bot added the "documentation" label (Improvements or additions to documentation) Sep 22, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a bugfix for the mooncakestore connector to correctly handle the kv_both role in mixed deployment scenarios. The changes are consistent across the example deployment guide and the core logic in mooncake_engine.py and model_runner_v1.py. The logic to differentiate pure producers from mixed producer/consumer nodes for dummy runs is sound. The updated configurations in the deployment guide align with the code changes. Overall, the changes appear correct and effectively address the issue.
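The distinction between pure producers and mixed producer/consumer nodes can be illustrated with a minimal sketch. This is not the code from mooncake_engine.py or model_runner_v1.py: the `KVRole` class and the `dummy_run_with_prefill` helper are hypothetical names introduced here, and the sketch assumes that "kv_both" counts as both producer and consumer, so only a pure "kv_producer" node runs its dummy pass with prefill.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a KV transfer config; not the real vLLM class.
@dataclass(frozen=True)
class KVRole:
    kv_role: str  # one of "kv_producer", "kv_consumer", "kv_both"

    @property
    def is_producer(self) -> bool:
        # Assumption: a "kv_both" node also produces KV cache.
        return self.kv_role in ("kv_producer", "kv_both")

    @property
    def is_consumer(self) -> bool:
        # Assumption: a "kv_both" node also consumes KV cache.
        return self.kv_role in ("kv_consumer", "kv_both")


def dummy_run_with_prefill(role: KVRole) -> bool:
    """Return True only for pure producers (prefill-only nodes).

    A mixed ("kv_both") node also decodes, so under the assumption above
    its dummy run uses with_prefill=False.
    """
    return role.is_producer and not role.is_consumer


if __name__ == "__main__":
    for name in ("kv_producer", "kv_consumer", "kv_both"):
        print(name, dummy_run_with_prefill(KVRole(name)))
```

Checking "producer and not consumer" rather than "producer" alone is what keeps a mixed node out of the prefill-only path in this sketch.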

@fems14 fems14 changed the title from "mooncakestore connector mix bugfix" to "Fix of DeepSeek-V3 Error in KV Pool Mixed Deployment Scenario" Sep 22, 2025
@fems14 fems14 changed the title from "Fix of DeepSeek-V3 Error in KV Pool Mixed Deployment Scenario" to "Fix of DeepSeek Error in KV Pool Mixed Deployment Scenario" Sep 22, 2025
@MengqingCao MengqingCao added the "ready-for-test" (start test by label for PR) and "ready" (read for review) labels Sep 22, 2025
@wangxiyuan wangxiyuan merged commit 1c9f0fe into vllm-project:main Sep 22, 2025
48 checks passed
Mercykid-bash pushed a commit to Mercykid-bash/vllm-ascend that referenced this pull request Sep 22, 2025
…ect#3087)

### What this PR does / why we need it?
A new kv_role "kv_both" is added to run mixed deployment scenarios. The
mixed deployment will involve a decode phase, where with_prefill should
be false.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main:
vllm-project/vllm@c60e613

Signed-off-by: fems14 <1804143737@qq.com>
Signed-off-by: Che Ruan <cr623@ic.ac.uk>