Collaborator

@whx-sjtu whx-sjtu commented Sep 16, 2025

This PR moves the computation of shared experts onto a separate stream so that it overlaps with the computation of the routed experts.

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Fill in the PR description and use it to write the commit message, so that reviewers and future developers can understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces multi-streaming for Mixture-of-Experts (MoE) models on Ascend NPUs so that the computation of shared experts overlaps with that of the routed experts, which is a good performance optimization. The implementation logic for stream management appears correct. My review focuses on improving the robustness of the newly added utility functions npu_stream_switch and npu_wait_stream. With checks for None streams, these functions become safer and more reliable for future use across the codebase, preventing potential AttributeError exceptions and unexpected behavior.
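
The None checks suggested in the review can be illustrated with a minimal, device-free sketch. It is not the code from this PR: FakeStream, current_stream, stream_switch_guarded, wait_stream_guarded, and the enabled flag are all hypothetical stand-ins chosen for illustration, and the real utilities operate on NPU streams rather than on the toy objects used here.

```python
from contextlib import contextmanager, nullcontext


class FakeStream:
    """Hypothetical stand-in for a device stream; it only records waits."""

    def __init__(self, name):
        self.name = name
        self.waited_on = []

    def wait_stream(self, other):
        # A real stream would block its queued work until `other` finishes.
        self.waited_on.append(other.name)


# Toy "current stream" stack; real frameworks track this internally.
_stream_stack = [FakeStream("default")]


def current_stream():
    return _stream_stack[-1]


@contextmanager
def _switch(stream):
    # Make `stream` current for the duration of the with-block.
    _stream_stack.append(stream)
    try:
        yield stream
    finally:
        _stream_stack.pop()


def stream_switch_guarded(target_stream, enabled=True):
    """Switch to target_stream, or do nothing if disabled or the stream is None."""
    if not enabled or target_stream is None:
        return nullcontext()
    return _switch(target_stream)


def wait_stream_guarded(current, target, enabled=True):
    """Make `current` wait on `target`, skipping the call if either is None."""
    if enabled and current is not None and target is not None:
        current.wait_stream(target)


main = current_stream()
side = FakeStream("shared_experts")

with stream_switch_guarded(side):
    inside = current_stream()      # the side stream is current here

with stream_switch_guarded(None):
    inside_none = current_stream() # falls back to the default stream

wait_stream_guarded(main, side)    # recorded
wait_stream_guarded(main, None)    # silently skipped, no AttributeError
```

Returning a no-op context manager for a missing stream keeps call sites identical whether or not multi-streaming is enabled, which is the robustness property the review asks for.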

This pull request has conflicts. Please resolve them before we can evaluate the pull request.

@wangxiyuan
Collaborator

  1. Add an e2e test.
  2. Set the default value to True.

@github-actions github-actions bot added documentation Improvements or additions to documentation module:tests labels Sep 17, 2025
@whx-sjtu
Collaborator Author

  1. set the default value to True

This will be done in PR #2681

@wangxiyuan wangxiyuan added ready read for review ready-for-test start test by label for PR labels Sep 18, 2025

This pull request has conflicts. Please resolve them before we can evaluate the pull request.

@github-actions github-actions bot added merge-conflicts and removed ready read for review labels Sep 18, 2025
@whx-sjtu whx-sjtu added the high high priority issue label Sep 18, 2025
Signed-off-by: whx-sjtu <2952154980@qq.com>
@whx-sjtu whx-sjtu removed the high high priority issue label Sep 18, 2025
@wangxiyuan wangxiyuan added the ready read for review label Sep 18, 2025
@wangxiyuan wangxiyuan merged commit 0a52676 into vllm-project:main Sep 19, 2025
33 of 34 checks passed
Labels
documentation Improvements or additions to documentation module:core module:ops module:tests ready read for review ready-for-test start test by label for PR