-
Notifications
You must be signed in to change notification settings - Fork 468
[Bugfix] Fix mc2 operator error in aclgraph + ep<16 scenario #2609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a quickfix for an mc2
operator error that occurs with aclgraph
when the expert parallel size is less than 16. The fix involves forcing the allgathercommimpl
communication method in this scenario. Additionally, aclgraph
is disabled for w8a8
tests. My review focuses on making the main fix more precise to avoid potential performance regressions on hardware not affected by the bug.
vllm_ascend/ops/common_fused_moe.py
Outdated
moe_comm_method_name = forward_context.moe_comm_method_name | ||
if not self.moe_config.use_ep and moe_comm_method_name != "dummycommimpl": | ||
# TODO: Adjusted logic to differentiate between A2 and A3, we check ep_size here since mc2 only support ep_size >= 16 on A3 now | ||
if self.moe_config.ep_size < 16 and moe_comm_method_name != "dummycommimpl": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The added comment on line 244 states this limitation is specific to A3 hardware. However, this condition applies the restriction unconditionally, which could cause performance regressions on other hardware like A2. To align with the comment and make the fix more targeted, consider making the condition hardware-specific by checking the is_ascend_a3
flag from the mc2commimpl
instance.
if self.moe_config.ep_size < 16 and moe_comm_method_name != "dummycommimpl": | |
if self.mc2commimpl.is_ascend_a3 and self.moe_config.ep_size < 16 and moe_comm_method_name != "dummycommimpl": |
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #2609 +/- ##
=======================================
Coverage 72.40% 72.40%
=======================================
Files 146 146
Lines 21598 21598
=======================================
Hits 15639 15639
Misses 5959 5959
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Signed-off-by: MengqingCao <cmq0113@163.com>
7662691
to
018ba13
Compare
all the CI passed, plz help merge this to recover CI cc @Yikun @wangxiyuan |
…into main_829 * 'main_829' of https://github.com/raindaywhu/vllm-ascend: [torchair]remove aicpu op (vllm-project#2640) bugfix for torchair graph (vllm-project#2639) [CI] fix UT error. (vllm-project#2644) [3/N][Feat][Graph] Support `all-to-all` and quantized models with ACL Graph (vllm-project#2614) [Bugfix] Fix mc2 operator error in aclgraph + ep<16 scenario (vllm-project#2609)
…oject#2609) ### What this PR does / why we need it? 1. quickfix mc2 operator error in aclgraph + ep<16 scenario to recover CI, will be refactorred in the future 2. disable aclgraph when testing w8a8 ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.1.1 - vLLM main: vllm-project/vllm@9508960 Signed-off-by: MengqingCao <cmq0113@163.com> Signed-off-by: lijiaojiao <lijiaojiao990304@163.com>
…oject#2609) ### What this PR does / why we need it? 1. quickfix mc2 operator error in aclgraph + ep<16 scenario to recover CI, will be refactorred in the future 2. disable aclgraph when testing w8a8 ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.1.1 - vLLM main: vllm-project/vllm@9508960 Signed-off-by: MengqingCao <cmq0113@163.com>
…oject#2609) ### What this PR does / why we need it? 1. quickfix mc2 operator error in aclgraph + ep<16 scenario to recover CI, will be refactorred in the future 2. disable aclgraph when testing w8a8 ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.1.1 - vLLM main: vllm-project/vllm@9508960 Signed-off-by: MengqingCao <cmq0113@163.com>
What this PR does / why we need it?
How was this patch tested?
CI passed with existing test.