[Enhance]: Add validation for expert parallelism settings #1199
Conversation
Add validation to prevent simultaneous use of the `--enable-expert-parallel` and `expert-tensor-parallel-size` configurations. These settings are mutually exclusive, so implementing this check prevents unexpected behavior and improves error tracing. If both settings are enabled concurrently, the system now raises an error, making it easier to identify and resolve configuration issues.

Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>
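For readers skimming the thread, here is a minimal sketch of the kind of check this PR adds. It is illustrative only: the helper name `_validate_expert_parallel_settings` is hypothetical, and it assumes the flag lives at `vllm_config.parallel_config.enable_expert_parallel` while the Ascend-side value is `expert_tensor_parallel_size`; the actual change lives in `check_ascend_config` in `vllm_ascend/ascend_config.py`.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from vllm.config import VllmConfig


def _validate_expert_parallel_settings(vllm_config: "VllmConfig",
                                       expert_tensor_parallel_size: int) -> None:
    """Reject mutually exclusive expert parallelism settings (sketch)."""
    # Fail fast with a clear message instead of letting the conflicting
    # settings surface as an obscure error deep inside model execution.
    if (vllm_config.parallel_config.enable_expert_parallel
            and expert_tensor_parallel_size > 1):
        raise RuntimeError(
            "`--enable-expert-parallel` and `expert-tensor-parallel-size` "
            "cannot be used together; please enable only one of them.")
```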
Pull Request Overview
Adds a validation step in check_ascend_config to prevent simultaneous use of --enable-expert-parallel and an expert_tensor_parallel_size greater than 1, avoiding conflicting configurations.
- Import `TYPE_CHECKING` and annotate `vllm_config` with `VllmConfig` for better type safety.
- Introduce a runtime check that raises an error when expert parallelism flags conflict.
- Preserve existing experimental ACL graph warnings.
Comments suppressed due to low confidence (1)
vllm_ascend/ascend_config.py:171
- Consider adding a unit test to verify that enabling both `enable_expert_parallel` and `expert_tensor_parallel_size > 1` reliably raises the intended error.
# for expert parallelism
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| "raise an issue on https://github.com/vllm-project/vllm-ascend/issues" | ||
| " if you encourage any Error") | ||
|
|
||
| # for expert parallelism |
move this validation into check_and_update_config in platform.py?
This function will be called within check_and_update_config
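To illustrate the reply above: `check_ascend_config` is already invoked from the NPU platform's `check_and_update_config`, so the new validation runs during platform config checking without being moved. The snippet below is a simplified, assumed outline of that call chain, not the actual contents of `platform.py`; the exact signatures and surrounding code may differ.

```python
from typing import TYPE_CHECKING

from vllm_ascend.ascend_config import check_ascend_config, init_ascend_config

if TYPE_CHECKING:
    from vllm.config import VllmConfig


class NPUPlatform:  # simplified; the real class derives from vLLM's Platform

    @classmethod
    def check_and_update_config(cls, vllm_config: "VllmConfig") -> None:
        # Build the Ascend config from vllm_config.additional_config, then
        # run the validation (which now includes the expert parallelism check).
        init_ascend_config(vllm_config)
        check_ascend_config(vllm_config,
                            vllm_config.model_config.enforce_eager)
```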
@Yikun @wangxiyuan ready to merge.
Can you update tests/ut/test_ascend_config.py as well?
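A test for the new check could look roughly like the sketch below. It is a guess at the shape of such a test, not the actual contents of `tests/ut/test_ascend_config.py`: the use of `additional_config` to carry `expert_tensor_parallel_size`, the mocked `VllmConfig`, and the `enforce_eager` argument are assumptions.

```python
from unittest.mock import MagicMock

import pytest

from vllm_ascend.ascend_config import check_ascend_config, init_ascend_config


def test_expert_parallel_conflict_raises():
    # Mock a VllmConfig carrying both conflicting settings (assumed layout).
    vllm_config = MagicMock()
    vllm_config.additional_config = {"expert_tensor_parallel_size": 4}
    vllm_config.parallel_config.enable_expert_parallel = True

    # Populate the Ascend config, then expect the validation to reject
    # the mutually exclusive combination.
    init_ascend_config(vllm_config)
    with pytest.raises(Exception):
        check_ascend_config(vllm_config, enforce_eager=False)
```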
This pull request has conflicts; please resolve them before we can evaluate it.
ETP (expert tensor parallel) has been removed already.