[Doc] Add Qwen3-Omni-30B-A3B-Thinking Tutorials #3991
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This PR adds a new tutorial for running Qwen3-Omni-30B-A3B-Thinking on multiple NPUs. The documentation is comprehensive, but I've found a few critical issues in the provided code snippets that would prevent users from successfully running the examples. Specifically, there's an incorrect package name in a pip install command and an incorrect model path in the offline inference script. There is also a typo in the filename of the new document. Please address these issues to ensure the tutorial is accurate and easy to follow.
```bash
# If you already have transformers installed, please update transformer version >= 4.57.0.dev0
# pip install transformer -U
```
The package name in the commented install command is incorrect: the package is `transformers`, so this should be `pip install transformers -U` (the comment above has the same `transformer`/`transformers` typo).
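As a side note, here is a quick way to check whether the installed version meets that requirement; this is only a sketch, and the minimum `4.57.0.dev0` is taken from the tutorial's own comment above:

```python
# Sketch: compare the installed transformers version against the tutorial's
# stated minimum (4.57.0.dev0). Uses the "packaging" package for PEP 440
# version comparison; install it separately if it is not already present.
from importlib.metadata import version
from packaging.version import Version

installed = Version(version("transformers"))
required = Version("4.57.0.dev0")
status = "OK" if installed >= required else "too old, run: pip install -U transformers"
print(f"transformers {installed} (need >= {required}): {status}")
```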
```python
def main():
    MODEL_PATH = "/Qwen/Qwen3-Omni-30B-A3B-Thinking"
```
The MODEL_PATH is set to an absolute path "/Qwen/Qwen3-Omni-30B-A3B-Thinking". This is inconsistent with other tutorials and the online inference command in this same file, which use the model identifier directly. Using an absolute path might cause model loading to fail if the model is not present at that exact location in the container. It's better to use the model identifier and let vLLM handle the download and caching, especially since VLLM_USE_MODELSCOPE=True is set.
```diff
- MODEL_PATH = "/Qwen/Qwen3-Omni-30B-A3B-Thinking"
+ MODEL_PATH = "Qwen/Qwen3-Omni-30B-A3B-Thinking"
```
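For context, a minimal text-only sketch of offline inference with the bare identifier; the parallelism and sequence-length values below are assumptions for illustration, not the tutorial's actual settings:

```python
import os

from vllm import LLM, SamplingParams

# With VLLM_USE_MODELSCOPE=True, vLLM resolves the bare identifier through
# ModelScope and manages the download/cache; no absolute local path is needed.
os.environ["VLLM_USE_MODELSCOPE"] = "True"

MODEL_PATH = "Qwen/Qwen3-Omni-30B-A3B-Thinking"


def main():
    llm = LLM(
        model=MODEL_PATH,
        tensor_parallel_size=4,  # assumed multi-NPU split; adjust to your setup
        max_model_len=8192,      # assumed value for this sketch
        trust_remote_code=True,
    )
    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.generate(["Give a one-sentence summary of what this model can do."], params)
    for out in outputs:
        print(out.outputs[0].text)


if __name__ == "__main__":
    main()
```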
```
@@ -0,0 +1,192 @@
# Multi-NPU (Qwen3-Omni-30B-A3B-Thinking)
```
Force-pushed from 0aab0b3 to 2f48bac
Please follow the change in 5f08e07; we're working on a tutorial refactor now.
Force-pushed from 7b0e7ff to 6824773
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Force-pushed from 6824773 to b12abbc
What this PR does / why we need it?
Add Qwen3-Omni-30B-A3B-Thinking Tutorials
Does this PR introduce any user-facing change?
No
How was this patch tested?