[CI/UT][Graph] Add ut for torchair graph mode #1103
1cd7fd4 to 590b412
vllm_ascend/ops/attention.py
Outdated
num_heads = query.size(1)
block_size = kv_cache.size(1)
latent_kv_dim = kv_cache.size(3) - rope_dim
block_size = kv_cache[0].size(1)
Will Torchair get into this code path?
Yes, with chunked prefill enabled.
But this is another bug; we should fix it later. @MengqingCao will add the test with the ascend scheduler enabled.
I have removed this fix for now because there is still an OOM issue in the chunked prefill scenario.
This PR will focus the UT on the ascend scheduler with chunked prefill disabled. cc @Yikun
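To illustrate the indexing difference in the hunk above, here is a minimal sketch. It assumes the KV cache may arrive either as a single tensor or as a tuple of tensors, which would explain the change from `kv_cache.size(1)` to `kv_cache[0].size(1)`. `FakeTensor` and `get_block_size` are hypothetical stand-ins so the example runs without PyTorch; they are not vllm-ascend code, and the tensor layout in the comments is assumed for illustration only.

```python
from typing import Sequence, Tuple, Union


class FakeTensor:
    """Hypothetical stand-in exposing only a torch.Tensor-like size(dim) call."""

    def __init__(self, shape: Tuple[int, ...]) -> None:
        self.shape = shape

    def size(self, dim: int) -> int:
        return self.shape[dim]


def get_block_size(kv_cache: Union[FakeTensor, Sequence[FakeTensor]]) -> int:
    """Return the size of dimension 1, using the first tensor if given a tuple or list."""
    if isinstance(kv_cache, (tuple, list)):
        # Tuple-of-tensors case: read the dimension from the first tensor.
        return kv_cache[0].size(1)
    # Single-tensor case.
    return kv_cache.size(1)


# Assumed layout for illustration: (num_blocks, block_size, num_kv_heads, head_dim)
single = FakeTensor((4, 128, 1, 576))
paired = (FakeTensor((4, 128, 1, 512)), FakeTensor((4, 128, 1, 64)))
```

Under these assumptions, both `get_block_size(single)` and `get_block_size(paired)` read the same dimension, whereas calling `.size(1)` directly on a tuple would fail.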
This PR is ready for review now. Please take a look, thanks!
* use vllm-ascend/DeepSeek-V3
* use spawn
* update cleanup dist env and mem
* update torchair config
* disable eager
* enable refresh
* use random weight
* some fixes

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Let's merge this to protect DeepSeek graph mode on V1 (with |
"ascend_scheduler_config": {
    "enabled": True,
},
"refresh": True,
Last question: why is refresh needed here?
I merged this first. Feel free to open a new PR to address this if needed.
`refresh` is used here because the LLM has already been set up once in the UTs, so the additional config needs to be refreshed. It is not needed in normal usage.
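As a rough sketch of the configuration discussed in this thread: only `ascend_scheduler_config` and `refresh` appear in the reviewed hunk. The `torchair_graph_config` key and the helper name `build_additional_config` are assumptions added for illustration, not taken from the PR.

```python
from typing import Any, Dict


def build_additional_config(refresh: bool = False) -> Dict[str, Any]:
    """Hypothetical helper: builds the extra config dict for a graph-mode test."""
    config: Dict[str, Any] = {
        # Assumed key: not shown in the reviewed hunk.
        "torchair_graph_config": {"enabled": True},
        # From the hunk: enable the Ascend scheduler.
        "ascend_scheduler_config": {"enabled": True},
    }
    if refresh:
        # Per the thread: needed only when an earlier setup already exists, as in the UTs.
        config["refresh"] = True
    return config


ut_config = build_additional_config(refresh=True)
normal_config = build_additional_config()
```

Here `ut_config` mirrors the test setup with `refresh` set, while `normal_config` omits it, matching the remark that it is not needed in normal usage.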
### What this PR does / why we need it?
Add ut for torchair graph mode on DeepSeekV3

### How was this patch tested?
CI passed with new added test.

---------
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
### What this PR does / why we need it?
Add ut for torchair graph mode on DeepSeekV3

### How was this patch tested?
CI passed with new added test.

---------
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Signed-off-by: wangxiaoxin (A) <wangxiaoxin7@huawei.com>
What this PR does / why we need it?
Add a UT for torchair graph mode on DeepSeekV3.
How was this patch tested?
CI passed with the newly added test.