
MengqingCao
Collaborator

What this PR does / why we need it?

Add a unit test (UT) for torchair graph mode on DeepSeekV3.

How was this patch tested?

CI passed with the newly added test.

MengqingCao force-pushed the torchair branch 7 times, most recently from 1cd7fd4 to 590b412, June 11, 2025 02:21
num_heads = query.size(1)
block_size = kv_cache.size(1)
latent_kv_dim = kv_cache.size(3) - rope_dim
block_size = kv_cache[0].size(1)
Collaborator

Will torchair get into this code path?

Collaborator

Yes, with chunked prefill enabled.

But this is another bug; we should fix it later. @MengqingCao will add the test with the ascend scheduler enabled.

Collaborator Author

I have removed this fix for now because an OOM issue still exists in the chunked prefill scenario.
This PR will focus the UT on the ascend scheduler with chunked prefill disabled. cc @Yikun

Collaborator Author

This PR is ready for review now. Please take a look, thanks!

github-actions bot added the documentation label and removed the module:ops label, Jun 13, 2025
  * use vllm-ascend/DeepSeek-V3
  * use spawn
  * update cleanup dist env and mem
  * update torchair config
  * disable eager
  * enable refresh
  * use random weight
  * some fixes

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Collaborator

Yikun commented Jun 14, 2025

Let's merge this to protect DeepSeek graph mode on V1 (with ascend_scheduler enabled) via the vllm-ascend/DeepSeek-V3-Pruning e2e test.

"ascend_scheduler_config": {
"enabled": True,
},
"refresh": True,
Collaborator

Last question: why is refresh needed here?

Collaborator

I merged this first; feel free to open a new PR to address this if needed.

Collaborator Author

MengqingCao Jun 14, 2025

refresh is used here because an initial LLM setup has already been done in the UTs, so the additional config needs to be refreshed. We don't need this in normal usage.
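To illustrate the reasoning above, here is a minimal sketch of why a config that is cached per process needs an explicit refresh when several tests run in the same process. It is not the vllm-ascend implementation: `_CACHED_CONFIG` and `init_config` are hypothetical names, and the only keys taken from this PR are `ascend_scheduler_config` and `refresh`.

```python
from typing import Optional

# Hypothetical process-wide cache, standing in for a config that is
# initialized once when the first LLM instance is set up.
_CACHED_CONFIG: Optional[dict] = None


def init_config(additional_config: dict) -> dict:
    """Return the cached config; rebuild it only if none exists or "refresh" is set."""
    global _CACHED_CONFIG
    if _CACHED_CONFIG is None or additional_config.get("refresh", False):
        # Drop the "refresh" flag itself; it only controls re-initialization.
        _CACHED_CONFIG = {
            key: value
            for key, value in additional_config.items()
            if key != "refresh"
        }
    return _CACHED_CONFIG


# An earlier test sets up the config without the ascend scheduler.
first = init_config({})

# Without "refresh", a later test in the same process keeps the stale config.
stale = init_config({"ascend_scheduler_config": {"enabled": True}})

# With "refresh", the new additional config takes effect.
fresh = init_config({
    "ascend_scheduler_config": {"enabled": True},
    "refresh": True,
})
```

In a normal deployment the config is set up once per process, so under this sketch the flag would only matter when the setup is repeated, as in a test suite.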

Yikun merged commit a3b5af8 into vllm-project:main Jun 14, 2025
9 checks passed
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 17, 2025
### What this PR does / why we need it?
Add ut for torchair graph mode on DeepSeekV3

### How was this patch tested?
CI passed with new added test.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 17, 2025
### What this PR does / why we need it?
Add ut for torchair graph mode on DeepSeekV3

### How was this patch tested?
CI passed with new added test.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Signed-off-by: wangxiaoxin (A) <wangxiaoxin7@huawei.com>
MengqingCao deleted the torchair branch June 28, 2025 02:00
shiyuan680 pushed a commit to raindaywhu/vllm-ascend that referenced this pull request Jul 7, 2025
Add ut for torchair graph mode on DeepSeekV3

CI passed with new added test.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Labels
documentation, module:tests