Upstream 091 eplb dynamic #1665

Closed: wants to merge 195 commits.

Commits (195)
2a94321
add eplb policy
Jun 10, 2025
e91956f
add eplb updator
Jun 10, 2025
66b3d2e
implementation of VllmEplbAdaptor and D2DExpertWeightLoader
wanghanqingLYT Jun 10, 2025
05a536c
add eplb policy and updator
raindaywhu Jun 10, 2025
24ca412
Merge pull request #39 from raindaywhu/dev_whq_eplb
wanghanqingLYT Jun 10, 2025
86fe2c0
determine num_dense_layers and num_moe_layers by referring to model co…
wanghanqingLYT Jun 10, 2025
caeaf2c
Merge pull request #41 from raindaywhu/dev_whq_eplb
wanghanqingLYT Jun 10, 2025
e68e522
EPLB add eplb_worker
qmkakaxi Jun 10, 2025
f450936
Merge pull request #42 from raindaywhu/dev_mereg_wjh
qmkakaxi Jun 10, 2025
d639144
add ssd loader
qmkakaxi Jun 10, 2025
f1f936b
EPLB moe load collect
qmkakaxi Jun 10, 2025
bd924f2
delete invalid import
qmkakaxi Jun 10, 2025
7e9bb54
Merge pull request #43 from raindaywhu/dev_mereg_wjh
qmkakaxi Jun 10, 2025
afcce8e
fix bugs in fused_experts_with_all2all
wanghanqingLYT Jun 11, 2025
bca9b34
Merge pull request #44 from raindaywhu/dev_whq_eplb
wanghanqingLYT Jun 11, 2025
cc88ea7
add eplb table generator
Jun 11, 2025
f2d0a75
add eplb table generator
raindaywhu Jun 11, 2025
78079a7
Adapt static EPLB
qmkakaxi Jun 11, 2025
22f03db
Merge branch 'master' into br_wjh_eplb
qmkakaxi Jun 11, 2025
485e3d0
add enable_eplb in ascend_config
qmkakaxi Jun 11, 2025
9e5e117
enable_eplb -> dynamic_eplb
qmkakaxi Jun 11, 2025
c28f6cb
add eplb policy
Jun 10, 2025
839cab1
add eplb updator
Jun 10, 2025
6ad801d
implementation of VllmEplbAdaptor and D2DExpertWeightLoader
wanghanqingLYT Jun 10, 2025
34109ac
determine num_dense_layers and num_moe_layers by referring to model co…
wanghanqingLYT Jun 10, 2025
c15f8a8
EPLB add eplb_worker
qmkakaxi Jun 10, 2025
1400ea3
add ssd loader
qmkakaxi Jun 10, 2025
28af393
EPLB moe load collect
qmkakaxi Jun 10, 2025
aa619f4
delete invalid import
qmkakaxi Jun 10, 2025
6fc343c
fix bugs in fused_experts_with_all2all
wanghanqingLYT Jun 11, 2025
b97e066
Adapt static EPLB
qmkakaxi Jun 11, 2025
474a5c3
add eplb table generator
Jun 11, 2025
807348f
add enable_eplb in ascend_config
qmkakaxi Jun 11, 2025
85d29c5
enable_eplb -> dynamic_eplb
qmkakaxi Jun 11, 2025
e4172aa
fix bugs in dynamic eplb
wanghanqingLYT Jun 14, 2025
264a3a5
delete print in fused_moe forward
wanghanqingLYT Jun 14, 2025
7334158
Merge pull request #52 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 14, 2025
a5dd04f
fix bugs caused by variable name old_placemet
wanghanqingLYT Jun 14, 2025
60c87b0
Merge pull request #53 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 14, 2025
241b722
move get_init_expert_map to forward_before
wanghanqingLYT Jun 14, 2025
b888701
Merge pull request #54 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 14, 2025
441508c
fix bug in log2phy in dynamic w8a8
wanghanqingLYT Jun 14, 2025
627757e
Merge pull request #55 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 14, 2025
88bb99d
fix bug for dim of updated_log2phy_map
wanghanqingLYT Jun 14, 2025
3913395
Merge pull request #56 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 14, 2025
dd55ce2
Merge remote-tracking branch 'origin/br_whq_eplb_main' into br_wjh_eplb
qmkakaxi Jun 16, 2025
e86964e
add dynamic_ep alg.
qmkakaxi Jun 16, 2025
9d1893a
fix bugs
qmkakaxi Jun 16, 2025
230fd9c
Merge pull request #57 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 16, 2025
0992bee
fix eplb update log
raindaywhu Jun 16, 2025
91ff797
Merge pull request #59 from raindaywhu/cy_eplb
raindaywhu Jun 16, 2025
0c6210f
fix bugs
qmkakaxi Jun 16, 2025
3b7fd9b
Merge pull request #60 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 16, 2025
9d63949
improve the implementation of communication between main process and eplb …
wanghanqingLYT Jun 16, 2025
0269ef6
Merge pull request #61 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 16, 2025
ae01e08
add compose_expert_update_info_bipartite
qmkakaxi Jun 17, 2025
4b5cd84
adapt compose_expert_update_info_bipartite into eplb process
wanghanqingLYT Jun 17, 2025
74fe5ff
Merge branch 'br_whq_eplb_main' into br_wjh_eplb
wanghanqingLYT Jun 17, 2025
b7bfcc9
Merge pull request #62 from raindaywhu/br_wjh_eplb
wanghanqingLYT Jun 17, 2025
d5dc946
fix bugs
qmkakaxi Jun 16, 2025
86be76f
improve the implementation of communication between main process and eplb …
wanghanqingLYT Jun 16, 2025
03abde3
move generate log2ph map to eplb_worker
raindaywhu Jun 17, 2025
7f77443
fix bugs
qmkakaxi Jun 16, 2025
0b8d00a
improve the implementation of communication between main process and eplb …
wanghanqingLYT Jun 16, 2025
447360f
avoid frequently synchronizing between device and cpu when accessing to …
wanghanqingLYT Jun 17, 2025
acf2aee
Merge pull request #63 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 17, 2025
40ee72f
add gate for calculating moe load
qmkakaxi Jun 17, 2025
baacad8
Merge pull request #64 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 17, 2025
4380fdd
fix log2phy
raindaywhu Jun 17, 2025
556169d
Merge branch 'br_whq_eplb_main' into cy_eplb
raindaywhu Jun 17, 2025
347f60c
Merge branch 'br_whq_eplb_main' into cy_eplb
raindaywhu Jun 17, 2025
49efd9b
fix bugs in expert_map_per_layer_cpu
wanghanqingLYT Jun 17, 2025
352dbca
Merge pull request #66 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 17, 2025
d18aef1
fix log2phy
raindaywhu Jun 17, 2025
d6a76f8
Merge branch 'br_whq_eplb_main' into cy_eplb
raindaywhu Jun 17, 2025
53d0218
fix log2phy
Jun 17, 2025
af10b4a
mv log2phy into eplb worker
raindaywhu Jun 17, 2025
b39b6d2
Merge pull request #65 from raindaywhu/cy_eplb
raindaywhu Jun 17, 2025
1193c97
default 10 turns to wait for worker to finish
raindaywhu Jun 17, 2025
aa1660e
Merge pull request #67 from raindaywhu/cy_eplb
raindaywhu Jun 17, 2025
78b7480
fix bug in compose_expert_update_info_bipartite when adding node
wanghanqingLYT Jun 18, 2025
1d9b011
Merge pull request #68 from raindaywhu/dev_whq_eplb
wanghanqingLYT Jun 18, 2025
8e6b1ee
improve running time in generate_expert_d2d_transfer_task
wanghanqingLYT Jun 18, 2025
6d845f2
Merge pull request #69 from raindaywhu/dev_whq_eplb
wanghanqingLYT Jun 18, 2025
43def8a
add warm up & batch add
qmkakaxi Jun 18, 2025
130bbb9
Merge pull request #70 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 18, 2025
9219cc8
delete layer moe load
qmkakaxi Jun 18, 2025
c600494
Merge pull request #71 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 18, 2025
2125fe0
add get_tok_ids
qmkakaxi Jun 18, 2025
2403b59
Merge pull request #72 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 18, 2025
4bda9ba
Extract cal_moe_load from deepseek_v2
qmkakaxi Jun 18, 2025
2e824cd
Merge pull request #73 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 18, 2025
1b78fb2
running time reduction in forward_before and forward_end
wanghanqingLYT Jun 19, 2025
53728f3
Merge pull request #74 from raindaywhu/dev_whq_eplb
wanghanqingLYT Jun 19, 2025
e4b1ba0
packed update info and put/get
qmkakaxi Jun 19, 2025
1c8edad
Merge pull request #75 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 19, 2025
a9584dd
add get expert workload
Jun 19, 2025
6592d72
fix bug in pack update info
qmkakaxi Jun 19, 2025
e6a3851
Merge pull request #76 from raindaywhu/br_wjh_eplb
qmkakaxi Jun 19, 2025
17fc31e
improve implementation of generate_log2phy_map
wanghanqingLYT Jun 19, 2025
082e82d
Merge pull request #77 from raindaywhu/dev_whq_eplb
wanghanqingLYT Jun 19, 2025
926de75
Merge remote-tracking branch 'vllm_main/main' into br_main_into_eplb
qmkakaxi Jun 20, 2025
22de4ee
fix warm up & change init expert map from file
qmkakaxi Jun 20, 2025
e83f89d
add moe load in worker_v1
qmkakaxi Jun 20, 2025
3604f04
Merge remote-tracking branch 'vllm_main/main' into br_main_into_eplb
qmkakaxi Jun 20, 2025
2484055
Merge pull request #78 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 20, 2025
1a21f30
fix warm up bugs
qmkakaxi Jun 20, 2025
38c1234
Merge pull request #79 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 20, 2025
051f77a
fix log2phy bug
qmkakaxi Jun 20, 2025
7b6b474
Merge pull request #80 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 20, 2025
d0c98c9
fix bugs: batch_isend_irecv synchronization and dtype bug in log2phy
wanghanqingLYT Jun 20, 2025
6226dee
Merge pull request #81 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 20, 2025
9295a9c
add another check for new placement generated by eplb algorithm
wanghanqingLYT Jun 21, 2025
67fa706
add dynamic_ep_v2
qmkakaxi Jun 21, 2025
2ccda78
Merge pull request #83 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 21, 2025
7a11221
Merge pull request #82 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 21, 2025
771d4c7
fix dummy_run and profile_run
Jun 21, 2025
ff1076f
Merge pull request #84 from raindaywhu/cy_br_main_into_eplb
raindaywhu Jun 21, 2025
da27c2d
add mock experts_load data
Jun 21, 2025
70a922e
fix bugs in get_init_expert_map_from_file
wanghanqingLYT Jun 21, 2025
89f4376
Merge pull request #86 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 21, 2025
af31373
fix bug in init expert_map_per_layer_cpu
wanghanqingLYT Jun 21, 2025
5751c27
Merge pull request #87 from raindaywhu/dev_whq_eplb2
wanghanqingLYT Jun 21, 2025
613c030
add gate_eplb
qmkakaxi Jun 21, 2025
a2505fb
Merge pull request #88 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 21, 2025
057a297
get_init_experts_map in warm up
qmkakaxi Jun 21, 2025
62108d7
Merge pull request #89 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 21, 2025
9f498e9
add update_expert_load_statistical_period logic
Jun 21, 2025
11936d5
add generate expert map
qmkakaxi Jun 21, 2025
e83afa5
Merge remote-tracking branch 'origin/br_main_into_eplb' into br_main_…
qmkakaxi Jun 21, 2025
8907c9c
Merge branch 'br_main_into_eplb' into lt_dev
Jun 21, 2025
d0e8104
add generate_expert_map_all
qmkakaxi Jun 21, 2025
0c8318c
generate expert map
qmkakaxi Jun 21, 2025
ab4bfd2
init expert map
qmkakaxi Jun 21, 2025
adaed7b
Merge pull request #90 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 21, 2025
e6e25f3
fix bugs in get_update_iteration
qmkakaxi Jun 21, 2025
0cfd62c
Merge pull request #91 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 21, 2025
12f0c44
Merge branch 'br_main_into_eplb' into lt_dev
Jun 21, 2025
353150e
fix bug in get_init_expert_map_from_file
qmkakaxi Jun 21, 2025
43d4b87
Merge pull request #92 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 21, 2025
f6830d4
update policy = 6
Jun 22, 2025
041e141
add load_gather_iteration
raindaywhu Jun 22, 2025
f4f9fd7
add code to guarantee there is no expert movement inside an NPU
wanghanqingLYT Jun 22, 2025
7371294
Merge pull request #93 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 22, 2025
5c09eab
Add logging
Jun 22, 2025
1e6b2c6
Merge branch 'lt_dev' of https://github.com/raindaywhu/vllm-ascend in…
Jun 22, 2025
017e0aa
Update policy_factory.py
wanghanqingLYT Jun 22, 2025
976eb9f
Merge pull request #94 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 22, 2025
d537fb2
update
Jun 22, 2025
1a8d238
Merge pull request #85 from raindaywhu/lt_dev
raindaywhu Jun 22, 2025
83f2d51
Merge branch 'br_main_into_eplb' of https://github.com/raindaywhu/vll…
Jun 22, 2025
9e2cca1
dummy run does not add moe load
qmkakaxi Jun 22, 2025
5d1ce50
Merge pull request #95 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 22, 2025
edb38e4
fix bug in compute moe load
qmkakaxi Jun 22, 2025
6bbdb15
Merge pull request #96 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 22, 2025
8b31e79
fix bugs in forward_end
qmkakaxi Jun 22, 2025
5225f3c
Merge pull request #97 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 22, 2025
2dba24d
Merge branch 'br_main_into_eplb' of https://github.com/raindaywhu/vll…
Jun 23, 2025
d4d0716
fix conflict
Jun 23, 2025
53e8949
fix some bugs
Jun 23, 2025
98b9383
fix precision by fixing a wrong branch condition in w8a8_dynamic.py
wanghanqingLYT Jun 23, 2025
a3544ce
Merge pull request #98 from raindaywhu/dev_whq_eplb3
wanghanqingLYT Jun 23, 2025
45766f6
fix code format alignment
Jun 23, 2025
6b36faf
update format
Jun 23, 2025
1a067a3
fix indentation for function forward_end in eplb_updator.py
wanghanqingLYT Jun 23, 2025
fc88c4b
Merge pull request #100 from raindaywhu/dev_whq_eplb3
wanghanqingLYT Jun 23, 2025
9c329ed
optimize moe load calculation
qmkakaxi Jun 24, 2025
0897ccc
Merge pull request #101 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 24, 2025
4980f2c
fix bug in moe load & add expert load to json
qmkakaxi Jun 24, 2025
96fe998
Merge pull request #102 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 24, 2025
da49def
merge from remote main
Jun 24, 2025
9d9c93a
update get_expert_load return type
Jun 24, 2025
162d106
fix bug when running benchmark by moving forward_before behind return o…
wanghanqingLYT Jun 25, 2025
c57611c
Merge pull request #103 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jun 25, 2025
1f0b980
fix SwiftBalancer eplb algo
Jun 26, 2025
bfa07cf
Merge pull request #104 from raindaywhu/new_dev_main_cy
raindaywhu Jun 26, 2025
e7b7186
update get_expert_load logic
Jun 27, 2025
d018ec8
fix get_expert_load
qmkakaxi Jun 27, 2025
6a0a05e
delete invalid print
qmkakaxi Jun 27, 2025
1547810
delete empty tensor judgement
Jun 27, 2025
1b7b87b
Merge pull request #105 from raindaywhu/br_main_into_eplb_wjh
qmkakaxi Jun 27, 2025
969751a
merge from remote default branch and fix conflict
Jun 27, 2025
b0e68f7
merge default branch and fix conflict
Jun 27, 2025
3465ad6
relocate the code from the worker_runner to the server side.
Jun 28, 2025
0bab2cd
Merge pull request #99 from raindaywhu/lt_expert_load
raindaywhu Jun 28, 2025
ad5e7e1
collect moe load after dispatch
wanghanqingLYT Jun 30, 2025
e4cba5e
Merge branch 'br_main_into_eplb' into dev_whq_eplb2
wanghanqingLYT Jun 30, 2025
75992b9
Merge pull request #106 from raindaywhu/dev_whq_eplb2
wanghanqingLYT Jun 30, 2025
89bcf04
modify serialization of eplb process
wanghanqingLYT Jul 1, 2025
cfbe8b1
Merge pull request #107 from raindaywhu/dev_whq_eplb2
wanghanqingLYT Jul 2, 2025
2b62a47
improve d2d expert weight update impl in eplb_updator.py
wanghanqingLYT Jul 3, 2025
d79ace8
Merge pull request #108 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jul 4, 2025
9b32ca4
add function take_update_info_from_eplb_process
wanghanqingLYT Jul 7, 2025
0a5b075
Merge pull request #109 from raindaywhu/dev_whq_eplb1
wanghanqingLYT Jul 7, 2025
925914d
update
Jul 8, 2025
188 changes: 162 additions & 26 deletions .github/workflows/accuracy_test.yaml
@@ -22,6 +22,9 @@
name: Benchmarks / accuracy

on:
schedule:
# Runs every 6 hours
- cron: '0 */6 * * *'
pull_request:
types: [ labeled ]
workflow_dispatch:
@@ -34,6 +37,7 @@ on:
# Current supported vLLM versions
options:
- main
- v0.9.2
- v0.9.1
- v0.7.3
vllm-ascend-version:
@@ -42,16 +46,17 @@
type: choice
options:
- main
- v0.9.1-dev
- v0.7.3-dev
models:
description: 'model:'
required: true
type: choice
options:
- all
- Qwen/Qwen2.5-7B-Instruct
- Qwen/Qwen2.5-VL-7B-Instruct
- Qwen/Qwen3-8B-Base
- Qwen/Qwen3-30B-A3B
default: 'all'

# Bash shells do not use ~/.profile or ~/.bashrc so these shells need to be explicitly
@@ -73,56 +78,57 @@ jobs:
${{
(contains(github.event.pull_request.labels.*.name, 'accuracy-test') ||
contains(github.event.pull_request.labels.*.name, 'vl-accuracy-test') ||
contains(github.event.pull_request.labels.*.name, 'moe-accuracy-test') ||
contains(github.event.pull_request.labels.*.name, 'dense-accuracy-test')) &&
contains(github.event.pull_request.labels.*.name, 'ready-for-test') ||
github.event_name == 'workflow_dispatch'
github.event_name == 'workflow_dispatch' || github.event_name == 'schedule'
}}
runs-on: >-
${{
(matrix.model_name == 'Qwen/Qwen2.5-VL-7B-Instruct' && 'linux-arm64-npu-4') ||
(matrix.model_name == 'Qwen/Qwen3-30B-A3B' && 'linux-arm64-npu-4') ||
'linux-arm64-npu-2'
}}
strategy:
matrix:
vllm_use_version: [0, 1]
vllm_use_version: [1]
# the accuracy test will run:
# 1. workflow_dispatch with models input
# - all: Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen3-8B-Base
# - specified but not all: Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen3-8B-Base
# - all: Qwen/Qwen3-30B-A3B, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen3-8B-Base
# - specified but not all: Qwen/Qwen3-30B-A3B, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen3-8B-Base
# 2. PR labeled with "*-accuracy-test"
# - accuracy-test: Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct
# - dense-accuracy-test: Qwen/Qwen2.5-7B-Instruct
# - accuracy-test: Qwen/Qwen3-8B-Base, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen3-30B-A3B
# - dense-accuracy-test: Qwen/Qwen3-8B-Base
# - vl-accuracy-test: Qwen/Qwen2.5-VL-7B-Instruct
# - moe-accuracy-test: Qwen/Qwen3-30B-A3B
model_name: ${{ fromJSON(
(github.event_name == 'schedule' &&
'["Qwen/Qwen3-30B-A3B","Qwen/Qwen2.5-VL-7B-Instruct","Qwen/Qwen3-8B-Base"]') ||
(github.event.inputs.models == 'all' &&
'["Qwen/Qwen2.5-7B-Instruct","Qwen/Qwen2.5-VL-7B-Instruct","Qwen/Qwen3-8B-Base"]') ||
(github.event.inputs.models == 'Qwen/Qwen2.5-7B-Instruct' &&
'["Qwen/Qwen2.5-7B-Instruct"]') ||
'["Qwen/Qwen3-30B-A3B","Qwen/Qwen2.5-VL-7B-Instruct","Qwen/Qwen3-8B-Base"]') ||
(github.event.inputs.models == 'Qwen/Qwen3-30B-A3B' &&
'["Qwen/Qwen3-30B-A3B"]') ||
(github.event.inputs.models == 'Qwen/Qwen2.5-VL-7B-Instruct' &&
'["Qwen/Qwen2.5-VL-7B-Instruct"]') ||
(github.event.inputs.models == 'Qwen/Qwen3-8B-Base' &&
'["Qwen/Qwen3-8B-Base"]') ||
contains(github.event.pull_request.labels.*.name, 'accuracy-test') &&
'["Qwen/Qwen2.5-7B-Instruct","Qwen/Qwen2.5-VL-7B-Instruct"]' ||
'["Qwen/Qwen3-8B-Base","Qwen/Qwen2.5-VL-7B-Instruct", "Qwen/Qwen3-30B-A3B"]' ||
contains(github.event.pull_request.labels.*.name, 'dense-accuracy-test') &&
'["Qwen/Qwen2.5-7B-Instruct"]' ||
'["Qwen/Qwen3-8B-Base"]' ||
contains(github.event.pull_request.labels.*.name, 'vl-accuracy-test') &&
'["Qwen/Qwen2.5-VL-7B-Instruct"]'
'["Qwen/Qwen2.5-VL-7B-Instruct"]' ||
contains(github.event.pull_request.labels.*.name, 'moe-accuracy-test') &&
'["Qwen/Qwen3-30B-A3B"]'
) }}
# Remove exclude after https://github.com/vllm-project/vllm-ascend/issues/1044 resolved
exclude:
- model_name: Qwen/Qwen2.5-VL-7B-Instruct
vllm_use_version: 1

fail-fast: false
name: ${{ matrix.model_name }} accuracy V${{ matrix.vllm_use_version }}
container:
image: m.daocloud.io/quay.io/ascend/cann:8.1.rc1-910b-ubuntu22.04-py3.10
env:
HF_ENDPOINT: https://hf-mirror.com
HF_TOKEN: ${{ secrets.HF_TOKEN }}
DATASET_SOURCE: ModelScope
VLLM_USE_MODELSCOPE: True
USE_MODELSCOPE_HUB: 1
# 1. If version specified (work_dispatch), do specified branch accuracy test
# 2. If no version (labeled PR), do accuracy test by default ref:
# The branch, tag or SHA to checkout. When checking out the repository that
@@ -158,7 +164,7 @@ jobs:
repository: vllm-project/vllm
path: ./vllm-empty
# Please also update this when bump matched version
ref: ${{ github.event.inputs.vllm-version || 'v0.9.1' }}
ref: ${{ github.event.inputs.vllm-version || 'v0.9.2' }}

- name: Install vllm-project/vllm from source
working-directory: ./vllm-empty
@@ -177,11 +183,28 @@
PIP_EXTRA_INDEX_URL: https://mirrors.huaweicloud.com/ascend/repos/pypi
run: |
pip install -r requirements-dev.txt
pip install -e .
pip install -v -e .

- name: Get vLLM commit hash and URL
working-directory: ./vllm-empty
run: |
VLLM_COMMIT=$(git rev-parse --short=7 HEAD)
echo "VLLM_COMMIT=$VLLM_COMMIT" >> $GITHUB_ENV

- name: Get vLLM-Ascend commit hash and URL
working-directory: ./vllm-ascend
run: |
VLLM_ASCEND_COMMIT=$(git rev-parse --short=7 HEAD)
echo "VLLM_ASCEND_COMMIT=$VLLM_ASCEND_COMMIT" >> $GITHUB_ENV

- name: Print resolved hashes
run: |
echo "vLLM : ${{ env.VLLM_COMMIT }}"
echo "vLLM-Ascend: ${{ env.VLLM_ASCEND_COMMIT }}"

- name: Install lm-eval, ray, and datasets
run: |
pip install lm-eval
pip install lm-eval==0.4.8

- name: Collect version info
run: |
@@ -233,7 +256,10 @@ jobs:
--cann_version "${{ env.GHA_CANN_VERSION }}" \
--torch_npu_version "${{ env.GHA_TORCH_NPU_VERSION }}" \
--torch_version "${{ env.GHA_TORCH_VERSION }}" \
--vllm_version "${{ env.GHA_VLLM_VERSION }}"
--vllm_version "${{ env.GHA_VLLM_VERSION }}" \
--vllm_commit "${{ env.VLLM_COMMIT }}" \
--vllm_ascend_commit "${{ env.VLLM_ASCEND_COMMIT }}" \
--vllm_use_v1 "$VLLM_USE_V1"

- name: Generate step summary
if: ${{ always() }}
@@ -245,12 +271,122 @@
SAFE_VLLM_ASCEND_VERSION="${GHA_VLLM_ASCEND_VERSION//\//-}"
echo "SAFE_VLLM_ASCEND_VERSION=$SAFE_VLLM_ASCEND_VERSION" >> "$GITHUB_ENV"

- name: Check report first line for failure
id: check_report
run: |
REPORT_PATH="./benchmarks/accuracy/${{ steps.report.outputs.markdown_name }}.md"
echo "Scanning $REPORT_PATH for ❌ …"
if grep -q '❌' "$REPORT_PATH"; then
echo "contains_fail=true" >> $GITHUB_OUTPUT
else
echo "contains_fail=false" >> $GITHUB_OUTPUT
fi

- name: Upload Report for V${{ matrix.vllm_use_version }}
if: ${{ github.event_name == 'workflow_dispatch' }}
if: ${{ github.event_name == 'workflow_dispatch' && steps.check_report.outputs.contains_fail == 'false' }}
uses: actions/upload-artifact@v4
with:
name: "${{ env.SAFE_VLLM_ASCEND_VERSION }}-${{ steps.report.outputs.markdown_name }}-report"
name: "report-${{ env.SAFE_VLLM_ASCEND_VERSION }}-${{ steps.report.outputs.markdown_name }}"
path: ./benchmarks/accuracy/${{ steps.report.outputs.markdown_name }}.md
if-no-files-found: warn
retention-days: 90
overwrite: true

create_pr:
runs-on: ubuntu-latest
needs: accuracy_tests
if: ${{ github.event_name == 'workflow_dispatch' }}
env:
UPSTREAM_REPO: vllm-project/vllm-ascend
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
repository: vllm-ascend-ci/vllm-ascend
token: ${{ secrets.PAT_TOKEN }}
ref: main

- name: Add upstream remote
run: |
git remote add upstream https://github.com/${{ env.UPSTREAM_REPO }}.git
git fetch upstream
git remote -v

- name: Set Git user info dynamically
run: |
git config user.name "${{ github.actor }}"
git config user.email "${{ github.actor }}@users.noreply.github.com"

- name: Create or switch to branch
run: |
TIMESTAMP=$(date +%Y%m%d%H%M%S)
BRANCH_NAME="auto-pr/accuracy-report-${TIMESTAMP}"
echo "BRANCH_NAME=${BRANCH_NAME}" >> $GITHUB_ENV
git checkout -B "${BRANCH_NAME}" upstream/${{ github.event.inputs.vllm-ascend-version }}

- name: Download only current run reports
uses: actions/download-artifact@v4
with:
path: ./docs/source/developer_guide/evaluation/accuracy_report
pattern: report-*
github-token: ${{ secrets.GITHUB_TOKEN }}
run-id: ${{ github.run_id }}

- name: Delete old report
run: |
find ./docs/source/developer_guide/evaluation/accuracy_report -maxdepth 1 -type f -name '*.md' ! -name 'index.md' -delete
find ./docs/source/developer_guide/evaluation/accuracy_report -mindepth 2 -type f -name '*.md' -exec mv -f {} ./docs/source/developer_guide/evaluation/accuracy_report \;
find ./docs/source/developer_guide/evaluation/accuracy_report -mindepth 1 -type d -empty -delete

- name: Update accuracy_report/index.md
run: |
REPORT_DIR="./docs/source/developer_guide/evaluation/accuracy_report"
INDEX_MD="$REPORT_DIR/index.md"
{
echo "# Accuracy Report"
echo ""
echo ":::{toctree}"
echo ":caption: Accuracy Report"
echo ":maxdepth: 1"

for report in "$REPORT_DIR"/*.md; do
filename="$(basename "$report" .md)"
if [ "$filename" != "index" ]; then
echo "$filename"
fi
done
echo ":::"
} > "$INDEX_MD"

- name: push accuracy report
env:
GITHUB_TOKEN: ${{ secrets.PAT_TOKEN }}
run: |
git add ./docs/source/developer_guide/evaluation/accuracy_report/*.md
git commit -s -m "[Doc] Update accuracy reports for ${{ github.event.inputs.vllm-ascend-version }}"
git push -f origin "${{ env.BRANCH_NAME }}"

- name: Create PR in upstream via API
uses: actions/github-script@v7
with:
github-token: ${{ secrets.PAT_TOKEN }}
script: |
const pr = await github.rest.pulls.create({
owner: 'vllm-project',
repo: 'vllm-ascend',
head: `vllm-ascend-ci:${{ env.BRANCH_NAME }}`,
base: '${{ github.event.inputs.vllm-ascend-version }}',
title: `[Doc] Update accuracy reports for ${{ github.event.inputs.vllm-ascend-version }}`,
body: `The accuracy results running on NPU Atlas A2 have changed, updating reports for:
${{
github.event.inputs.models == 'all'
&& 'All models (Qwen/Qwen3-30B-A3B, Qwen2.5-VL-7B-Instruct, Qwen3-8B-Base)'
|| github.event.inputs.models
}}

- [Workflow run][1]

[1]: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}`
});
core.info(`Created PR #${pr.data.number}`);
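
Two shell idioms in this workflow are easy to misread, so here is a minimal standalone sketch of both (values below are invented examples, not taken from this PR): the "${VAR//\//-}" expansion replaces every slash so branch names become artifact-safe, and the grep check is what gates the report upload.

# Illustrative only; input values are assumptions, not CI values.
GHA_VLLM_ASCEND_VERSION="feature/my-branch"
SAFE_VLLM_ASCEND_VERSION="${GHA_VLLM_ASCEND_VERSION//\//-}"  # every "/" becomes "-"
echo "$SAFE_VLLM_ASCEND_VERSION"                             # prints: feature-my-branch

# Same failure gate as the check_report step, pointed at a local report file.
REPORT_PATH="./benchmarks/accuracy/report.md"                # placeholder path
if grep -q '❌' "$REPORT_PATH"; then
  echo "contains_fail=true"
else
  echo "contains_fail=false"
fi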

33 changes: 33 additions & 0 deletions .github/workflows/doc_codespell.yaml
@@ -0,0 +1,33 @@

name: 'doc-codespell'

on:
pull_request:
branches:
- 'main'
- '*-dev'
paths:
- 'docs/**'

jobs:
codespell:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.10"]
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements-lint.txt
- name: Run codespell check
run: |
CODESPELL_EXCLUDES=('--skip' 'tests/prompts/**,./benchmarks/sonnet.txt,*tests/lora/data/**,build/**,./vllm_ascend.egg-info/**')
CODESPELL_IGNORE_WORDS=('-L' 'CANN,cann,NNAL,nnal,ASCEND,ascend,EnQue,CopyIn,assertIn,rever')

codespell --toml pyproject.toml "${CODESPELL_EXCLUDES[@]}" "${CODESPELL_IGNORE_WORDS[@]}"
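
This check can be reproduced locally before pushing a docs change. A sketch, assuming requirements-lint.txt pins codespell as the workflow implies; the skip list and ignore words are copied from the job above:

python -m pip install --upgrade pip
pip install -r requirements-lint.txt
codespell --toml pyproject.toml \
  --skip 'tests/prompts/**,./benchmarks/sonnet.txt,*tests/lora/data/**,build/**,./vllm_ascend.egg-info/**' \
  -L 'CANN,cann,NNAL,nnal,ASCEND,ascend,EnQue,CopyIn,assertIn,rever'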
63 changes: 63 additions & 0 deletions .github/workflows/format_pr_body.yaml
@@ -0,0 +1,63 @@
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This file is a part of the vllm-ascend project.
#

name: format / pr body

on:
# The PR updated when PR opened and push new commits
pull_request_target:
types: [opened, synchronize]
branches:
- 'main'

permissions:
pull-requests: write

jobs:
update-description:
name: update vLLM version
runs-on: ubuntu-latest

steps:
- name: Checkout vllm-project/vllm repo
uses: actions/checkout@v4
with:
repository: vllm-project/vllm
path: ./vllm-empty

- name: Get vLLM version
working-directory: ./vllm-empty
run: |
VLLM_COMMIT=$(git rev-parse HEAD)
echo "VLLM_COMMIT=https://github.com/vllm-project/vllm/commit/$VLLM_COMMIT" >> $GITHUB_ENV

- name: Checkout repository
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

- name: Set up Python
uses: actions/setup-python@42375524e23c412d93fb67b49958b491fce71c38 # v5.4.0

- name: Get vLLM release version
run: |
VLLM_VERSION=$(python3 docs/source/conf.py | jq .vllm_version | tr -d '"')
echo "VLLM_VERSION=$VLLM_VERSION" >> $GITHUB_ENV

- name: Update PR description
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
bash .github/format_pr_body.sh "${{ github.event.number }}" "${{ env.VLLM_VERSION }}" "${{ env.VLLM_COMMIT }}"
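
For a local dry run of this last step, the three arguments can be assembled by hand. A sketch, assuming a vllm checkout in ./vllm-empty as in the workflow and that the script tolerates running outside CI; 1665 is this PR's number:

# Run from the vllm-ascend repository root.
VLLM_COMMIT="https://github.com/vllm-project/vllm/commit/$(git -C ./vllm-empty rev-parse HEAD)"
VLLM_VERSION=$(python3 docs/source/conf.py | jq .vllm_version | tr -d '"')
bash .github/format_pr_body.sh "1665" "$VLLM_VERSION" "$VLLM_COMMIT"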