SwiftBalancer Zero OverHead Expert Movement #1855
Conversation
@@ -37,6 +37,7 @@ def __init__(self, vllm_config):
            ascend_scheduler_config)

        self.expert_map_path = additional_config.get("expert_map_path", None)
        self.dynamic_eplb = additional_config.get("dynamic_eplb", False)
Can we use the vLLM EPLB config enable_eplb instead of adding a new config?
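For context, a minimal usage sketch of how these options might be supplied, assuming vLLM's additional_config engine argument; the key names come from the diff above, while the model name and map path are placeholders:

```python
# Hedged sketch: enabling dynamic EPLB through additional_config.
# "expert_map_path" and "dynamic_eplb" are the keys read in the diff above;
# the model name and file path are illustrative placeholders.
from vllm import LLM

llm = LLM(
    model="deepseek-ai/DeepSeek-V2-Lite",
    additional_config={
        "expert_map_path": "/path/to/expert_map.json",  # optional static placement
        "dynamic_eplb": True,  # enable dynamic expert load balancing
    },
)
```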
from abc import ABC, abstractmethod


class EplbAdaptor():
What is this abstract class used for?
It is an abstraction over SGLang/vLLM.
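To make that intent concrete, here is a minimal sketch of such an adaptor abstraction; the method names and the VllmEplbAdaptor body are illustrative assumptions, not the PR's final API:

```python
from abc import ABC, abstractmethod


class EplbAdaptor(ABC):
    """Hides the serving engine (vLLM, SGLang, ...) behind one interface."""

    @abstractmethod
    def get_expert_map(self, layer_id: int):
        """Return the current expert placement for one MoE layer."""

    @abstractmethod
    def do_update_expert_map(self, layer_id: int, new_map):
        """Apply a rebalanced placement to one MoE layer."""


class VllmEplbAdaptor(EplbAdaptor):
    """vLLM-specific implementation; an SGLang adaptor would mirror it."""

    def __init__(self, model):
        self.model = model

    def get_expert_map(self, layer_id: int):
        return self.model.get_expert_map(layer_id)

    def do_update_expert_map(self, layer_id: int, new_map):
        raise NotImplementedError  # engine-specific weight movement
```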
@@ -773,6 +775,32 @@ def load_weights(self, weights: Iterable[tuple[str,

        return loaded_params

    def get_expert_map(self, layer_id):
vLLM has the MixtureOfExperts interface; once we contribute this to vLLM, these functions should be moved there.
Also, what about the Qwen MoE model?
Qwen MoE is being tested now and will be submitted in another PR.
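A hedged sketch of what such a per-layer accessor can look like on the model side; the attribute names are assumptions:

```python
import torch


def get_expert_map(self, layer_id: int) -> torch.Tensor:
    """Return the logical->physical expert placement tensor for one layer.

    Sketch of a model method only: assumes the model keeps its MoE layers
    in self.moe_layers and each layer exposes an expert_map tensor.
    """
    return self.moe_layers[layer_id].expert_map
```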
    for name in self.expert_weight_names]
)

# def collect_topk_ids(self, dummy_run=False):
remove the commented-out code
done
class DynamicTable:
    # workload_table:
    # 3-D matrix, [layer, gpus, experts_per_gpu_per_layer] -> value: hotness of the expert at that position
use English
done
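A sketch of the workload table that comment describes, with illustrative method names; the hotness values are modeled as routed-token counters:

```python
import torch


class DynamicTable:
    """Per-slot hotness counters: [num_layers, num_gpus, experts_per_gpu]."""

    def __init__(self, num_layers: int, num_gpus: int, experts_per_gpu: int):
        # value at [layer, gpu, slot] = tokens routed to that expert slot
        self.workload_table = torch.zeros(
            (num_layers, num_gpus, experts_per_gpu), dtype=torch.int64)

    def record(self, layer: int, gpu: int, slot: int, tokens: int) -> None:
        self.workload_table[layer, gpu, slot] += tokens

    def hottest_slot(self, layer: int) -> tuple[int, int]:
        # flatten [gpu, slot] and recover the 2-D index of the max entry
        flat = int(torch.argmax(self.workload_table[layer]))
        return divmod(flat, self.workload_table.shape[2])
```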
vllm_ascend/eplb/tool/eplb_utils.py
Outdated
import torch
import random


class ExpertMapUtils():
Using a class here is meaningless; plain module-level functions would do.
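The suggestion, sketched as a module-level function instead of a stateless utility class; the round-robin body is an illustrative assumption, not the PR's actual mapping logic:

```python
import torch


def generate_expert_map(num_gpus: int, num_experts: int) -> torch.Tensor:
    """Round-robin logical experts across GPUs; -1 marks experts not hosted."""
    experts_per_gpu = num_experts // num_gpus
    expert_map = torch.full((num_gpus, num_experts), -1, dtype=torch.int64)
    for gpu in range(num_gpus):
        start = gpu * experts_per_gpu
        # local slot indices 0..experts_per_gpu-1 for this GPU's experts
        expert_map[gpu, start:start + experts_per_gpu] = torch.arange(experts_per_gpu)
    return expert_map
```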
@@ -0,0 +1,65 @@
import numpy as np
move this file to the examples folder
done
vllm_ascend/eplb/tool/eplb_utils.py
Outdated
@@ -0,0 +1,114 @@
#
remove the tool folder
done
@@ -0,0 +1,408 @@
#
The worker module has only one file; I think the module is unnecessary.
the worker module has been removed
@@ -0,0 +1,39 @@
#
vllm_ascend/eplb/__init__.py is missing
fixed
    return list(zip(send_all, recv_all, maps, log2phy_all, layer_ids))


class EplbProcess:
What will happen if the EplbProcess goes down in a worker?
EPLB will not update anymore; however, forwarding continues.
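That failure mode can be made concrete with a sketch: the balancer runs as a daemon process, and if it dies the worker simply keeps serving with the last applied expert map. Names and structure here are assumptions, not the PR's code:

```python
import multiprocessing as mp


def _eplb_loop(queue) -> None:
    while True:
        workload = queue.get()  # block until the worker sends fresh stats
        # ... compute a new expert placement and publish it back ...


class EplbProcess:
    def __init__(self) -> None:
        self.queue = mp.Queue()
        self.proc = mp.Process(target=_eplb_loop, args=(self.queue,), daemon=True)
        self.proc.start()

    def submit(self, workload) -> None:
        if self.proc.is_alive():
            self.queue.put(workload)
        # else: the balancer is down, so rebalancing stops,
        # but the worker's forward pass is unaffected
```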
Force-pushed from af52d72 to c66e4ce
Force-pushed from c66e4ce to b3417f6
Merge 'whq-v091-new' of https://github.com/raindaywhu/vllm-ascend into whq-v091-new: simplify eplb policy
add swift balancer doc
address review comments
# Conflicts: # vllm_ascend/eplb/tool/eplb_utils.py
fix commits
fix import path
fix import
… into whq-v091-new * 'whq-v091-new' of https://github.com/845473182/vllm-ascend: fix import; fix param bug; fix param bug; fix registration reference error; fix registration reference error
fix lint errors
Signed-off-by: raindaywhu <raindaywhu@163.com>
What this PR does / why we need it?
#### Dynamic Experts load balance for MoE LLM Models
Co-authored-by: wanghanqingLYT <hqwang12345@sina.com>
Co-authored-by: njuyuan <yuanjl19@smail.nju.edu.cn>
Co-authored-by: qmkakaxi <wjh1594260677@qq.com>
Co-authored-by: Skywalker-EP <173723846@qq.com>
Co-authored-by: ZhengWG <zwg0606@gmail.com>
Co-authored-by: GuoXiYuan <496444320@qq.com>
Co-authored-by: zyy-hw <zhangyuanyun@huawei.com>
Co-authored-by: ltdo111 <1061328217@qq.com>
Fix commits ci of pr #1855
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
---------
Signed-off-by: raindaywhu <raindaywhu@163.com>
Signed-off-by: wanghanqingLYT <wanghanqing3@huawei.com>
Co-authored-by: wanghanqingLYT <wanghanqing3@huawei.com>
What this PR does / why we need it?
Dynamic expert load balancing for MoE LLM models.
Does this PR introduce any user-facing change?
How was this patch tested?