Commit 30cfb79

remove chunked_prefill_for_mla related content
Signed-off-by: whx-sjtu <2952154980@qq.com>
1 parent 27dc04d commit 30cfb79

3 files changed: +0 -6 lines changed

docs/source/locale/zh_CN/LC_MESSAGES/user_guide/configuration/additional_config.po

Lines changed: 0 additions & 3 deletions
@@ -148,9 +148,6 @@ msgid ""
 " to be passed in."
 msgstr "在为MOE模型使用专家负载均衡时,需要传入专家映射路径。"

-#: ../../user_guide/configuration/additional_config.md
-msgid "`chunked_prefill_for_mla`"
-msgstr "`chunked_prefill_for_mla`"

 #: ../../user_guide/configuration/additional_config.md
 msgid "`False`"

docs/source/user_guide/configuration/additional_config.md

Lines changed: 0 additions & 1 deletion
@@ -30,7 +30,6 @@ The following table lists the additional configuration options available in vLLM
 | `ascend_scheduler_config` | dict | `{}` | The config options for ascend scheduler |
 | `refresh` | bool | `false` | Whether to refresh global ascend config content. This value is usually used by rlhf or ut/e2e test case. |
 | `expert_map_path` | str | `None` | When using expert load balancing for the MOE model, an expert map path needs to be passed in. |
-| `chunked_prefill_for_mla` | bool | `False` | Whether to enable the fused operator-like chunked_prefill. |
 | `kv_cache_dtype` | str | `None` | When using the kv cache quantization method, kv cache dtype needs to be set, currently only int8 is supported. |

 The details of each config option are as follows:

vllm_ascend/ascend_config.py

Lines changed: 0 additions & 2 deletions
@@ -45,8 +45,6 @@ def __init__(self, vllm_config):
             ascend_scheduler_config)

         self.expert_map_path = additional_config.get("expert_map_path", None)
-        self.chunked_prefill_for_mla = additional_config.get(
-            "chunked_prefill_for_mla", False)


 class TorchairGraphConfig:
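
The hunk above reads options with `dict.get` and a default. The sketch below shows what that pattern implies once the `chunked_prefill_for_mla` line is gone. It is a minimal illustration, not the project's code: `AscendConfigSketch` is a hypothetical stand-in class, and it assumes no other validation of `additional_config` keys takes place, which this diff does not show either way.

```python
class AscendConfigSketch:
    """Hypothetical stand-in for the config class touched by this commit."""

    def __init__(self, additional_config=None):
        # Treat a missing dict as empty so .get() always works.
        additional_config = additional_config or {}
        # Same lookup pattern as the context line kept in the diff.
        self.expert_map_path = additional_config.get("expert_map_path", None)
        # After this commit there is no lookup for "chunked_prefill_for_mla",
        # so that key is simply never read by this constructor.


# A caller that still passes the removed option (assumed legacy usage).
legacy = AscendConfigSketch({"chunked_prefill_for_mla": True})

print(legacy.expert_map_path)
print(hasattr(legacy, "chunked_prefill_for_mla"))
```

Under these assumptions a leftover `chunked_prefill_for_mla` entry is ignored without an error, and the attribute no longer exists on the config object, so any code that still reads it would raise `AttributeError`.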
