Commit 5736337

[Doc] Update support feature
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
1 parent 0665500 commit 5736337

File tree

1 file changed: +26 −26 lines

docs/source/user_guide/support_matrix/supported_features.md

Lines changed: 26 additions & 26 deletions
```diff
@@ -4,37 +4,37 @@ The feature support principle of vLLM Ascend is: **aligned with the vLLM**. We a
 
 You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is the feature support status of vLLM Ascend:
 
-| Feature | vLLM V0 Engine | vLLM V1 Engine | Next Step |
-|-------------------------------|----------------|----------------|------------------------------------------------------------------------|
-| Chunked Prefill | 🟢 Functional | 🟢 Functional | Functional, see detail note: [Chunked Prefill][cp] |
-| Automatic Prefix Caching | 🟢 Functional | 🟢 Functional | Functional, see detail note: [vllm-ascend#732][apc] |
-| LoRA | 🟢 Functional | 🟢 Functional | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora] |
-| Prompt adapter | 🔴 No plan | 🔴 No plan | This feature has been deprecated by vllm. |
-| Speculative decoding | 🟢 Functional | 🟢 Functional | Basic support |
-| Pooling | 🟢 Functional | 🟡 Planned | CI needed and adapting more models; V1 support rely on vLLM support. |
-| Enc-dec | 🔴 NO plan | 🟡 Planned | Plan in 2025.06.30 |
-| Multi Modality | 🟢 Functional | 🟢 Functional | [Tutorial][multimodal], optimizing and adapting more models |
-| LogProbs | 🟢 Functional | 🟢 Functional | CI needed |
-| Prompt logProbs | 🟢 Functional | 🟢 Functional | CI needed |
-| Async output | 🟢 Functional | 🟢 Functional | CI needed |
-| Multi step scheduler | 🟢 Functional | 🔴 Deprecated | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler] |
-| Best of | 🟢 Functional | 🔴 Deprecated | [vllm#13361][best_of], CI needed |
-| Beam search | 🟢 Functional | 🟢 Functional | CI needed |
-| Guided Decoding | 🟢 Functional | 🟢 Functional | [vllm-ascend#177][guided_decoding] |
-| Tensor Parallel | 🟢 Functional | 🟢 Functional | CI needed |
-| Pipeline Parallel | 🟢 Functional | 🟢 Functional | CI needed |
-| Expert Parallel | 🔴 NO plan | 🟢 Functional | CI needed; No plan on V0 support |
-| Data Parallel | 🔴 NO plan | 🟢 Functional | CI needed; No plan on V0 support |
-| Prefill Decode Disaggregation | 🟢 Functional | 🟢 Functional | 1P1D available, working on xPyD and V1 support. |
-| Quantization | 🟢 Functional | 🟢 Functional | W8A8 available, CI needed; working on more quantization method support |
-| Graph Mode | 🔴 NO plan | 🔵 Experimental| Experimental, see detail note: [vllm-ascend#767][graph_mode] |
-| Sleep Mode | 🟢 Functional | 🟢 Functional | level=1 available, CI needed, working on V1 support |
+| Feature | Status | Next Step |
+|-------------------------------|----------------|------------------------------------------------------------------------|
+| Chunked Prefill | 🟢 Functional | Functional, see detail note: [Chunked Prefill][cp] |
+| Automatic Prefix Caching | 🟢 Functional | Functional, see detail note: [vllm-ascend#732][apc] |
+| LoRA | 🟢 Functional | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora] |
+| Prompt adapter | 🔴 No plan | This feature has been deprecated by vLLM. |
+| Speculative decoding | 🟢 Functional | Basic support |
+| Pooling | 🟢 Functional | CI needed and adapting more models; V1 support relies on vLLM support. |
+| Enc-dec | 🟡 Planned | vLLM should support this feature first. |
+| Multi Modality | 🟢 Functional | [Tutorial][multimodal], optimizing and adapting more models |
+| LogProbs | 🟢 Functional | CI needed |
+| Prompt logProbs | 🟢 Functional | CI needed |
+| Async output | 🟢 Functional | CI needed |
+| Multi step scheduler | 🔴 Deprecated | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler] |
+| Best of | 🔴 Deprecated | [vllm#13361][best_of], CI needed |
+| Beam search | 🟢 Functional | CI needed |
+| Guided Decoding | 🟢 Functional | [vllm-ascend#177][guided_decoding] |
+| Tensor Parallel | 🟢 Functional | Make TP > 4 work with graph mode |
+| Pipeline Parallel | 🚧 WIP | There are some known issues with Ray; work in progress |
+| Expert Parallel | 🟢 Functional | Dynamic EPLB support. |
+| Data Parallel | 🟢 Functional | Data Parallel support for Qwen3 MoE. |
+| Prefill Decode Disaggregation | 🚧 WIP | Working on 1P1D and xPyD. |
+| Quantization | 🟢 Functional | W8A8 available; working on more quantization method support (W4A8, etc.) |
+| Graph Mode | 🔵 Experimental| Experimental, see detail note: [vllm-ascend#767][graph_mode] |
+| Sleep Mode | 🟢 Functional | |
 
 - 🟢 Functional: Fully operational, with ongoing optimizations.
 - 🔵 Experimental: Experimental support, interfaces and functions may change.
 - 🚧 WIP: Under active development, will be supported soon.
 - 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
-- 🔴 NO plan / Deprecated: No plan for V0 or deprecated by vLLM v1.
+- 🔴 NO plan / Deprecated: No plan or deprecated by vLLM.
 
 [v1_user_guide]: https://docs.vllm.ai/en/latest/getting_started/v1_user_guide.html
 [multimodal]: https://vllm-ascend.readthedocs.io/en/latest/tutorials/single_npu_multimodal.html
```
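Several of the features the table marks 🟢 Functional are switched on through standard vLLM engine flags rather than anything Ascend-specific. A minimal launch sketch, assuming upstream vLLM's CLI (the model name is a placeholder, and flag names may differ across vLLM versions — check `vllm serve --help` on your installation):

```shell
# Hypothetical sketch: enable several features listed as Functional above —
# Tensor Parallel, Automatic Prefix Caching, and Chunked Prefill.
# Flag names follow upstream vLLM's CLI; verify against your installed version.
vllm serve Qwen/Qwen2.5-7B-Instruct \
  --tensor-parallel-size 2 \
  --enable-prefix-caching \
  --enable-chunked-prefill
```

Features without an engine flag here (e.g. Graph Mode, Sleep Mode) follow the per-feature notes linked in the table instead.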
