[Doc] Update support feature #1828

Merged: 1 commit on Jul 23, 2025
54 changes: 28 additions & 26 deletions docs/source/user_guide/support_matrix/supported_features.md
@@ -4,37 +4,37 @@ The feature support principle of vLLM Ascend is: **aligned with vLLM**. We a

You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is the feature support status of vLLM Ascend:

| Feature | vLLM V0 Engine | vLLM V1 Engine | Next Step |
|-------------------------------|----------------|----------------|------------------------------------------------------------------------|
| Chunked Prefill | 🟢 Functional | 🟢 Functional | Functional, see detail note: [Chunked Prefill][cp] |
| Automatic Prefix Caching | 🟢 Functional | 🟢 Functional | Functional, see detail note: [vllm-ascend#732][apc] |
| LoRA | 🟢 Functional | 🟢 Functional | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora] |
| Prompt adapter | 🔴 No plan | 🔴 No plan | This feature has been deprecated by vllm. |
| Speculative decoding | 🟢 Functional | 🟢 Functional | Basic support |
| Pooling | 🟢 Functional | 🟡 Planned | CI needed and adapting more models; V1 support rely on vLLM support. |
| Enc-dec | 🔴 NO plan | 🟡 Planned | Plan in 2025.06.30 |
| Multi Modality | 🟢 Functional | 🟢 Functional | [Tutorial][multimodal], optimizing and adapting more models |
| LogProbs | 🟢 Functional | 🟢 Functional | CI needed |
| Prompt logProbs | 🟢 Functional | 🟢 Functional | CI needed |
| Async output | 🟢 Functional | 🟢 Functional | CI needed |
| Multi step scheduler | 🟢 Functional | 🔴 Deprecated | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler] |
| Best of | 🟢 Functional | 🔴 Deprecated | [vllm#13361][best_of], CI needed |
| Beam search | 🟢 Functional | 🟢 Functional | CI needed |
| Guided Decoding | 🟢 Functional | 🟢 Functional | [vllm-ascend#177][guided_decoding] |
| Tensor Parallel | 🟢 Functional | 🟢 Functional | CI needed |
| Pipeline Parallel | 🟢 Functional | 🟢 Functional | CI needed |
| Expert Parallel | 🔴 NO plan | 🟢 Functional | CI needed; No plan on V0 support |
| Data Parallel | 🔴 NO plan | 🟢 Functional | CI needed; No plan on V0 support |
| Prefill Decode Disaggregation | 🟢 Functional | 🟢 Functional | 1P1D available, working on xPyD and V1 support. |
| Quantization | 🟢 Functional | 🟢 Functional | W8A8 available, CI needed; working on more quantization method support |
| Graph Mode | 🔴 NO plan | 🔵 Experimental| Experimental, see detail note: [vllm-ascend#767][graph_mode] |
| Sleep Mode | 🟢 Functional | 🟢 Functional | level=1 available, CI needed, working on V1 support |
| Feature | Status | Next Step |
|-------------------------------|----------------|------------------------------------------------------------------------|
| Chunked Prefill               | 🟢 Functional  | Functional, see the detailed note: [Chunked Prefill][cp]                |
| Automatic Prefix Caching      | 🟢 Functional  | Functional, see the detailed note: [vllm-ascend#732][apc]               |
| LoRA | 🟢 Functional | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora] |
| Prompt adapter | 🔴 No plan | This feature has been deprecated by vLLM. |
| Speculative decoding | 🟢 Functional | Basic support |
| Pooling                       | 🟢 Functional  | CI needed and adapting more models; V1 support relies on vLLM support.  |
| Enc-dec | 🟡 Planned | vLLM should support this feature first. |
| Multi Modality | 🟢 Functional | [Tutorial][multimodal], optimizing and adapting more models |
| LogProbs | 🟢 Functional | CI needed |
| Prompt logProbs | 🟢 Functional | CI needed |
| Async output | 🟢 Functional | CI needed |
| Multi step scheduler | 🔴 Deprecated | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler] |
| Best of | 🔴 Deprecated | [vllm#13361][best_of] |
| Beam search | 🟢 Functional | CI needed |
| Guided Decoding | 🟢 Functional | [vllm-ascend#177][guided_decoding] |
| Tensor Parallel | 🟢 Functional | Make TP >4 work with graph mode |
| Pipeline Parallel | 🟢 Functional | Write official guide and tutorial. |
| Expert Parallel | 🟢 Functional | Dynamic EPLB support. |
| Data Parallel | 🟢 Functional | Data Parallel support for Qwen3 MoE. |
| Prefill Decode Disaggregation | 🚧 WIP         | Working on [1P1D] and xPyD.                                             |
| Quantization                  | 🟢 Functional  | W8A8 available; working on support for more quantization methods (W4A8, etc.) |
| Graph Mode                    | 🔵 Experimental | Experimental, see the detailed note: [vllm-ascend#767][graph_mode]     |
| Sleep Mode | 🟢 Functional | |

- 🟢 Functional: Fully operational, with ongoing optimizations.
- 🔵 Experimental: Experimental support, interfaces and functions may change.
- 🚧 WIP: Under active development, will be supported soon.
- 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
- 🔴 NO plan / Deprecated: No plan for V0 or deprecated by vLLM v1.
- 🔴 No plan / Deprecated: No plan, or deprecated by vLLM.

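Several of the functional entries above (Tensor Parallel, Chunked Prefill, Automatic Prefix Caching) map directly to standard vLLM launch options. As a minimal sketch of exercising them together on an Ascend deployment, assuming a working vllm-ascend install and that the model name below is only a placeholder:

```shell
# Serve a model with several features from the matrix enabled.
# All flags are standard vLLM CLI options; swap in your own model.
vllm serve Qwen/Qwen2.5-7B-Instruct \
    --tensor-parallel-size 2 \
    --enable-chunked-prefill \
    --enable-prefix-caching
```

Whether a given flag is functional on a specific device and version should still be checked against the matrix above and the linked issues.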
[v1_user_guide]: https://docs.vllm.ai/en/latest/getting_started/v1_user_guide.html
[multimodal]: https://vllm-ascend.readthedocs.io/en/latest/tutorials/single_npu_multimodal.html
@@ -47,3 +47,5 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
[graph_mode]: https://github.com/vllm-project/vllm-ascend/issues/767
[apc]: https://github.com/vllm-project/vllm-ascend/issues/732
[cp]: https://docs.vllm.ai/en/stable/performance/optimization.html#chunked-prefill
[1P1D]: https://github.com/vllm-project/vllm-ascend/pull/950
[ray]: https://github.com/vllm-project/vllm-ascend/issues/1751
2 changes: 2 additions & 0 deletions docs/source/user_guide/support_matrix/supported_models.md
@@ -1,5 +1,7 @@
# Model Support

Get the newest info here: https://github.com/vllm-project/vllm-ascend/issues/1608

## Text-only Language Models

### Generative Models