[Doc] Update support feature #1828

Merged: 1 commit on Jul 23, 2025
54 changes: 28 additions & 26 deletions docs/source/user_guide/support_matrix/supported_features.md
@@ -4,37 +4,37 @@ The feature support principle of vLLM Ascend is: **aligned with vLLM**. We a

You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is the feature support status of vLLM Ascend:

| Feature | vLLM V0 Engine | vLLM V1 Engine | Next Step |
|-------------------------------|----------------|----------------|------------------------------------------------------------------------|
| Chunked Prefill | 🟢 Functional | 🟢 Functional | Functional, see detail note: [Chunked Prefill][cp] |
| Automatic Prefix Caching | 🟢 Functional | 🟢 Functional | Functional, see detail note: [vllm-ascend#732][apc] |
| LoRA | 🟢 Functional | 🟢 Functional | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora] |
| Prompt adapter | 🔴 No plan | 🔴 No plan | This feature has been deprecated by vllm. |
| Speculative decoding | 🟢 Functional | 🟢 Functional | Basic support |
| Pooling | 🟢 Functional | 🟡 Planned | CI needed and adapting more models; V1 support rely on vLLM support. |
| Enc-dec | 🔴 NO plan | 🟡 Planned | Plan in 2025.06.30 |
| Multi Modality | 🟢 Functional | 🟢 Functional | [Tutorial][multimodal], optimizing and adapting more models |
| LogProbs | 🟢 Functional | 🟢 Functional | CI needed |
| Prompt logProbs | 🟢 Functional | 🟢 Functional | CI needed |
| Async output | 🟢 Functional | 🟢 Functional | CI needed |
| Multi step scheduler | 🟢 Functional | 🔴 Deprecated | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler] |
| Best of | 🟢 Functional | 🔴 Deprecated | [vllm#13361][best_of], CI needed |
| Beam search | 🟢 Functional | 🟢 Functional | CI needed |
| Guided Decoding | 🟢 Functional | 🟢 Functional | [vllm-ascend#177][guided_decoding] |
| Tensor Parallel | 🟢 Functional | 🟢 Functional | CI needed |
| Pipeline Parallel | 🟢 Functional | 🟢 Functional | CI needed |
| Expert Parallel | 🔴 NO plan | 🟢 Functional | CI needed; No plan on V0 support |
| Data Parallel | 🔴 NO plan | 🟢 Functional | CI needed; No plan on V0 support |
| Prefill Decode Disaggregation | 🟢 Functional | 🟢 Functional | 1P1D available, working on xPyD and V1 support. |
| Quantization | 🟢 Functional | 🟢 Functional | W8A8 available, CI needed; working on more quantization method support |
| Graph Mode | 🔴 NO plan | 🔵 Experimental| Experimental, see detail note: [vllm-ascend#767][graph_mode] |
| Sleep Mode | 🟢 Functional | 🟢 Functional | level=1 available, CI needed, working on V1 support |
| Feature | Status | Next Step |
|-------------------------------|----------------|------------------------------------------------------------------------|
| Chunked Prefill               | 🟢 Functional  | Functional, see the detailed note: [Chunked Prefill][cp]                |
| Automatic Prefix Caching      | 🟢 Functional  | Functional, see the detailed note: [vllm-ascend#732][apc]               |
| LoRA | 🟢 Functional | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora] |
| Prompt adapter | 🔴 No plan | This feature has been deprecated by vLLM. |
| Speculative decoding | 🟢 Functional | Basic support |
| Pooling                       | 🟢 Functional  | CI needed and adapting more models; V1 support relies on vLLM support.  |
| Enc-dec | 🟡 Planned | vLLM should support this feature first. |
| Multi Modality | 🟢 Functional | [Tutorial][multimodal], optimizing and adapting more models |
| LogProbs | 🟢 Functional | CI needed |
| Prompt logProbs | 🟢 Functional | CI needed |
| Async output | 🟢 Functional | CI needed |
| Multi step scheduler | 🔴 Deprecated | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler] |
| Best of | 🔴 Deprecated | [vllm#13361][best_of] |
| Beam search | 🟢 Functional | CI needed |
| Guided Decoding | 🟢 Functional | [vllm-ascend#177][guided_decoding] |
| Tensor Parallel | 🟢 Functional | Make TP >4 work with graph mode |
| Pipeline Parallel | 🟢 Functional | Write official guide and tutorial. |
| Expert Parallel | 🟢 Functional | Dynamic EPLB support. |
| Data Parallel | 🟢 Functional | Data Parallel support for Qwen3 MoE. |
| Prefill Decode Disaggregation | 🚧 WIP         | Working on [1P1D] and xPyD.                                             |
| Quantization                  | 🟢 Functional  | W8A8 available; working on support for more quantization methods (W4A8, etc.) |
| Graph Mode                    | 🔵 Experimental | Experimental, see the detailed note: [vllm-ascend#767][graph_mode]     |
| Sleep Mode | 🟢 Functional | |

- 🟢 Functional: Fully operational, with ongoing optimizations.
- 🔵 Experimental: Experimental support, interfaces and functions may change.
- 🚧 WIP: Under active development, will be supported soon.
- 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
- 🔴 NO plan / Deprecated: No plan for V0 or deprecated by vLLM v1.
- 🔴 No plan / Deprecated: No plan, or deprecated by vLLM.

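Several of the functional entries above (Tensor Parallel, Chunked Prefill, Automatic Prefix Caching) map directly to standard vLLM launch options. As a minimal sketch of exercising them together on an Ascend deployment, assuming a working vllm-ascend install and that the model name below is only a placeholder:

```shell
# Serve a model with several features from the matrix enabled.
# All flags are standard vLLM CLI options; swap in your own model.
vllm serve Qwen/Qwen2.5-7B-Instruct \
    --tensor-parallel-size 2 \
    --enable-chunked-prefill \
    --enable-prefix-caching
```

Whether a given flag is functional on a specific device and version should still be checked against the matrix above and the linked issues.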
[v1_user_guide]: https://docs.vllm.ai/en/latest/getting_started/v1_user_guide.html
[multimodal]: https://vllm-ascend.readthedocs.io/en/latest/tutorials/single_npu_multimodal.html
@@ -47,3 +47,5 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
[graph_mode]: https://github.com/vllm-project/vllm-ascend/issues/767
[apc]: https://github.com/vllm-project/vllm-ascend/issues/732
[cp]: https://docs.vllm.ai/en/stable/performance/optimization.html#chunked-prefill
[1P1D]: https://github.com/vllm-project/vllm-ascend/pull/950
[ray]: https://github.com/vllm-project/vllm-ascend/issues/1751
2 changes: 2 additions & 0 deletions docs/source/user_guide/support_matrix/supported_models.md
@@ -1,5 +1,7 @@
# Model Support

Get the newest info here: https://github.com/vllm-project/vllm-ascend/issues/1608

## Text-only Language Models

### Generative Models