Open
Labels
RFC (Request For Comments)
Description
Motivation.
This RFC tracks the community work on improving the official documentation. Currently, three important sections need updating:
- Tutorials
- User doc
- Developer doc
I'll list all the work items below. Everyone is welcome to take a task.
Proposed Change.
Tutorials
Problem:
- A MoE model guide is missing, for example for Qwen3 MoE.
- There is no detailed example of the parallel cases.
- QwQ is not very popular.
- The V1 Engine should be used by default.
Proposed Change:
After the improvement, the content will be clearer (required: 313T+64GB):
- Single NPU (Qwen3-8B) aclgraph mode + eager mode @leo-pony Doc Enhancement: Single NPU(Qwen3-8B) aclgraph mode + eager mode #1374
- Single NPU (Qwen2.5-VL-7B) eager mode @shen-shanshan [Doc] Add Qwen2.5-VL eager mode doc #1394
- Single NPU (Qwen2.5-audio) eager mode @shen-shanshan [Doc] Add qwen2-audio eager mode tutorial #1371
- Single NPU (Qwen3 8B embedding) eager mode @wangxiyuan
- Multi NPU 2 card (Qwen3 MOE-30B) aclgraph mode + TP2 @leo-pony
- Multi NPU 4 card (Qwen3 32B) aclgraph mode + TP2 + DP2 + W8A8 (optional) @22dimensions
- Multi Node 2 node (DeepSeek V3 0528 W8A8) TP4+DP4+EP Graph mode @Potabk
- Multi Node 4 node (DeepSeek R1) TP16+DP2 Graph mode @MengqingCao
- Multi Node 4 node (DeepSeek V3 0528) 1P1D Graph mode @wangxiyuan
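As a quick sanity check for the parallel cases listed above, the sketch below computes how many NPUs each TP/DP layout occupies. It assumes the total device count is simply TP × DP (no pipeline parallelism) and that expert parallelism reuses the same devices rather than adding new ones. The `ParallelLayout` class and the per-node counts it derives are illustrative only; they are not part of vLLM Ascend and are not stated in this RFC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelLayout:
    """Hypothetical helper describing one tutorial case from the list above."""
    name: str
    nodes: int
    tensor_parallel: int
    data_parallel: int = 1

    @property
    def world_size(self) -> int:
        # Assumption: total devices = TP x DP (no pipeline parallelism).
        return self.tensor_parallel * self.data_parallel

    @property
    def npus_per_node(self) -> int:
        # Assumption: devices are spread evenly across nodes.
        if self.world_size % self.nodes != 0:
            raise ValueError(f"{self.name}: devices do not divide evenly across nodes")
        return self.world_size // self.nodes


LAYOUTS = [
    ParallelLayout("Qwen3 MOE-30B, TP2", nodes=1, tensor_parallel=2),
    ParallelLayout("Qwen3 32B, TP2+DP2", nodes=1, tensor_parallel=2, data_parallel=2),
    ParallelLayout("DeepSeek V3 0528 W8A8, TP4+DP4", nodes=2, tensor_parallel=4, data_parallel=4),
    ParallelLayout("DeepSeek R1, TP16+DP2", nodes=4, tensor_parallel=16, data_parallel=2),
]

for layout in LAYOUTS:
    print(f"{layout.name}: {layout.world_size} NPUs total, "
          f"{layout.npus_per_node} per node on {layout.nodes} node(s)")
```

Under these assumptions, the single-node cases match the card counts given in the list (2 and 4). The multi-node totals are derived from the TP and DP values alone and should be checked against the actual hardware before being used in a tutorial.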
User Guide
The user guide should cover how to use vLLM Ascend.
Problem:
- A lot of usage documentation is missing.
Proposed Change:
- Feature Support Matrix (Need refresh)
- index
- Graph mode Guide (Need refresh) @wangxiyuan [Doc] Fix doc typo #1424
- Quantization Guide Add user guide for quantization #1206 @22dimensions
- Disaggregated prefill Guide @zhangxinyuehfad
- EP Guide (for example, how to use MC2, dual batch, EPLB...) @MengqingCao
- Sleep Mode Guide @Potabk [Doc] Add sleep mode doc #1295
- Guided Decoding Guide @shen-shanshan
- Spec Decode Guide @mengwei805
- LoRA Guide @paulyu12 [DOC] add LoRA user guide #1265
- vLLM supported features (function calling ...) @wangxiyuan
- Model Support Matrix (Need refresh) @zhangxinyuehfad
- Environment Vars (Need refresh) @shen-shanshan
  I think we should change how they are presented: write them out by hand, as is done for Additional Config, to make them clearer.
- Additional Config (Need refresh) @wangxiyuan [Doc] Fix doc typo #1424
- Release note
Developer Guide
Problem:
- There is no feature or code guide for developers at all.
Proposed Change:
- How to contribute (Need refresh) @Yikun [1/N][CI] Move linting system to pre-commits hooks #1256
- Version Policy (No need to refresh) @Yikun
- CI system @Yikun
- Test Guide @MengqingCao [Doc] Update FAQ and add test guidance #1360
- Design Documents
  - Patch @wangxiyuan [Doc] Add patch doc #1414
  - Code Architecture @MengqingCao
  - Ops and Custom ops @Yikun
    - pta ops
    - custom ops
    - fused moe
  - Modeling @shen-shanshan
    - what model is added/patched, why
    - how to add a new model [docs] Update guidance on how to implement and register new models #1126
  - Attention @wangxiyuan
    - default attention
    - mla
  - Communicator @leo-pony
  - Disaggregated Prefill @ganyi1996ppo
  - Graph mode @zzzzwwjj
  - Quantization @22dimensions
- Evaluation (Need refresh) @zhangxinyuehfad
- release guide @wangxiyuan