Pull requests: vllm-project/vllm-ascend
Pull requests list
[V0 Deprecation] Remove V0 engine (component/ut/example)
module:core
module:tests
#1770
opened Jul 14, 2025 by
shen-shanshan
[Test] Resolve vllm-ascend version
dense-accuracy-test
enable dense accuracy test for PR
ready-for-test
start test by label for PR
#1769
opened Jul 14, 2025 by
zhangxinyuehfad
[PD Disagg][CI] Upgrade vllm version to fix ci
pd-test
enable pd test for PR
ready-for-test
start test by label for PR
#1765
opened Jul 14, 2025 by
MengqingCao
[Misc] Remove VLLM_USE_V1 usage in code
module:core
module:tests
#1764
opened Jul 14, 2025 by
wangxiyuan
【main】 Support SP for qwen2.5 and qwen3 moe
module:core
module:ops
#1761
opened Jul 12, 2025 by
lbk-sys
[V0.9.1] torchair_graph bugfix when chunked_prefill is true
#1748
opened Jul 11, 2025 by
fems14
[v0.9.1] fix async pullkv determin
merge-conflicts
#1746
opened Jul 11, 2025 by
ganyi1996ppo
[V0.9.1] Add support for flashcomm_v1 in Qwen2.5
module:core
#1745
opened Jul 11, 2025 by
rjg-lyh
flashcomm3 multi stream of moe layer
merge-conflicts
module:core
module:ops
module:quantization
#1744
opened Jul 11, 2025 by
wyhhyw123
[Platform] Add support for Atlas A3 series
ci/build
module:core
#1740
opened Jul 11, 2025 by
wxsIcey
Optimization of TP4 Parallelism in DeepSeek MLP Dense Layers
#1738
opened Jul 11, 2025 by
zhanghw0354
[Doc] Add model customization doc
documentation
Improvements or additions to documentation
#1737
opened Jul 11, 2025 by
shen-shanshan
[2/N] Enable shellcheck and pymarkdown for lint system
documentation
Improvements or additions to documentation
module:tests
module:tools
#1735
opened Jul 11, 2025 by
Potabk
[Test] Remove VLLM_USE_V1 in example and tests
module:tests
#1733
opened Jul 11, 2025 by
wangxiyuan
[Perf] Reduce memory usage by splitting tokens in fused_experts
documentation
Improvements or additions to documentation
module:core
module:ops
module:quantization
module:tests
ready
ready for review
#1729
opened Jul 10, 2025 by
ApsarasX
[V0.9.1] add support for flashcomm2 in qwen3
merge-conflicts
module:core
#1726
opened Jul 10, 2025 by
David9857
[BUGFIX] [v0.9.1-dev] Obtain the NPU ID of non-consecutive NPU cards
#1724
opened Jul 10, 2025 by
yangqinghao-cmss
[v0.9.1]add rot_pos_emb()/get_window_index()/_process_image_input() to qwen2.5_vl_without_padding
#1705
opened Jul 9, 2025 by
zheliuyu
[V0.9.1] Replace FA interface with FA_V2 to optimize perf in SelfAttention
#1701
opened Jul 9, 2025 by
rjg-lyh
[V0.9.1]feat: Qwen3-dense model support dual-batch overlap(dbo)
#1699
opened Jul 9, 2025 by
ZhaoJiangJiang
[WIP] dynamic eplb
module:core
module:ops
module:quantization
#1697
opened Jul 9, 2025 by
wanghanqingLYT
support fa3 quant for v0.9.1-dev
module:quantization
module:tests
#1695
opened Jul 9, 2025 by
22dimensions
[WIP][Prefill Performance] Parallel Strategy Optimizations (VRAM-for-Speed Tradeoff)
merge-conflicts
module:ops
module:quantization
#1687
opened Jul 9, 2025 by
SlightwindSec