[Perf] Reduce memory usage by splitting tokens in fused_experts and avoiding unused tensor #833

Closed
ApsarasX wants to merge 2 commits into vllm-project:main from ApsarasX:wengang/memory-optimization

Commits

Commits on May 18, 2025