Skip to content

Conversation

varun-sundar-rabindranath
Copy link
Contributor

@varun-sundar-rabindranath varun-sundar-rabindranath commented Oct 17, 2025

Purpose

Update DeepEP commit to DeepEP main (73b6ea4) to pick up additional hidden-size support that is useful for gpt-oss low-latency DP/EP

Test Plan

Run unit test test_deepep_moe.py locally

e2e:
Server : VLLM_ALL2ALL_BACKEND="deepep_high_throughput" VLLM_USE_DEEP_GEMM=1 canhazgpu run -g2 -- vllm serve Qwen/Qwen3-30B-A3B-FP8 --trust-remote-code --enable-expert-parallel --data-parallel-size 2 -no-enable-prefix-caching

Server: VLLM_ALL2ALL_BACKEND="deepep_low_latency" VLLM_USE_DEEP_GEMM=1 canhazgpu run -g2 -- vllm serve Qwen/Qwen3-30B-A3B-FP8 --trust-remote-code --enable-expert-parallel --data-parallel-size 2 --port 9010 --no-enable-prefix-caching

lm_eval: lm_eval --model local-completions --tasks gsm8k --model_args model=Qwen/Qwen3-30B-A3B-FP8,base_url=http://127.0.0.1:9010/v1/completions,num_concurrent=30,max_retries=3 --limit 100

Test Result

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  | 0.94|±  |0.0239|
|     |       |strict-match    |     5|exact_match|↑  | 0.94|±  |0.0239|
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  | 0.86|±  |0.0349|
|     |       |strict-match    |     5|exact_match|↑  | 0.93|±  |0.0256|

Copy link

mergify bot commented Oct 17, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @varun-sundar-rabindranath.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 17, 2025
@varun-sundar-rabindranath
Copy link
Contributor Author

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the DeepEP dependency to a newer commit, enabling support for a hidden size of 3072. The changes are straightforward, involving an update to the commit hash in the installation script and adding the new size to the supported sizes list. My review includes a performance-related suggestion to use a frozenset for the list of supported hidden sizes, which is more efficient for membership checks in a low-latency context.

# DeepEP low-latency kernels are compiled only for certain
# specific hidden sizes.
SUPPORTED_HIDDEN_SIZES = [2048, 2560, 4096, 5120, 6144, 7168]
SUPPORTED_HIDDEN_SIZES = [2048, 2560, 3072, 4096, 5120, 6144, 7168]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

For improved performance, consider using a frozenset for SUPPORTED_HIDDEN_SIZES. Membership testing (in) is O(1) on average for sets, while it is O(n) for lists. Given that this check is in a performance-sensitive path for low-latency kernels, this optimization is beneficial. A frozenset is suitable here as it's an immutable constant.

Suggested change
SUPPORTED_HIDDEN_SIZES = [2048, 2560, 3072, 4096, 5120, 6144, 7168]
SUPPORTED_HIDDEN_SIZES = frozenset([2048, 2560, 3072, 4096, 5120, 6144, 7168])

Copy link
Member

@mgoin mgoin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :) Please fix the conflict and we can ready

Copy link
Member

@yewentao256 yewentao256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the work!

@mgoin mgoin added ready ONLY add when PR is ready to merge/full CI is needed dependencies Pull requests that update a dependency file deepseek Related to DeepSeek models labels Oct 17, 2025
@mgoin mgoin enabled auto-merge (squash) October 17, 2025 23:02
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
auto-merge was automatically disabled October 18, 2025 03:19

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 merged commit 30a33b9 into vllm-project:main Oct 18, 2025
50 checks passed
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
adabeyta pushed a commit to adabeyta/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deepseek Related to DeepSeek models dependencies Pull requests that update a dependency file ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants