[Model] Support deepseek with eagle #21086

xyang16 · 2025-07-17T01:57:00Z

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

This PR adds support for running EAGLE speculative decoding on DeepSeek models. It changes the following files:

deepseek_eagle.py: deepseek eagle model definition
registry.py: add the model to registry

Test Plan

We ran DeepSeek-R1 with an EAGLE draft model:

export VLLM_USE_V1=1
export VLLM_MLA_DISABLE=1
vllm serve deepseek-ai/DeepSeek-R1 \
    --port 8080 \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 1 \
    --max-model-len 8192 \
    --max-num-seqs 8 \
    --trust-remote-code \
    --speculative-config '{"method": "eagle", "model":"eagle618/eagle-deepseek-r1", "num_speculative_tokens": 3}'
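Once the server above is up, it can be exercised with a small client. The PR does not say how requests were sent, so this is only a minimal sketch: it assumes the OpenAI-compatible `/v1/completions` endpoint that `vllm serve` exposes, on the port from the command above. The helper names, prompt, and sampling parameters are illustrative.

```python
import json
import urllib.request

# Port taken from the serve command above; the host is an assumption.
BASE_URL = "http://localhost:8080"


def build_completion_request(prompt, max_tokens=64,
                             model="deepseek-ai/DeepSeek-R1"):
    """Build (but do not send) a POST request for the completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding, chosen here for repeatability
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, timeout=600.0):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("Explain speculative decoding in one sentence.")` against the running server should then cause the SpecDecoding metrics lines to appear in the server log.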

Test Result

The acceptance rate is around 60% for the vanilla EAGLE head (a fine-tuned EAGLE head achieves a higher acceptance rate).

INFO 07-16 18:05:38 [metrics.py:87] SpecDecoding metrics: Draft acceptance rate: 58.4%, Mean acceptance length: 2.75, Accepted: 163 tokens, Drafted: 279 tokens, Per-position acceptance rate: 0.882, 0.559, 0.312
INFO 07-16 18:08:09 [metrics.py:87] SpecDecoding metrics: Draft acceptance rate: 63.3%, Mean acceptance length: 2.90, Accepted: 167 tokens, Drafted: 264 tokens, Per-position acceptance rate: 0.886, 0.670, 0.341
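The logged figures can be cross-checked against each other. The sketch below is not vLLM's metrics code; it assumes that every draft proposes exactly `num_speculative_tokens` tokens (3 in the test command) and that the mean acceptance length counts one extra token that the target model emits per step. Both assumptions are consistent with the two log lines above.

```python
def draft_acceptance_rate(accepted, drafted):
    """Fraction of drafted tokens that the target model accepted."""
    return accepted / drafted


def mean_acceptance_length(accepted, drafted, num_speculative_tokens):
    """Average tokens produced per verification step.

    Assumes each draft proposes exactly num_speculative_tokens tokens and
    that the target model always contributes one token of its own.
    """
    num_drafts = drafted / num_speculative_tokens
    return 1 + accepted / num_drafts


# Values copied from the two log lines above.
for accepted, drafted in [(163, 279), (167, 264)]:
    rate = draft_acceptance_rate(accepted, drafted)
    length = mean_acceptance_length(accepted, drafted, 3)
    print(f"rate={rate:.1%} mean_length={length:.2f}")
```

Under the same assumptions, the mean acceptance length also equals 1 plus the sum of the per-position acceptance rates (for example, 1 + 0.882 + 0.559 + 0.312 is about 2.75).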

(Optional) Documentation Update

gemini-code-assist

Code Review

This pull request adds support for Eagle speculative decoding with Deepseek models. I've found a few critical issues in the implementation that will prevent it from working correctly. The model implementation in deepseek_eagle.py incorrectly handles hidden state dimensions and is missing the lm_head layer, which will cause runtime errors. Additionally, the model registry key in registry.py seems to be incorrect, which would prevent the model from being loaded.

gemini-code-assist · 2025-07-17T01:58:52Z

vllm/model_executor/models/deepseek_eagle.py

+        self.fc = nn.Linear(
+            self.config.model.hidden_size * 2,
+            self.config.model.hidden_size,
+            bias=False,
+        )
+
+        self.enorm = RMSNorm(self.config.hidden_size,
+                             eps=self.config.rms_norm_eps)
+        self.hnorm = RMSNorm(self.config.hidden_size,
+                             eps=self.config.rms_norm_eps)
+        self.norm = RMSNorm(self.config.hidden_size,
+                            eps=self.config.rms_norm_eps)


The implementation of DeepseekV2Model assumes that the draft model and the target model share the same hidden size. For instance, self.hnorm is initialized with the draft model's hidden size (self.config.hidden_size) but is applied to hidden_states from the target model.

This assumption is incorrect for the models used in testing (deepseek-r1 has a hidden size of 4096, while eagle-deepseek-r1 has 1024), and will lead to a runtime error due to shape mismatch.

To fix this, you should explicitly use the hidden sizes from both the draft and target model configurations. You can access the target model's configuration via vllm_config.model_config.

target_config = vllm_config.model_config.hf_config
draft_hidden_size = self.config.hidden_size
target_hidden_size = target_config.hidden_size
self.fc = nn.Linear(
    draft_hidden_size + target_hidden_size,
    draft_hidden_size,
    bias=False,
)
self.enorm = RMSNorm(draft_hidden_size, eps=self.config.rms_norm_eps)
self.hnorm = RMSNorm(target_hidden_size, eps=target_config.rms_norm_eps)
self.norm = RMSNorm(draft_hidden_size, eps=self.config.rms_norm_eps)
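The dimension bookkeeping behind this suggestion can be sketched without any framework. The helper names below are hypothetical, and the hidden sizes are the ones quoted in the review comment; they have not been verified against the actual model configs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearShape:
    in_features: int
    out_features: int


def eagle_fc_shape(draft_hidden_size, target_hidden_size):
    """Shape of the fusion layer when the draft-side embeddings and the
    target-side hidden states are concatenated along the feature axis."""
    return LinearShape(
        in_features=draft_hidden_size + target_hidden_size,
        out_features=draft_hidden_size,
    )


def check_fc_input(embed_dim, hidden_dim, fc):
    """Raise if the concatenated input would not fit the fusion layer."""
    if embed_dim + hidden_dim != fc.in_features:
        raise ValueError(
            f"fc expects {fc.in_features} features, "
            f"got {embed_dim} + {hidden_dim}"
        )


# Sizes as quoted in the review comment (unverified).
fc = eagle_fc_shape(draft_hidden_size=1024, target_hidden_size=4096)
check_fc_input(embed_dim=1024, hidden_dim=4096, fc=fc)
```

When the two hidden sizes are equal, this reduces to the `hidden_size * 2` input width used in the diff, so the two versions only differ when the draft and target sizes differ.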

vllm/model_executor/models/deepseek_eagle.py

vllm/model_executor/models/registry.py

github-actions · 2025-07-17T02:00:18Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

Ja1Zhou · 2025-07-17T05:03:55Z

Hi! I tried installing this PR from source, but got:

OSError: eagle618/eagle-deepseek-r1 does not appear to have a file named configuration_deepseek.py. Checkout 'https://huggingface.co/eagle618/eagle-deepseek-r1/tree/main' for available files.

Should the auto_map field of config.json be fixed?
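An error like this usually means that `auto_map` in `config.json` points at a Python module that is not present in the repository. A minimal sketch of that check follows; the function name is hypothetical, the `auto_map` entries in the usage example are illustrative rather than copied from the draft model repo, and entries that point at another repository (the `repo--module.Class` form) are skipped.

```python
def missing_auto_map_files(config, repo_files):
    """Return module files referenced by config['auto_map'] that are
    absent from repo_files (an iterable of file names in the repo)."""
    present = set(repo_files)
    missing = []
    for value in config.get("auto_map", {}).values():
        # An entry is either one "module.ClassName" string or a list of them.
        refs = value if isinstance(value, (list, tuple)) else [value]
        for ref in refs:
            # Skip empty entries, malformed entries, and cross-repo references.
            if not ref or "." not in ref or "--" in ref:
                continue
            module, _, _class_name = ref.rpartition(".")
            filename = module + ".py"
            if filename not in present and filename not in missing:
                missing.append(filename)
    return missing


# Illustrative config; not the actual contents of the draft model repo.
config = {
    "auto_map": {
        "AutoConfig": "configuration_deepseek.DeepseekV3Config",
        "AutoModelForCausalLM": "modeling_deepseek.DeepseekV3ForCausalLM",
    }
}
print(missing_auto_map_files(config, ["config.json", "model.safetensors"]))
```

Either adding the missing files to the repository or removing the stale `auto_map` entries would make the list come back empty.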

xyang16 · 2025-07-17T05:23:56Z

Thanks for your comment, fixed now.

Ja1Zhou · 2025-07-17T16:58:41Z

Amazing work!

Could you share how you obtained eagle618/eagle-deepseek-r1? This PR could also benefit DeepSeek V3 and similar models. Thank you!

Signed-off-by: Xin Yang <xyangx@amazon.com>

mergify bot added the deepseek (Related to DeepSeek models), new-model (Requests to new models), and speculative-decoding labels on Jul 17, 2025

gemini-code-assist bot reviewed Jul 17, 2025

xyang16 force-pushed the eagle branch from 21fd8a3 to 1bd04e4 Compare July 17, 2025 02:11

xyang16 changed the title from "[v1] Support deepseek with eagle" to "[Model] Support deepseek with eagle" on Jul 17, 2025

xyang16 force-pushed the eagle branch 3 times, most recently from 96a0839 to 6981998 Compare July 18, 2025 01:53

[Model] Support deepseek with eagle

c4cda03

Signed-off-by: Xin Yang <xyangx@amazon.com>

xyang16 force-pushed the eagle branch from 6981998 to c4cda03 Compare July 18, 2025 02:05
