JaredforReal
Collaborator

@JaredforReal JaredforReal commented Sep 15, 2025

What type of PR is this?
docs: Model Performance Evaluation Guide

What this PR does / why we need it:

Evaluation makes routing data-driven. By measuring per-category accuracy on MMLU-Pro (and doing a quick sanity check with ARC), you can:

  • Select the right model for each category and record the ranking in categories.model_scores
  • Pick a sensible default_model based on overall performance
  • Decide when CoT prompting is worth the latency/cost tradeoff
  • Catch regressions when models, prompts, or parameters change
  • Keep changes reproducible and auditable for CI and releases

In short, evaluation converts anecdotes into measurable signals that improve quality, cost efficiency, and reliability of the router.
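
As a rough sketch of how the first two bullets might look in practice: the snippet below ranks models per category and picks a default by mean accuracy. The model names, category names, accuracy numbers, and helper functions are all hypothetical, made up purely for illustration; they are not results from this PR or functions from the repository, and the guide's own scripts may compute these differently.

```python
from statistics import mean

# Hypothetical per-category accuracies (illustrative numbers, not real results).
accuracy = {
    "model-a": {"math": 0.62, "law": 0.41, "biology": 0.70},
    "model-b": {"math": 0.55, "law": 0.58, "biology": 0.66},
}


def rank_models_by_category(accuracy):
    """Return {category: [(model, score), ...]} sorted best-first."""
    categories = sorted({c for scores in accuracy.values() for c in scores})
    ranking = {}
    for category in categories:
        scored = [
            (model, scores[category])
            for model, scores in accuracy.items()
            if category in scores
        ]
        ranking[category] = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranking


def pick_default_model(accuracy):
    """Pick the model with the highest mean accuracy over its categories."""
    return max(accuracy, key=lambda model: mean(accuracy[model].values()))


ranking = rank_models_by_category(accuracy)
default_model = pick_default_model(accuracy)

for category, scored in ranking.items():
    print(category, scored)
print("default_model:", default_model)
```

The per-category lists are the kind of data that would feed categories.model_scores, and the overall winner is a candidate for default_model. Using an unweighted mean is one simple choice; weighting by category traffic would be an equally reasonable alternative.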

Which issue(s) this PR fixes:

Fixes #53

Release Notes: No

Signed-off-by: JaredforReal <w13431838023@gmail.com>

netlify bot commented Sep 15, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 32a78e0
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68ccbe10df43d000081ffa2e
😎 Deploy Preview https://deploy-preview-136--vllm-semantic-router.netlify.app



github-actions bot commented Sep 15, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 website

Owners: @Xunzhuo
Files changed:

  • website/docs/training/model-performance-eval.md
  • website/static/img/bar.png
  • website/static/img/heatmap.png
  • website/docs/training/training-overview.md
  • website/sidebars.js


🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@JaredforReal
Collaborator Author

JaredforReal commented Sep 15, 2025

This still lacks some plots and pictures, so I'm leaving it as a draft. I'd love to receive feedback from the community!

Signed-off-by: JaredforReal <w13431838023@gmail.com>
Signed-off-by: JaredforReal <w13431838023@gmail.com>
@JaredforReal JaredforReal marked this pull request as ready for review September 17, 2025 07:46
Signed-off-by: JaredforReal <w13431838023@gmail.com>
@JaredforReal
Collaborator Author

@Xunzhuo Thanks for your time! This PR is ready for review. I learned a lot making this doc :)

rootfs previously approved these changes Sep 18, 2025
@rootfs
Collaborator

rootfs commented Sep 18, 2025

@JaredforReal can you run the lint? Thanks.

@JaredforReal
Collaborator Author

@rootfs fixed! Thanks!

@Xunzhuo
Member

Xunzhuo commented Sep 19, 2025

Thanks! /lgtm

@github-actions github-actions bot added the lgtm label Sep 19, 2025
@Xunzhuo Xunzhuo merged commit f90fbb6 into vllm-project:main Sep 19, 2025
9 checks passed
@JaredforReal JaredforReal deleted the eval_doc branch September 19, 2025 02:42
yossiovadia pushed a commit to yossiovadia/semantic-router that referenced this pull request Sep 22, 2025
yossiovadia pushed a commit to yossiovadia/semantic-router that referenced this pull request Oct 8, 2025
Development

Successfully merging this pull request may close these issues.

[v0.1]Docs: Model performance evaluation guide

3 participants