docs: Model Performance Evaluation Guide #136

JaredforReal · 2025-09-15T11:08:55Z

What type of PR is this?
docs: Model Performance Evaluation Guide

What this PR does / why we need it:

Evaluation makes routing data-driven. By measuring per-category accuracy on MMLU-Pro (and doing a quick sanity check with ARC), you can:

- Select the right model for each category and rank them into `categories.model_scores`
- Pick a sensible `default_model` based on overall performance
- Decide when CoT prompting is worth the latency/cost tradeoff
- Catch regressions when models, prompts, or parameters change
- Keep changes reproducible and auditable for CI and releases

In short, evaluation converts anecdotes into measurable signals that improve quality, cost efficiency, and reliability of the router.
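As a rough illustration of the first two points in the list above, the sketch below ranks models per category and picks a default from per-category accuracies. Everything in it is an assumption made for illustration: the model names, the accuracy numbers (invented, not measured results), the helper function names, and the choice of an unweighted mean as the "overall performance" metric. It does not reproduce the scripts or config schema shipped with the guide.

```python
from statistics import mean

# Hypothetical per-category accuracies: model -> category -> accuracy in [0, 1].
# These values are invented for illustration; they are not benchmark results.
results = {
    "model-a": {"math": 0.62, "law": 0.48, "biology": 0.71},
    "model-b": {"math": 0.55, "law": 0.60, "biology": 0.69},
}


def rank_models_per_category(results):
    """Return {category: [(model, accuracy), ...]} sorted best-first."""
    categories = sorted({c for scores in results.values() for c in scores})
    ranked = {}
    for category in categories:
        entries = [(m, s[category]) for m, s in results.items() if category in s]
        ranked[category] = sorted(entries, key=lambda e: e[1], reverse=True)
    return ranked


def pick_default_model(results):
    """Pick the model with the highest unweighted mean accuracy (an assumed metric)."""
    return max(results, key=lambda m: mean(results[m].values()))


model_scores = rank_models_per_category(results)
default_model = pick_default_model(results)

for category, entries in model_scores.items():
    print(category, entries)
print("default_model:", default_model)
```

A weighted mean (for example, weighting categories by expected traffic) may be a better fit for a real deployment; the unweighted mean is used here only to keep the sketch short.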

Which issue(s) this PR fixes:

Fixes #53

Release Notes: No

Signed-off-by: JaredforReal <w13431838023@gmail.com>

netlify · 2025-09-15T11:09:00Z

✅ Deploy Preview for vllm-semantic-router ready!

Name	Link
🔨 Latest commit	`32a78e0`
🔍 Latest deploy log	https://app.netlify.com/projects/vllm-semantic-router/deploys/68ccbe10df43d000081ffa2e
😎 Deploy Preview	https://deploy-preview-136--vllm-semantic-router.netlify.app

github-actions · 2025-09-15T11:09:08Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `website`

Owners: @Xunzhuo
Files changed:

website/docs/training/model-performance-eval.md
website/static/img/bar.png
website/static/img/heatmap.png
website/docs/training/training-overview.md
website/sidebars.js

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

JaredforReal · 2025-09-15T11:10:31Z

This still lacks some plots and pictures, so I'm leaving it as a draft. I'd love to receive feedback from the community!

Signed-off-by: JaredforReal <w13431838023@gmail.com>

website/sidebars.js

Signed-off-by: JaredforReal <w13431838023@gmail.com>

JaredforReal · 2025-09-17T08:17:10Z

@Xunzhuo Thanks for your time! This PR is ready for review. I learned a lot making this doc :)

rootfs · 2025-09-18T18:36:49Z

@JaredforReal can you run the lint? Thanks.

Signed-off-by: JaredforReal <w13431838023@gmail.com>

JaredforReal · 2025-09-19T02:28:15Z

@rootfs fixed! Thanks!

Xunzhuo · 2025-09-19T02:31:07Z

Thanks! /lgtm

docs: fix markdownlint

2e78bc2

Signed-off-by: JaredforReal <w13431838023@gmail.com>

github-actions bot assigned Xunzhuo Sep 15, 2025

JaredforReal added 2 commits September 15, 2025 19:16

docs: add Next in training-overview.md

658e294

Signed-off-by: JaredforReal <w13431838023@gmail.com>

fix deploy error

468fb15

Signed-off-by: JaredforReal <w13431838023@gmail.com>

Xunzhuo reviewed Sep 15, 2025

website/sidebars.js

JaredforReal added 2 commits September 17, 2025 15:41

docs: add pngs and examples in doc & add doc to sidebar

f4d7de1

Signed-off-by: JaredforReal <w13431838023@gmail.com>

Merge branch 'main' into eval_doc

299cde3

JaredforReal marked this pull request as ready for review September 17, 2025 07:46

Fix: docs-build error

bf11d7e

Signed-off-by: JaredforReal <w13431838023@gmail.com>

rootfs approved these changes Sep 17, 2025


Merge branch 'main' into eval_doc

3dcfea4

JaredforReal requested a review from Xunzhuo September 18, 2025 13:51

rootfs previously approved these changes Sep 18, 2025


Merge branch 'main' into eval_doc

316caa4

JaredforReal added 2 commits September 19, 2025 10:08

Merge branch 'main' into eval_doc

5416fbf

fix: codespell in model-perf-eval

32a78e0

Signed-off-by: JaredforReal <w13431838023@gmail.com>

JaredforReal dismissed rootfs’s stale review via 32a78e0 September 19, 2025 02:21

Xunzhuo approved these changes Sep 19, 2025


github-actions bot added the lgtm label Sep 19, 2025

Xunzhuo merged commit f90fbb6 into vllm-project:main Sep 19, 2025
9 checks passed

JaredforReal deleted the eval_doc branch September 19, 2025 02:42

yossiovadia pushed a commit to yossiovadia/semantic-router that referenced this pull request Sep 22, 2025

docs: Model Performance Evaluation Guide (vllm-project#136)

1d37497

yossiovadia pushed a commit to yossiovadia/semantic-router that referenced this pull request Oct 8, 2025

docs: Model Performance Evaluation Guide (vllm-project#136)

e00a802
