
Conversation

@rootfs (Collaborator) commented Oct 15, 2025

What type of PR is this?

Fine-tune Qwen3 0.6B for specialization. There are two approaches: using MMLU-Pro as the training dataset (i.e. data leakage) for demo purposes, and using other datasets for training (no leakage).
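
As a rough sketch of what this kind of LoRA specialization looks like (the model id, target modules, and hyperparameters below are illustrative assumptions, not values taken from this PR's scripts):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

# Base model to specialize; the exact id is an assumption for illustration.
model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA adapter configuration; rank/alpha/targets are common defaults, not the PR's.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable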

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #239

Release Notes: Yes/No

Signed-off-by: Huamin Chen <hchen@redhat.com>

netlify bot commented Oct 15, 2025

Deploy Preview for vllm-semantic-router ready!

🔨 Latest commit: 01e9b4b
🔍 Latest deploy log: https://app.netlify.com/projects/vllm-semantic-router/deploys/68f01e7b825c8a0008c5ad5e
😎 Deploy Preview: https://deploy-preview-447--vllm-semantic-router.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.


github-actions bot commented Oct 15, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/training/training_lora/mmlu_pro_solver_lora/ft_qwen3_mmlu_solver_lora.py
  • src/training/training_lora/mmlu_pro_solver_lora/ft_qwen3_mmlu_solver_lora_no_leakage.py
  • src/training/training_lora/mmlu_pro_solver_lora/train_all_specialists.sh
  • src/training/training_lora/mmlu_pro_solver_lora/train_all_specialists_no_leakage.sh
  • src/training/training_lora/README.md
  • src/training/training_lora/classifier_model_fine_tuning_lora/ft_qwen3_generative_lora.py

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • examples/mcp-classifier-server/server_generative.py


🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@Copilot Copilot AI (Contributor) left a comment


Pull Request Overview

This PR adds scripts and implementations to fine-tune Qwen3 0.6B models for specialized knowledge domains using MMLU-Pro benchmarks. The key focus is on training domain-specific expert models (math, science, law, humanities, etc.) with and without data leakage scenarios to demonstrate proper evaluation methodology.
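
As a rough illustration of the two approaches, the no-leakage variant trains on external datasets and reserves MMLU-Pro strictly for evaluation. A minimal sketch (the dataset ids below are assumptions, not read from the PR):

from datasets import load_dataset

# No-leakage setup: train on external data (GSM8K as an example math corpus)...
train_data = load_dataset("gsm8k", "main", split="train")

# ...and evaluate on MMLU-Pro only, so benchmark questions never enter training.
eval_data = load_dataset("TIGER-Lab/MMLU-Pro", split="test")

# The leakage variant instead trains on MMLU-Pro itself, which inflates
# benchmark scores and is intended only as a demo.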

Key changes:

  • Added complete training pipeline for 6 specialized Qwen3 models
  • Implemented both "leakage" (demo) and "no-leakage" (proper) training approaches
  • Included batch training scripts with comprehensive logging and result tracking

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Summary per file:

  • train_all_specialists_no_leakage.sh: Bash script for batch-training the 6 specialists on external datasets (no MMLU-Pro leakage)
  • train_all_specialists.sh: Bash script for batch-training the 6 specialists on MMLU-Pro data (with leakage, for demo)
  • ft_qwen3_mmlu_solver_lora_no_leakage.py: Python training script using external datasets (GSM8K, MATH, ARC, etc.) for proper evaluation
  • ft_qwen3_mmlu_solver_lora.py: Python training script using MMLU-Pro data directly (demo/leakage version)
  • README.md: Updated documentation to include the new MMLU-Pro solver task



TOTAL_MODELS=6        # number of specialist models to train
COMPLETED_MODELS=0    # success counter
FAILED_MODELS=0       # scalar here, but reused as an array later (see review comment below)

Copilot AI Oct 15, 2025


FAILED_MODELS is initialized as a scalar (=0) but is later used both as an array (FAILED_MODELS+=(...)) and as an incremented counter. This will cause incorrect behavior. Use separate variables for the counter and the array.

Suggested change:
- FAILED_MODELS=0
+ FAILED_MODELS_COUNT=0


@rootfs rootfs marked this pull request as draft October 15, 2025 18:48
@rootfs (Collaborator, Author) commented Oct 15, 2025

With the no_leakage training, math accuracy improves after fine-tuning:

train_all_specialists_no_leakage.sh 2 100 5
2025-10-15 20:10:22 - common_lora_utils - INFO - IMPROVEMENT ANALYSIS (No Data Leakage)
2025-10-15 20:10:22 - common_lora_utils - INFO - 📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊📊
2025-10-15 20:10:22 - common_lora_utils - INFO - 
================================================================================
2025-10-15 20:10:22 - common_lora_utils - INFO - OVERALL RESULTS:
2025-10-15 20:10:22 - common_lora_utils - INFO - ================================================================================
2025-10-15 20:10:22 - common_lora_utils - INFO -   Baseline (Untrained):     10.00%
2025-10-15 20:10:22 - common_lora_utils - INFO -   Post-training:            24.00%
2025-10-15 20:10:22 - common_lora_utils - INFO -   Absolute Improvement:     +14.00%
2025-10-15 20:10:22 - common_lora_utils - INFO -   Relative Improvement:     +140.0%
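
The improvement figures follow directly from the two accuracies; a quick check of the arithmetic (plain Python, not code from the PR):

baseline = 0.10  # accuracy before fine-tuning
post = 0.24      # accuracy after no-leakage fine-tuning

absolute = post - baseline      # 0.14 -> +14.00 percentage points
relative = absolute / baseline  # 1.40 -> +140.0% relative improvement
print(f"{absolute:+.2%} absolute, {relative:+.1%} relative")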



Development

Successfully merging this pull request may close: Create Specialized LLM Models for the Lightweight LLM Server for Integration Test (#239)