Can anyone help me with this benchmark?
I am running a benchmark of the Qwen3-Next-80B model and facing many issues.
One issue is that the evaluation script gets stuck here:
2025-09-19 13:21:36,783 - evaluation_logger_Attraction-62 - INFO - Test Example Attraction-62
2025-09-19 13:21:36,783 - evaluation_logger_Attraction-62 - INFO - Query: My child is a huge fan of the Harry Potter series. Can you check out what Harry Potter-themed attractions or activities are available in London? If the first one costs more than 100, keep checking the next ones until you find something that's under 100.
2025-09-19 13:21:37,349 - evaluation_logger_Attraction-62 - INFO - Function Calls:
[
{
"name": "Search_Attraction_Location",
"arguments": {
"query": "London"
}
}
]
2025-09-19 13:21:37,349 - evaluation_logger_Attraction-62 - INFO - Golden Function Call:
[
{
"name": "Search_Attraction_Location",
"arguments": {
"query": "Harry Potter, London"
}
},
{
"name": "Get_Attraction_Details",
"arguments": {
"slug": "prdg4urreipy-harry-potters-london-experience-tour"
}
}
]
Also, I have changed the Qwen model and runner to take vllm_url as an input, since that option was missing; nothing else was changed.
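For reference, the change can be sketched as adding a --vllm_url flag that is threaded through to the runner. This is a hypothetical sketch: the flag names mirror the evaluation command below, but the internals of evaluation.py are assumptions.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch of how --vllm_url might be added to evaluation.py;
    # only the flag names are taken from the command shown in this issue.
    p = argparse.ArgumentParser()
    p.add_argument("--model_name", required=True)
    p.add_argument("--vllm_url", default="http://0.0.0.0:8000/v1",
                   help="Base URL of the vLLM OpenAI-compatible server")
    p.add_argument("--proc_num", type=int, default=1)
    return p

# Parse the same arguments as the evaluation command below.
args = build_parser().parse_args(
    ["--model_name=qwen3-next", "--vllm_url=http://0.0.0.0:8000/v1", "--proc_num=1"]
)
print(args.vllm_url)
```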
VLLM Server Command:
vllm serve Qwen/Qwen3-Next-80B-A3B-Instruct --host "0.0.0.0" --port "8000" --uvicorn-log-level warning --served-model-name qwen3-next --trust-remote-code --gpu-memory-utilization "0.9" --enable-prefix-caching --max-model-len "131072" --enable-auto-tool-choice --tool-call-parser hermes --tensor-parallel-size "4" --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}'
Evaluation Script Command:
python evaluation.py --model_name=qwen3-next --vllm_url=http://0.0.0.0:8000/v1 --proc_num=1
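To narrow down whether the hang is in the server or in the evaluation script, one option is to send a single tool-call request directly to the /v1/chat/completions endpoint. The sketch below only builds the request body; the tool schema is an assumption for illustration, with the model name taken from --served-model-name above.

```python
import json

# Hypothetical minimal request body for a manual sanity check of the vLLM
# server; POST it to http://0.0.0.0:8000/v1/chat/completions with curl or
# an HTTP client. The Search_Attraction_Location schema here is assumed.
payload = {
    "model": "qwen3-next",
    "messages": [
        {"role": "user",
         "content": "Find Harry Potter attractions in London"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "Search_Attraction_Location",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }],
}
body = json.dumps(payload)
print(body[:40])
```

If this request also stalls, the problem is on the server side (e.g. the tool-call parser or speculative decoding config) rather than in the evaluation script.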