Getting stuck at tool call in evaluation of Qwen3 #7

Can anyone help me with the benchmark?
I am benchmarking the Qwen3-Next-80B model and running into several issues.
One of them is that the evaluation script gets stuck at this point:
```
2025-09-19 13:21:36,783 - evaluation_logger_Attraction-62 - INFO - Test Example Attraction-62
2025-09-19 13:21:36,783 - evaluation_logger_Attraction-62 - INFO - Query: My child is a huge fan of the Harry Potter series. Can you check out what Harry Potter-themed attractions or activities are available in London? If the first one costs more than 100, keep checking the next ones until you find something that's under 100.
2025-09-19 13:21:37,349 - evaluation_logger_Attraction-62 - INFO - Function Calls: 
[
    {
        "name": "Search_Attraction_Location",
        "arguments": {
            "query": "London"
        }
    }
]

2025-09-19 13:21:37,349 - evaluation_logger_Attraction-62 - INFO - Golden Function Call: 
[
    {
        "name": "Search_Attraction_Location",
        "arguments": {
            "query": "Harry Potter, London"
        }
    },
    {
        "name": "Get_Attraction_Details",
        "arguments": {
            "slug": "prdg4urreipy-harry-potters-london-experience-tour"
        }
    }
]
```
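To make the mismatch in the log easier to see, here is a small sketch that diffs the two lists. It is not the benchmark's own comparison logic: `diff_calls` is a hypothetical helper that only does exact matching, and the two lists are copied from the log above.

```python
# Function calls copied from the log above.
predicted = [
    {"name": "Search_Attraction_Location", "arguments": {"query": "London"}},
]
golden = [
    {"name": "Search_Attraction_Location", "arguments": {"query": "Harry Potter, London"}},
    {
        "name": "Get_Attraction_Details",
        "arguments": {"slug": "prdg4urreipy-harry-potters-london-experience-tour"},
    },
]


def diff_calls(predicted, golden):
    """Hypothetical helper: list exact-match differences between two call lists."""
    diffs = []
    for i, gold in enumerate(golden):
        if i >= len(predicted):
            diffs.append(f"call {i}: missing, expected {gold['name']}")
            continue
        pred = predicted[i]
        if pred["name"] != gold["name"]:
            diffs.append(f"call {i}: name {pred['name']!r} != {gold['name']!r}")
            continue
        # Same function name: compare the arguments key by key.
        keys = sorted(set(pred["arguments"]) | set(gold["arguments"]))
        for key in keys:
            p = pred["arguments"].get(key)
            g = gold["arguments"].get(key)
            if p != g:
                diffs.append(f"call {i}: argument {key!r} is {p!r}, expected {g!r}")
    # Anything the model emitted beyond the golden list.
    for i in range(len(golden), len(predicted)):
        diffs.append(f"call {i}: unexpected extra call {predicted[i]['name']}")
    return diffs


for line in diff_calls(predicted, golden):
    print(line)
```

For this example the model searched for "London" only, and the second golden call (`Get_Attraction_Details`) has no counterpart in the model output.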

I have also changed the Qwen model and runner to take `vllm_url` as an input, since that option was not there before. Nothing else is changed.
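For context, the change is roughly of the following shape. This is a simplified sketch and not the actual diff: `parse_args`, `build_chat_request`, and `send_chat_request` are hypothetical names, the tool description is made up for illustration, and it uses only the standard library against the OpenAI-compatible `/chat/completions` endpoint that vLLM serves.

```python
import argparse
import json
import urllib.request


def parse_args(argv=None):
    """Parse the same flags that are passed to evaluation.py in this issue."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", default="qwen3-next")
    parser.add_argument("--vllm_url", default="http://0.0.0.0:8000/v1")
    parser.add_argument("--proc_num", type=int, default=1)
    return parser.parse_args(argv)


def build_chat_request(vllm_url, model_name, messages, tools):
    """Build (but do not send) a tool-calling chat completion request."""
    payload = {
        "model": model_name,
        "messages": messages,
        "tools": tools,
        "tool_choice": "auto",
    }
    return urllib.request.Request(
        vllm_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout_s=60):
    """Send the request; the timeout keeps a stuck server from blocking forever."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)["choices"][0]["message"]


# Illustrative tool schema; only the name and the "query" argument come from the log.
EXAMPLE_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "Search_Attraction_Location",
            "description": "Hypothetical description: search attraction locations.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }
]
```

With the server running, `send_chat_request(build_chat_request(args.vllm_url, args.model_name, messages, EXAMPLE_TOOLS))` would return the assistant message, including any `tool_calls` the server parsed.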

### vLLM Server Command:
```
vllm serve Qwen/Qwen3-Next-80B-A3B-Instruct --host "0.0.0.0" --port "8000" --uvicorn-log-level warning --served-model-name qwen3-next --trust-remote-code --gpu-memory-utilization "0.9" --enable-prefix-caching --max-model-len "131072" --enable-auto-tool-choice --tool-call-parser hermes --tensor-parallel-size "4" --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}'
```

### Evaluation Script Command:
```
python evaluation.py --model_name=qwen3-next --vllm_url=http://0.0.0.0:8000/v1 --proc_num=1
```
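To tell whether a run is actually hanging rather than just slow, one option is to wrap a single step in a hard timeout. A minimal sketch, assuming the step can be called as a plain Python function; `call_with_timeout` and `fake_step` are hypothetical names and not part of the evaluation script:

```python
import concurrent.futures
import time


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn in a worker thread and give up waiting after timeout_s seconds.

    Returns (True, result) on success and (False, None) on timeout.
    The worker thread is not killed on timeout; it keeps running in the
    background, so this is a diagnostic aid rather than a real fix.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return True, future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return False, None
    finally:
        # Do not block here waiting for a stuck worker.
        executor.shutdown(wait=False)


def fake_step(delay_s):
    """Stand-in for one evaluation step (hypothetical)."""
    time.sleep(delay_s)
    return "done"


finished, result = call_with_timeout(fake_step, 0.05, 0.3)
print("finished:", finished, "result:", result)
```

If a real step times out like this, logging the last request that was sent makes it easier to see whether the script is waiting on the server or looping on its own.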
