
Commit 7027cd5

[Test] Update accuracy report
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
1 parent: 693f547

File tree

7 files changed (+29 lines, -4 lines)

tests/e2e/models/configs/DeepSeek-V2-Lite.yaml

Lines changed: 2 additions & 0 deletions

@@ -1,4 +1,6 @@
 model_name: "deepseek-ai/DeepSeek-V2-Lite"
+runner: "linux-aarch64-a2-2"
+hardware: "Atlas A2 Series"
 tasks:
   - name: "gsm8k"
     metrics:

tests/e2e/models/configs/Qwen2.5-VL-7B-Instruct.yaml

Lines changed: 2 additions & 0 deletions

@@ -1,4 +1,6 @@
 model_name: "Qwen/Qwen2.5-VL-7B-Instruct"
+runner: "linux-aarch64-a2-1"
+hardware: "Atlas A2 Series"
 model: "vllm-vlm"
 tasks:
   - name: "mmmu_val"

tests/e2e/models/configs/Qwen3-30B-A3B.yaml

Lines changed: 2 additions & 0 deletions

@@ -1,4 +1,6 @@
 model_name: "Qwen/Qwen3-30B-A3B"
+runner: "linux-aarch64-a2-2"
+hardware: "Atlas A2 Series"
 tasks:
   - name: "gsm8k"
     metrics:

tests/e2e/models/configs/Qwen3-8B-Base.yaml

Lines changed: 2 additions & 0 deletions

@@ -1,4 +1,6 @@
 model_name: "Qwen/Qwen3-8B-Base"
+runner: "linux-aarch64-a2-1"
+hardware: "Atlas A2 Series"
 tasks:
   - name: "gsm8k"
     metrics:

(file name not shown)

Lines changed: 1 addition & 0 deletions

@@ -1,3 +1,4 @@
+DeepSeek-V2-Lite.yaml
 Qwen3-8B-Base.yaml
 Qwen2.5-VL-7B-Instruct.yaml
 Qwen3-30B-A3B.yaml
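
The new `runner` and `hardware` keys are plain top-level fields in each model config. A minimal sketch (not part of the commit), assuming PyYAML and the repository-relative path above, of how such a config could be loaded and the new keys read with a defensive default, as the report generator does for `hardware`:

```python
# Hypothetical standalone sketch; path and fallback mirror this commit's files.
import yaml  # PyYAML

with open("tests/e2e/models/configs/DeepSeek-V2-Lite.yaml") as f:
    eval_config = yaml.safe_load(f)

# New top-level keys added by this commit, read with a fallback.
runner = eval_config.get("runner", "unknown")      # e.g. "linux-aarch64-a2-2"
hardware = eval_config.get("hardware", "unknown")  # e.g. "Atlas A2 Series"

print(f"{eval_config['model_name']}: runner={runner}, hardware={hardware}")
```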

tests/e2e/models/report_template.md

Lines changed: 15 additions & 3 deletions

@@ -2,16 +2,28 @@

 - **vLLM Version**: vLLM: {{ vllm_version }} ([{{ vllm_commit[:7] }}](https://github.com/vllm-project/vllm/commit/{{ vllm_commit }})), **vLLM Ascend Version**: {{ vllm_ascend_version }} ([{{ vllm_ascend_commit[:7] }}](https://github.com/vllm-project/vllm-ascend/commit/{{ vllm_ascend_commit }}))
 - **Software Environment**: **CANN**: {{ cann_version }}, **PyTorch**: {{ torch_version }}, **torch-npu**: {{ torch_npu_version }}
-- **Hardware Environment**: Atlas A2 Series
+- **Hardware Environment**: {{ hardware }}
 - **Parallel mode**: {{ parallel_mode }}
-- **Execution mode**: ACLGraph
+- **Execution mode**: {{ execution_model }}

 **Command**:

 ```bash
 export MODEL_ARGS={{ model_args }}
 lm_eval --model {{ model_type }} --model_args $MODEL_ARGS --tasks {{ datasets }} \
-{% if apply_chat_template %} --apply_chat_template {{ apply_chat_template }} {% endif %} {% if fewshot_as_multiturn %} --fewshot_as_multiturn {{ fewshot_as_multiturn }} {% endif %} {% if num_fewshot is defined and num_fewshot != "N/A" %} --num_fewshot {{ num_fewshot }} {% endif %} {% if limit is defined and limit != "N/A" %} --limit {{ limit }} {% endif %} --batch_size {{ batch_size}}
+{% if apply_chat_template is defined and (apply_chat_template|string|lower in ["true", "1"]) -%}
+--apply_chat_template \
+{%- endif %}
+{% if fewshot_as_multiturn is defined and (fewshot_as_multiturn|string|lower in ["true", "1"]) -%}
+--fewshot_as_multiturn \
+{%- endif %}
+{% if num_fewshot is defined and num_fewshot != "N/A" -%}
+--num_fewshot {{ num_fewshot }} \
+{%- endif %}
+{% if limit is defined and limit != "N/A" -%}
+--limit {{ limit }} \
+{%- endif %}
+--batch_size {{ batch_size }}
 ```

 | Task | Metric | Value | Stderr |
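
The reworked command block treats `apply_chat_template` and `fewshot_as_multiturn` as on/off switches instead of value-taking options, and only prints `--num_fewshot` / `--limit` when the config provides them. A minimal sketch of that behaviour, assuming `jinja2` is installed and using a trimmed-down excerpt rather than the real template file:

```python
# Sketch of the new flag handling with a reduced version of the template.
from jinja2 import Template

snippet = Template("""\
lm_eval --model {{ model_type }} --tasks {{ datasets }} \\
{% if apply_chat_template is defined and (apply_chat_template|string|lower in ["true", "1"]) -%}
--apply_chat_template \\
{%- endif %}
{% if num_fewshot is defined and num_fewshot != "N/A" -%}
--num_fewshot {{ num_fewshot }} \\
{%- endif %}
--batch_size {{ batch_size }}
""")

# Truthy values emit the bare switch; the old one-line template appended the raw
# value as well, e.g. "--apply_chat_template True".
print(snippet.render(model_type="vllm", datasets="gsm8k",
                     apply_chat_template=True, num_fewshot=5, batch_size="auto"))

# Undefined or "N/A" values omit the corresponding flags.
print(snippet.render(model_type="vllm", datasets="gsm8k",
                     num_fewshot="N/A", batch_size="auto"))
```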

tests/e2e/models/test_lm_eval_correctness.py

Lines changed: 5 additions & 1 deletion

@@ -69,6 +69,8 @@ def generate_report(tp_size, eval_config, report_data, report_dir, env_config):
     if model_args.get('enable_expert_parallel', False):
         parallel_mode += " + EP"

+    execution_model = f"{'Eager' if model_args.get('enforce_eager', False) else 'ACLGraph'}"
+
     report_content = template.render(
         vllm_version=env_config.vllm_version,
         vllm_commit=env_config.vllm_commit,
@@ -77,6 +79,7 @@ def generate_report(tp_size, eval_config, report_data, report_dir, env_config):
         cann_version=env_config.cann_version,
         torch_version=env_config.torch_version,
         torch_npu_version=env_config.torch_npu_version,
+        hardware=eval_config.get("hardware", "unknown"),
         model_name=eval_config["model_name"],
         model_args=f"'{','.join(f'{k}={v}' for k, v in model_args.items())}'",
         model_type=eval_config.get("model", "vllm"),
@@ -87,7 +90,8 @@ def generate_report(tp_size, eval_config, report_data, report_dir, env_config):
         batch_size=eval_config.get("batch_size", "auto"),
         num_fewshot=eval_config.get("num_fewshot", "N/A"),
         rows=report_data["rows"],
-        parallel_mode=parallel_mode)
+        parallel_mode=parallel_mode,
+        execution_model=execution_model)

     report_output = os.path.join(
         report_dir, f"{os.path.basename(eval_config['model_name'])}.md")
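
For illustration, a self-contained sketch of how the two new template inputs are derived; `eval_config` and `model_args` here are hypothetical stand-ins for the values the test reads from the model YAML and `MODEL_ARGS`:

```python
# Standalone illustration with hypothetical config values.
eval_config = {"model_name": "Qwen/Qwen3-8B-Base", "hardware": "Atlas A2 Series"}
model_args = {"tensor_parallel_size": 2, "enforce_eager": False}

# Mirrors the commit: report "Eager" when enforce_eager is set, "ACLGraph" otherwise.
execution_model = "Eager" if model_args.get("enforce_eager", False) else "ACLGraph"

# Mirrors the commit: hardware now comes from the per-model config, with a fallback.
hardware = eval_config.get("hardware", "unknown")

print(execution_model, hardware)  # -> ACLGraph Atlas A2 Series
```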
