Skip to content

Conversation

jiangyunfan1
Copy link
Contributor

@jiangyunfan1 jiangyunfan1 commented Oct 15, 2025

What this PR does / why we need it?

This PR adds the first aisbench case for nightly test, it lays a foundation for following performance and accuracy tests in nightly test.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By running the test

jiangyunfan1 and others added 30 commits October 12, 2025 09:24
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Potabk and others added 10 commits October 12, 2025 09:24
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds new nightly tests using aisbench and a corresponding helper script. The overall approach is sound, but the new tools/aisbench.py script contains several critical and high-severity issues. These include potential UnboundLocalError exceptions due to incorrect variable scoping, a resource leak from an unmanaged subprocess, a risk of an infinite loop when monitoring the subprocess, and fragile logic that relies on hardcoded values or specific string formats. Addressing these issues is crucial for making the test runner robust and reliable.

Comment on lines 43 to 55
if self.task_type == "accuracy":
aisbench_cmd = [
'ais_bench', '--models', f'{self.request_conf}_custom',
'--datasets', f'{dataset_conf}', '--debug'
]
if self.task_type == "performance":
aisbench_cmd = [
'ais_bench', '--models', f'{self.request_conf}_custom',
'--datasets', f'{dataset_conf}_custom', '--debug', '--mode',
'perf'
]
if self.num_prompts:
aisbench_cmd.extend(['--num-prompts', str(self.num_prompts)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The aisbench_cmd variable is only defined within the if self.task_type == "accuracy": and if self.task_type == "performance": blocks. If self.task_type has any other value, aisbench_cmd will be unbound when used on line 56, causing an UnboundLocalError. It's better to use an if/elif/else structure to handle all cases, raising an error for unsupported task types.

        if self.task_type == "accuracy":
            aisbench_cmd = [
                'ais_bench', '--models', f'{self.request_conf}_custom',
                '--datasets', f'{dataset_conf}', '--debug'
            ]
        elif self.task_type == "performance":
            aisbench_cmd = [
                'ais_bench', '--models', f'{self.request_conf}_custom',
                '--datasets', f'{dataset_conf}_custom', '--debug', '--mode',
                'perf'
            ]
            if self.num_prompts:
                aisbench_cmd.extend(['--num-prompts', str(self.num_prompts)])
        else:
            raise ValueError(f"Unsupported task_type: {self.task_type}")

Comment on lines +150 to +158
while True:
line = self.proc.stdout.readline().strip()
print(line)
if result_msg in line:
self.result_line = line
return
if "ERROR" in line:
raise RuntimeError(
"Some errors happen to Aisbench task.") from None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The while True loop reading from self.proc.stdout can hang indefinitely if the subprocess closes its standard output without printing either the expected result_msg or an "ERROR" line. readline() will block, waiting for data that will never arrive. The loop should also check if the process has terminated. When an error does occur, including stderr in the exception message would be very helpful for debugging.

        while True:
            line = self.proc.stdout.readline().strip()
            if not line and self.proc.poll() is not None:
                # Process ended without finding the result message
                stderr = self.proc.stderr.read()
                raise RuntimeError(f"Aisbench task finished unexpectedly. Stderr: {stderr}")

            print(line)
            if result_msg in line:
                self.result_line = line
                return
            if "ERROR" in line:
                stderr = self.proc.stderr.read()
                raise RuntimeError(f"Some errors happen to Aisbench task. Stderr: {stderr}")

Comment on lines +163 to +164
result_csv_file = os.path.join(result_dir, "gsm8kdataset.csv")
result_json_file = os.path.join(result_dir, "gsm8kdataset.json")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The result filenames gsm8kdataset.csv and gsm8kdataset.json are hardcoded. This restricts this method to only work with the gsm8k dataset. To make this runner more flexible and reusable for other datasets, the dataset name should be derived dynamically, for instance from self.dataset_conf.

        dataset_name = self.dataset_conf.split('/')[0]
        result_csv_file = os.path.join(result_dir, f"{dataset_name}dataset.csv")
        result_json_file = os.path.join(result_dir, f"{dataset_name}dataset.json")

Copy link

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
@wangxiyuan wangxiyuan added the ready read for review label Oct 15, 2025
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants