-
Notifications
You must be signed in to change notification settings - Fork 485
Add aisbench nightly test cases #3474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds new nightly tests using aisbench
and a corresponding helper script. The overall approach is sound, but the new tools/aisbench.py
script contains several critical and high-severity issues. These include potential UnboundLocalError
exceptions due to incorrect variable scoping, a resource leak from an unmanaged subprocess, a risk of an infinite loop when monitoring the subprocess, and fragile logic that relies on hardcoded values or specific string formats. Addressing these issues is crucial for making the test runner robust and reliable.
if self.task_type == "accuracy": | ||
aisbench_cmd = [ | ||
'ais_bench', '--models', f'{self.request_conf}_custom', | ||
'--datasets', f'{dataset_conf}', '--debug' | ||
] | ||
if self.task_type == "performance": | ||
aisbench_cmd = [ | ||
'ais_bench', '--models', f'{self.request_conf}_custom', | ||
'--datasets', f'{dataset_conf}_custom', '--debug', '--mode', | ||
'perf' | ||
] | ||
if self.num_prompts: | ||
aisbench_cmd.extend(['--num-prompts', str(self.num_prompts)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The aisbench_cmd
variable is only defined within the if self.task_type == "accuracy":
and if self.task_type == "performance":
blocks. If self.task_type
has any other value, aisbench_cmd
will be unbound when used on line 56, causing an UnboundLocalError
. It's better to use an if/elif/else
structure to handle all cases, raising an error for unsupported task types.
if self.task_type == "accuracy":
aisbench_cmd = [
'ais_bench', '--models', f'{self.request_conf}_custom',
'--datasets', f'{dataset_conf}', '--debug'
]
elif self.task_type == "performance":
aisbench_cmd = [
'ais_bench', '--models', f'{self.request_conf}_custom',
'--datasets', f'{dataset_conf}_custom', '--debug', '--mode',
'perf'
]
if self.num_prompts:
aisbench_cmd.extend(['--num-prompts', str(self.num_prompts)])
else:
raise ValueError(f"Unsupported task_type: {self.task_type}")
while True: | ||
line = self.proc.stdout.readline().strip() | ||
print(line) | ||
if result_msg in line: | ||
self.result_line = line | ||
return | ||
if "ERROR" in line: | ||
raise RuntimeError( | ||
"Some errors happen to Aisbench task.") from None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The while True
loop reading from self.proc.stdout
can hang indefinitely if the subprocess closes its standard output without printing either the expected result_msg
or an "ERROR" line. readline()
will block, waiting for data that will never arrive. The loop should also check if the process has terminated. When an error does occur, including stderr
in the exception message would be very helpful for debugging.
while True:
line = self.proc.stdout.readline().strip()
if not line and self.proc.poll() is not None:
# Process ended without finding the result message
stderr = self.proc.stderr.read()
raise RuntimeError(f"Aisbench task finished unexpectedly. Stderr: {stderr}")
print(line)
if result_msg in line:
self.result_line = line
return
if "ERROR" in line:
stderr = self.proc.stderr.read()
raise RuntimeError(f"Some errors happen to Aisbench task. Stderr: {stderr}")
result_csv_file = os.path.join(result_dir, "gsm8kdataset.csv") | ||
result_json_file = os.path.join(result_dir, "gsm8kdataset.json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result filenames gsm8kdataset.csv
and gsm8kdataset.json
are hardcoded. This restricts this method to only work with the gsm8k
dataset. To make this runner more flexible and reusable for other datasets, the dataset name should be derived dynamically, for instance from self.dataset_conf
.
dataset_name = self.dataset_conf.split('/')[0]
result_csv_file = os.path.join(result_dir, f"{dataset_name}dataset.csv")
result_json_file = os.path.join(result_dir, f"{dataset_name}dataset.json")
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
What this PR does / why we need it?
This PR adds the first aisbench case for nightly test, it lays a foundation for following performance and accuracy tests in nightly test.
Does this PR introduce any user-facing change?
No
How was this patch tested?
By running the test