[Main][Feat] Set the Profiler parameters through environment variables consistent with vLLM #2608
Conversation
Code Review
This pull request aligns the profiler configuration in vllm-ascend with the upstream vllm project by allowing the with_stack and profile_memory parameters to be set via environment variables. The changes are correct and the tests have been updated accordingly. I've added one high-severity suggestion to use getattr for accessing these new environment variables. This will make the implementation more robust and backward-compatible with older versions of vllm that may not have these variables defined, preventing potential AttributeError exceptions at runtime.
vllm_ascend/worker/worker_v1.py (Outdated)
with_stack=envs_vllm.VLLM_TORCH_PROFILER_WITH_STACK,
profile_memory=envs_vllm.VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY,
To improve robustness and ensure backward compatibility, it's better to use getattr with a default value when accessing these environment variables from envs_vllm. This will prevent an AttributeError if vllm-ascend is run with an older version of vllm where VLLM_TORCH_PROFILER_WITH_STACK or VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY is not defined. A default value of False will maintain the previous behavior in such cases.
It would also be beneficial to add a new unit test to verify this fallback behavior, ensuring the profiler initializes with with_stack=False and profile_memory=False when the environment variables are not present; a minimal sketch of such a test appears after the suggested change below.
Suggested change:
- with_stack=envs_vllm.VLLM_TORCH_PROFILER_WITH_STACK,
- profile_memory=envs_vllm.VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY,
+ with_stack=getattr(envs_vllm, "VLLM_TORCH_PROFILER_WITH_STACK", False),
+ profile_memory=getattr(envs_vllm, "VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY", False),
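The following is a minimal sketch of the suggested fallback test. The helper resolve_profiler_flags and the fake envs module are illustrative assumptions, not the repository's actual code; a real test would exercise the profiler initialization in worker_v1.py against a patched vllm.envs.

```python
import types


def resolve_profiler_flags(envs_module):
    # Mirrors the getattr-based lookup suggested above: attributes that
    # are missing (e.g. on an older vllm) fall back to False instead of
    # raising AttributeError.
    return (
        getattr(envs_module, "VLLM_TORCH_PROFILER_WITH_STACK", False),
        getattr(envs_module, "VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY", False),
    )


def test_profiler_flags_default_to_false_when_missing():
    # Simulate an older vllm.envs module that predates these variables.
    old_envs = types.SimpleNamespace()
    with_stack, profile_memory = resolve_profiler_flags(old_envs)
    assert with_stack is False
    assert profile_memory is False
```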
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2608   +/-   ##
=======================================
  Coverage   73.49%   73.50%
=======================================
  Files         151      151
  Lines       21927    21929    +2
=======================================
+ Hits        16116    16118    +2
  Misses       5811     5811
Force-pushed from 5b96065 to 782e846
Signed-off-by: zhanghaiwen <zhanghaiwen@cmss.chinamobile.com>
Force-pushed from 782e846 to 79df9e2
@wangxiyuan @Yikun Please review when you have time. Thank you!
What this PR does / why we need it?
Currently, when profiling in vLLM-Ascend, obtaining the Python call stack requires manually modifying the code at worker_v1.py#L337 to set with_stack to true.
vLLM now lets you control whether the Python call stack is captured through an environment variable; see the relevant PR, #21803, and the profiling documentation.
This PR sets the profiler initialization parameters using the same environment variables as vLLM, eliminating the need for manual code modification.
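As an illustration, here is a minimal sketch of enabling these flags without any code changes. The output directory /tmp/vllm_profile is an arbitrary example path, not a value from this PR:

```python
import os

# Set these before the vLLM engine/worker is created, since vllm.envs
# reads them from the process environment.
os.environ["VLLM_TORCH_PROFILER_DIR"] = "/tmp/vllm_profile"  # enables the torch profiler
os.environ["VLLM_TORCH_PROFILER_WITH_STACK"] = "1"  # capture Python call stacks
os.environ["VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY"] = "1"  # capture memory usage
```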
Does this PR introduce any user-facing change?
No
How was this patch tested?
CI passed with newly added and existing tests.