Skip to content

Conversation

@wangx700
Copy link
Contributor

@wangx700 wangx700 commented Nov 6, 2025

What this PR does / why we need it?

enable sleepmode level2 e2e test and add the check logic to ensure the nz is not enabled.

Does this PR introduce any user-facing change?

no

How was this patch tested?

use e2e tests

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a failing E2E test for sleep_mode_level=2 by disabling the VLLM_ASCEND_ENABLE_NZ optimization for that specific test. While this makes the test pass, it appears to be a workaround for an underlying incompatibility between sleep mode level 2's weight reloading mechanism and the FRACTAL_NZ weight format. My review points out this issue and suggests that the root cause should be addressed to ensure full test coverage for this feature combination. I have also recommended adding a code comment to explain the workaround.

assert proc.returncode == 0


@patch.dict(os.environ, {"VLLM_ASCEND_ENABLE_NZ": "0"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Disabling VLLM_ASCEND_ENABLE_NZ makes this test pass, but it appears to be a workaround for a deeper issue. This change effectively prevents testing the sleep_mode_level=2 functionality when the FRACTAL_NZ weight format optimization is enabled, reducing test coverage.

The root cause seems to be that when sleep_mode_level=2 is used, the test manually reloads model weights from disk (see examples/offline_external_launcher.py, line 250). This reloading process does not account for the FRACTAL_NZ format that weights are converted to when VLLM_ASCEND_ENABLE_NZ=1. Consequently, the model receives weights in an unexpected format, leading to failure.

While this fix may be acceptable to unblock CI, the underlying incompatibility should be addressed. Possible long-term solutions include:

  • Updating the weight reloading logic in the test to correctly handle the FRACTAL_NZ format.
  • If sleep_mode_level=2 is fundamentally incompatible with VLLM_ASCEND_ENABLE_NZ, this should be documented and ideally enforced with a runtime check.

As an immediate step, please add a code comment above this decorator to explain why this environment variable is being disabled for this specific test. This will improve code clarity and help future developers understand the context.

@github-actions
Copy link

github-actions bot commented Nov 6, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

@leo-pony
Copy link
Collaborator

leo-pony commented Nov 6, 2025

It look goods to me.
This is a test case for the RL scenario. In the RL scenario, NZ is not used, so there is no issue with the code. Here, we just need to ensure that the RL scenario is properly handled.

@leo-pony leo-pony added ready read for review ready-for-test start test by label for PR and removed ready read for review ready-for-test start test by label for PR labels Nov 6, 2025
@wangxiyuan wangxiyuan added ready read for review ready-for-test start test by label for PR labels Nov 6, 2025
@wangx700 wangx700 force-pushed the fix_e2e branch 4 times, most recently from cd244da to 46f1b64 Compare November 7, 2025 02:38
Signed-off-by: wangx700 <wangxin700@huawei.com>
@weijinqian0 weijinqian0 merged commit 24d6314 into vllm-project:main Nov 8, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:tests ready read for review ready-for-test start test by label for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants