
Conversation

momo609 (Collaborator) commented Sep 4, 2025

What this PR does / why we need it?

In reinforcement learning scenarios, weights must be updated at runtime, but the current inference path applies a transpose to the weights after loading, altering their shape. The transposed shape no longer matches the training weights, so the weight update fails with an error.
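To make the failure mode concrete, here is a minimal sketch in plain PyTorch (hypothetical shapes, not the actual vllm-ascend code) of how a post-load transpose breaks a later weight sync from training:

```python
import torch

# Hypothetical fused-MoE w2 weight: (num_experts, intermediate, hidden).
train_w2 = torch.randn(4, 16, 8)

# Inference transposes the last two dims after loading, so its copy of the
# parameter ends up with shape (4, 8, 16).
infer_w2 = torch.nn.Parameter(
    train_w2.clone().transpose(1, 2).contiguous(), requires_grad=False)

try:
    # An RL weight update that copies training weights into the inference
    # parameter now fails: a (4, 16, 8) source cannot fill a (4, 8, 16) target.
    infer_w2.data.copy_(train_w2)
except RuntimeError as err:
    print(f"weight update failed: {err}")
```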

Does this PR introduce any user-facing change?

How was this patch tested?


github-actions bot commented Sep 4, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request introduces a check to conditionally transpose weights for MoE layers. My review focuses on improving code correctness and maintainability. I've identified a critical bug that would cause a runtime error due to incorrect attribute access, and also pointed out opportunities to remove dead code and reduce duplication for better long-term maintenance.

w2_data = self._maybe_pad_weight(layer.w2_weight.data).transpose(
    1, 2).contiguous()
layer.w2_weight = torch.nn.Parameter(w2_data, requires_grad=False)
if self.transpose:

critical

The self in process_weights_after_loading refers to an instance of UnquantizedFusedMoEMethod, which does not have a transpose attribute. The transpose attribute is defined on the AscendFusedMoE layer instance, which is passed as the layer argument to this method. Accessing self.transpose will raise an AttributeError at runtime. You should access this attribute via the layer object instead.

Suggested change:
- if self.transpose:
+ if layer.transpose:
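A stripped-down sketch (simplified, hypothetical class bodies) of why the lookup must go through `layer`:

```python
class AscendFusedMoE:
    """Stand-in for the layer, which owns the flag."""

    def __init__(self):
        self.transpose = True  # set while loading weights


class UnquantizedFusedMoEMethod:
    """Stand-in for the quant method, which has no such attribute."""

    def process_weights_after_loading(self, layer):
        # Correct: read the flag from the layer that was passed in.
        if layer.transpose:
            print("transposing w2 weights")
        # `if self.transpose:` would raise AttributeError here.


UnquantizedFusedMoEMethod().process_weights_after_loading(AscendFusedMoE())
```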


original_unquantized_fused_moe_init_func = UnquantizedFusedMoEMethod.__init__

is_transpose = True

high

The module-level variable is_transpose is introduced but never used within the file. This appears to be dead code and could cause confusion for future maintenance. If it's not needed, it should be removed.

Comment on lines 488 to 576
if expert_data.shape[1] != loaded_weight.shape[1]:
    loaded_weight = loaded_weight.transpose(1, 2).contiguous()
    if not is_310p():
        loaded_weight = torch_npu.npu_format_cast(
            loaded_weight, ACL_FORMAT_FRACTAL_NZ)
    self.transpose = False

high

This block of code for checking and transposing weights is nearly identical to the one in the _load_w13 method (lines 455-460). This duplication can make the code harder to maintain, as any changes would need to be applied in both places. To improve code quality and avoid potential bugs from inconsistent updates, consider extracting this logic into a shared private helper method.
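One possible shape for that helper, as a sketch only: the name `_maybe_transpose_weight` is hypothetical, and `is_310p`, `torch_npu`, and `ACL_FORMAT_FRACTAL_NZ` are assumed to be in scope as in the surrounding file.

```python
def _maybe_transpose_weight(self, expert_data, loaded_weight):
    # Shared by _load_w13 and _load_w2: transpose only when the loaded
    # weight's layout disagrees with the expected expert slice, and cast
    # to the FRACTAL_NZ format on non-310P devices.
    if expert_data.shape[1] != loaded_weight.shape[1]:
        loaded_weight = loaded_weight.transpose(1, 2).contiguous()
        if not is_310p():
            loaded_weight = torch_npu.npu_format_cast(
                loaded_weight, ACL_FORMAT_FRACTAL_NZ)
        self.transpose = False
    return loaded_weight
```

Each call site then reduces to `loaded_weight = self._maybe_transpose_weight(expert_data, loaded_weight)`.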


github-actions bot commented Sep 4, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

momo609 force-pushed the weightloader branch 3 times, most recently from ab0fca2 to 76a7aa5 on September 8, 2025 at 08:28
momo609 force-pushed the weightloader branch 5 times, most recently from 0ba2169 to ad1aa5a on September 9, 2025 at 06:54
Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels Sep 9, 2025
wangxiyuan merged commit 93e28e6 into vllm-project:main on Sep 9, 2025 (45 of 48 checks passed)
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Sep 10, 2025
### What this PR does / why we need it?
In reinforcement learning scenarios, weight updates are required, but
the current inference applies a transpose operation to the weights,
altering their shape. This causes a shape mismatch with the training
weights, triggering an error during weight updates.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.10.1.1
- vLLM main:
vllm-project/vllm@6fb2788

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
offline893 pushed a commit to offline893/vllm-ascend that referenced this pull request Sep 16, 2025
wangxiaoteng888 pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Sep 25, 2025
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Sep 26, 2025
Labels: module:ops, module:tests, ready (read for review), ready-for-test (start test by label for PR)