add weight transpose check. #2756
Conversation
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Code Review
This pull request introduces a check to conditionally transpose weights for MoE layers. My review focuses on improving code correctness and maintainability. I've identified a critical bug that would cause a runtime error due to incorrect attribute access, and also pointed out opportunities to remove dead code and reduce duplication for better long-term maintenance.
w2_data = self._maybe_pad_weight(layer.w2_weight.data).transpose(
    1, 2).contiguous()
layer.w2_weight = torch.nn.Parameter(w2_data, requires_grad=False)
if self.transpose:
The `self` in `process_weights_after_loading` refers to an instance of `UnquantizedFusedMoEMethod`, which does not have a `transpose` attribute. The `transpose` attribute is defined on the `AscendFusedMoE` layer instance, which is passed as the `layer` argument to this method. Accessing `self.transpose` will therefore raise an `AttributeError` at runtime. Access the attribute via the `layer` object instead.
Suggested change:
- if self.transpose:
+ if layer.transpose:
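A minimal runnable sketch of why the fix is needed (the class bodies are illustrative stand-ins for the real vllm/vllm-ascend classes, reduced to just the attribute access in question):

```python
import torch


class AscendFusedMoE(torch.nn.Module):
    """The layer object: per the review, `transpose` is defined here."""

    def __init__(self):
        super().__init__()
        self.transpose = True
        self.w2_weight = torch.nn.Parameter(
            torch.randn(4, 8, 16), requires_grad=False)


class UnquantizedFusedMoEMethod:
    """The quant method: `self` here has no `transpose` attribute."""

    def process_weights_after_loading(self, layer: torch.nn.Module) -> None:
        # Correct: read the flag from the `layer` argument.
        # `self.transpose` would raise AttributeError, because the flag
        # lives on the AscendFusedMoE layer, not on this method object.
        if layer.transpose:
            w2 = layer.w2_weight.data.transpose(1, 2).contiguous()
            layer.w2_weight = torch.nn.Parameter(w2, requires_grad=False)


# The method receives the layer; all layer state goes through that argument.
UnquantizedFusedMoEMethod().process_weights_after_loading(AscendFusedMoE())
```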
vllm_ascend/ops/common_fused_moe.py (outdated)

original_unquantized_fused_moe_init_func = UnquantizedFusedMoEMethod.__init__
is_transpose = True
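The saved `original_unquantized_fused_moe_init_func` reference follows the usual wrap-and-delegate monkey-patch pattern. A hedged sketch of that pattern (the stub class and wrapper body are illustrative, not the PR's actual code):

```python
class UnquantizedFusedMoEMethod:
    """Stub standing in for vllm's class so the sketch is self-contained."""

    def __init__(self):
        self.initialized = True


# Keep a handle to the original constructor (as in the quoted line) so the
# replacement can delegate to it.
original_unquantized_fused_moe_init_func = UnquantizedFusedMoEMethod.__init__


def unquantized_fused_moe_init_func(self, *args, **kwargs):
    # Run the original constructor first, then apply Ascend-specific setup.
    original_unquantized_fused_moe_init_func(self, *args, **kwargs)
    # ... extra Ascend-specific initialization would go here ...


# Monkey-patch so every new instance runs both original and patched logic.
UnquantizedFusedMoEMethod.__init__ = unquantized_fused_moe_init_func
method = UnquantizedFusedMoEMethod()
```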
vllm_ascend/ops/common_fused_moe.py (outdated)

if expert_data.shape[1] != loaded_weight.shape[1]:
    loaded_weight = loaded_weight.transpose(1, 2).contiguous()
    if not is_310p():
        loaded_weight = torch_npu.npu_format_cast(
            loaded_weight, ACL_FORMAT_FRACTAL_NZ)
    self.transpose = False
This block of code for checking and transposing weights is nearly identical to the one in the `_load_w13` method (lines 455-460). The duplication makes the code harder to maintain, since any change must be applied in both places. To improve code quality and avoid bugs from inconsistent updates, consider extracting this logic into a shared private helper method, as in the sketch below.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Force-pushed from 2c04f20 to f6ed135
Force-pushed from ab0fca2 to 76a7aa5
Force-pushed from 0ba2169 to ad1aa5a
Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Force-pushed from 29c69df to c293c37
What this PR does / why we need it?
In reinforcement learning scenarios, weight updates are required, but the current inference path applies a transpose operation to the weights, altering their shape. This causes a shape mismatch with the training weights and triggers an error during weight updates.
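A minimal sketch of the failure mode, with made-up shapes:

```python
import torch

# Illustrative shapes: (num_experts, hidden_size, intermediate_size).
training_weight = torch.randn(8, 1024, 4096)

# Previously the load path always transposed, altering the stored shape.
stored = training_weight.transpose(1, 2).contiguous()  # (8, 4096, 1024)

# An RL weight update copying training weights back in place then fails:
try:
    stored.copy_(training_weight)
except RuntimeError as err:
    print(err)  # shape mismatch -- the error this PR's check prevents
```

With the PR's check, the transpose runs only when the shapes actually differ, so weights kept in the training layout remain compatible with in-place updates.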
Does this PR introduce any user-facing change?
How was this patch tested?
- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@6fb2788

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>