
[v0.9.1]add data preprocess functions to qwen2.5_vl_without_padding #1705


Merged: 1 commit merged into vllm-project:v0.9.1-dev on Jul 15, 2025

Conversation

@zheliuyu zheliuyu commented Jul 9, 2025

Bug reported by @as12138.

What this PR does / why we need it?

Compared with qwen2_5_vl.py, qwen2_5_vl_without_padding.py is missing some functions. The purpose of this PR is to supplement them (see the illustrative sketch after the list below).

add:

  • rot_pos_emb(self, grid_thw: torch.Tensor)
  • get_window_index(self, grid_thw)
  • _process_image_input(self, image_input)
  • _process_video_input(self, video_input)

Does this PR introduce any user-facing change?

N/A
Same as qwen2_5_vl.py

How was this patch tested?

N/A
Same as qwen2_5_vl.py

@wangxiyuan wangxiyuan changed the title add qwen2.5_vl_without_padding's missing functions [0.9.1]add qwen2.5_vl_without_padding's missing functions Jul 10, 2025
@zheliuyu zheliuyu force-pushed the v0.9.1-dev branch 3 times, most recently from 31d0981 to 812ec8c Compare July 10, 2025 03:11
@zheliuyu zheliuyu marked this pull request as ready for review July 10, 2025 06:03
@zheliuyu zheliuyu changed the title [0.9.1]add qwen2.5_vl_without_padding's missing functions [v0.9.1]add rot_pos_emb()/get_window_index()/_process_image_input() to qwen2.5_vl_without_padding Jul 10, 2025

zheliuyu commented Jul 10, 2025

Ready for review. @wangxiyuan @leo-pony @as12138

Review thread on these lines of the diff:

    q = torch_npu.npu_rotary_mul(q, cos, sin)
    k = torch_npu.npu_rotary_mul(k, cos, sin)

    q = torch_npu.npu_rotary_mul(q, cos.to(q.device), sin.to(q.device))
Collaborator:

Why not move this Tensor.to(device) outside of the model run?

Author (@zheliuyu):
To fix RuntimeError:

    Expected all tensors to be on the same device, but found at least two devices, npu:0 and cpu! (when checking argument for argument r1 in method wrapper__npu_rotary_mul)
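For readers hitting the same error, a minimal illustration of the mismatch and the fix applied in this PR (assumes an Ascend NPU with torch_npu installed; the tensor shapes here are made up and not taken from the model):

```python
import torch
import torch_npu  # Ascend-only dependency

# q already lives on the NPU, while the rotary cos/sin table was built on the CPU.
q = torch.randn(1, 16, 8, 64, device="npu:0")
cos = torch.randn(1, 16, 1, 64)  # CPU tensor, e.g. derived from rot_pos_emb()
sin = torch.randn(1, 16, 1, 64)

# torch_npu.npu_rotary_mul(q, cos, sin)  # -> RuntimeError: devices npu:0 and cpu
q = torch_npu.npu_rotary_mul(q, cos.to(q.device), sin.to(q.device))
```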

Collaborator (@ganyi1996ppo) commented Jul 14, 2025:
OK, but since you already customize the modeling, can you make the cos/sin cache an NPU tensor to prevent the H2D copy layer by layer?

Author (@zheliuyu), replying to the above:
Thanks for your advice. To keep this PR concise, it only supplements the missing functions in qwen2_5_vl_without_padding.py. A follow-up PR for the Tensor.to(device) change will be submitted once the experimental results have been reproduced.
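A hypothetical sketch of that deferred follow-up, i.e. the reviewer's suggestion to keep the cos/sin cache on the NPU so the per-layer host-to-device copy disappears (the function name and broadcast layout below are illustrative, not the actual vllm-ascend code):

```python
import torch


def build_device_cos_sin(rotary_pos_emb: torch.Tensor,
                         device: torch.device) -> tuple[torch.Tensor, torch.Tensor]:
    """Build cos/sin once, directly on the target device, before the layer loop.

    rotary_pos_emb: (seq_len, rotary_dim // 2) frequency table, e.g. the output
    of rot_pos_emb(grid_thw).
    """
    emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1).to(device)
    # Add broadcast dimensions so the same tensors can be reused by every block.
    cos = emb.cos().unsqueeze(0).unsqueeze(2)
    sin = emb.sin().unsqueeze(0).unsqueeze(2)
    return cos, sin


# Each vision block then reuses the already-on-device tensors, e.g.:
#   cos, sin = build_device_cos_sin(rotary_pos_emb, q.device)
#   q = torch_npu.npu_rotary_mul(q, cos, sin)  # no .to(...) needed per layer
```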

Collaborator (@ganyi1996ppo):

I'm wondering how this PR has been tested. Did you just reuse the unit tests of qwenvl?

Author (@zheliuyu), replying to the above:

Yes, we reuse the unit tests of qwenvl.

Signed-off-by: zheliuyu <15750543867@163.com>
@zheliuyu zheliuyu changed the title [v0.9.1]add rot_pos_emb()/get_window_index()/_process_image_input() to qwen2.5_vl_without_padding [v0.9.1]add data preprocess functions to qwen2.5_vl_without_padding Jul 15, 2025
@ganyi1996ppo ganyi1996ppo merged commit a2a6377 into vllm-project:v0.9.1-dev Jul 15, 2025
16 checks passed
@Yikun Yikun added the no-test label Jul 16, 2025
Potabk added a commit to Potabk/vllm-ascend that referenced this pull request Jul 21, 2025
Signed-off-by: wangli <wangli858794774@gmail.com>
Potabk added a commit to Potabk/vllm-ascend that referenced this pull request Jul 28, 2025
Signed-off-by: wangli <wangli858794774@gmail.com>
wangxiyuan pushed a commit that referenced this pull request Aug 1, 2025
…2148)

### What this PR does / why we need it?
Cherry pick #1705 from v0.9.1-dev
Compared with qwen2_5_vl.py, qwen2_5_vl_without_padding.py is missing some
functions. The purpose of this PR is to supplement these.

add:
- rot_pos_emb(self, grid_thw: torch.Tensor)
- get_window_index(self, grid_thw)
- _process_image_input(self, image_input)
- _process_video_input(self, video_input)

Co-authored-by: zheliuyu <15750543867@163.com>
Co-authored-by: wangli <wangli858794774@gmail.com>

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@207b750

Signed-off-by: wangli <wangli858794774@gmail.com>