Conversation

zouyida2052
Contributor

@zouyida2052 zouyida2052 commented Apr 28, 2025

What this PR does / why we need it?

Optimize qwen2_vl and qwen2_5_vl.

Does this PR introduce any user-facing change?

no

How was this patch tested?

Tested this PR on a 1080p image with tp=1, bs=1 on Qwen2-VL and Qwen2.5-VL. The duration of each FA op dropped from about 11 ms to about 9 ms, a roughly 22% perf boost.
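
For reference, the roughly 22% figure matches reading the 11 ms to 9 ms change as a speedup ratio (old time over new time), not as a reduction in per-op time, which would come to about 18%. A minimal sketch of that arithmetic, using the rounded timings quoted above; the helper names are illustrative and not part of this PR:

```python
def speedup_percent(before_ms: float, after_ms: float) -> float:
    """Throughput gain: how much more work fits in the same wall time."""
    return (before_ms / after_ms - 1.0) * 100.0


def time_reduction_percent(before_ms: float, after_ms: float) -> float:
    """Latency reduction: how much shorter a single op becomes."""
    return (before_ms - after_ms) / before_ms * 100.0


# Rounded per-op timings quoted in the test description above.
before_ms, after_ms = 11.0, 9.0

print(f"speedup:        {speedup_percent(before_ms, after_ms):.1f}%")         # 22.2%
print(f"time reduction: {time_reduction_percent(before_ms, after_ms):.1f}%")  # 18.2%
```

Both numbers describe the same measurement; the description uses the speedup form.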

Signed-off-by: zouyida2052 <zouyida@huawei.com>
Signed-off-by: zouyida2052 <zouyida2002@gmail.com>
@wangxiyuan
Collaborator

We should add Qwen2-VL to the e2e tests as well.

@wangxiyuan wangxiyuan merged commit ba9714c into vllm-project:main Apr 30, 2025
14 checks passed