[VLMs] add helpers to get multimodal encodings #37743

Open: zucchini-nlp wants to merge 4 commits into main
Conversation

zucchini-nlp (Member) commented Apr 24, 2025

What does this PR do?

As per the title, this adds the helpers to all VLMs that were still missing them. The PR is another small part of the vLLM integration for multimodality; a minimal usage sketch follows below.

It would also be nice to update a few of the audio LLMs we have, but qwen2-audio is a mess and can only be updated after its deprecations are cleaned up. Granite Speech is already updated, cool
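For illustration only, here is a minimal sketch of how such an encoding helper is typically called, assuming a LLaVA-style model that exposes a `get_image_features` method; the checkpoint name and keyword arguments are assumptions, not taken from this PR's diff.

```python
# Hypothetical usage sketch; names below are illustrative, not the PR's exact API.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint, for illustration
model = LlavaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.float16)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("cat.png")  # any RGB image
inputs = processor(images=image, text="<image> Describe the picture.", return_tensors="pt")

# The helper returns only the (projected) vision-tower encodings, so an external
# inference engine such as vLLM can compute multimodal embeddings without running
# the full language model forward pass.
with torch.no_grad():
    image_features = model.get_image_features(pixel_values=inputs["pixel_values"])
```

The point of standardizing such a helper across VLMs is that downstream engines can rely on one method name per modality instead of model-specific code paths.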

@github-actions github-actions bot marked this pull request as draft April 24, 2025 10:19

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default, and CI is paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@zucchini-nlp zucchini-nlp marked this pull request as ready for review April 24, 2025 10:19
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
