Commit c7cd550

add intern video2
1 parent 3f773ae commit c7cd550

4 files changed: +1612 -0 lines changed

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Visual-language assistant with InternVL2 and OpenVINO

InternVL 2.0 is the latest addition to the InternVL series of multimodal large language models. It features a variety of instruction-tuned models, ranging from 1 billion to 108 billion parameters. InternVL 2.0 surpasses most state-of-the-art open-source multimodal large language models and demonstrates performance competitive with proprietary commercial models across various capabilities, including document and chart comprehension, infographics QA, scene text understanding and OCR tasks, scientific and mathematical problem solving, as well as cultural understanding and integrated multimodal capabilities.

More details about the model can be found in the [model card](https://huggingface.co/OpenGVLab/InternVL2-4B), the [blog](https://internvl.github.io/blog/2024-07-02-InternVL-2.0/), and the original [repo](https://github.com/OpenGVLab/InternVL).

In this tutorial we consider how to convert and optimize the InternVL2 model to create a multimodal chatbot. Additionally, we demonstrate how to apply a stateful transformation to the LLM part and model optimization techniques such as weight compression using [NNCF](https://github.com/openvinotoolkit/nncf).
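
To make the idea of weight compression concrete, here is a minimal conceptual sketch in pure Python of per-row asymmetric 8-bit quantization: each row of floats is stored as small integers plus a scale and a zero point. It is an illustration under stated assumptions, not NNCF's implementation, and the names `quantize_row` and `dequantize_row` are hypothetical helpers introduced only for this example.

```python
# Conceptual sketch only: per-row asymmetric 8-bit weight quantization.
# This is NOT how NNCF is implemented; all names here are hypothetical.


def quantize_row(row, levels=256):
    """Map floats in `row` to integers in [0, levels - 1] plus (scale, zero_point)."""
    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(row), 0.0)
    hi = max(max(row), 0.0)
    # An all-zero row has hi == lo; fall back to a scale of 1.0 in that case.
    scale = (hi - lo) / (levels - 1) or 1.0
    zero_point = round(-lo / scale)
    q = [min(levels - 1, max(0, round(w / scale) + zero_point)) for w in row]
    return q, scale, zero_point


def dequantize_row(q, scale, zero_point):
    """Recover approximate float weights from the stored integers."""
    return [(v - zero_point) * scale for v in q]


weights = [[0.12, -0.53, 0.98, 0.0], [1.5, 1.7, -2.0, 0.25]]
compressed = [quantize_row(row) for row in weights]
restored = [dequantize_row(*c) for c in compressed]

# Largest absolute round-trip error over all weights.
max_error = max(
    abs(w - r)
    for row, rec in zip(weights, restored)
    for w, r in zip(row, rec)
)
print(f"max round-trip error: {max_error:.5f}")
```

The round-trip error stays within one quantization step per row, which is the trade-off weight compression makes: a smaller model footprint in exchange for a small, bounded loss of precision.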

## Notebook contents
The tutorial consists of the following steps:

- Install requirements
- Convert and optimize the model
- Run OpenVINO model inference
- Launch the interactive demo

In this demonstration, you'll create an interactive chatbot that can answer questions about the content of a provided image.

The image below illustrates an example of an input prompt and the model's answer.
![example.png](https://github.com/user-attachments/assets/6720efe0-ab24-4d73-a22f-a8a0499558d8)

## Installation instructions
This is a self-contained example that relies solely on its own code.<br>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the [Installation Guide](../../README.md).

<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/intern-video2-classiciation/README.md" />
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
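import gradio as gr


def make_demo(classify):
    """Build a Gradio interface around a video classification callback."""
    demo = gr.Interface(
        classify,
        # Inputs: a video and a comma-separated list of candidate labels.
        [
            gr.Video(label="Video"),
            gr.Textbox(label="Labels", info="Comma-separated list of class labels"),
        ],
        # Output: the predicted label(s) shown in a Label component.
        gr.Label(label="Result"),
        # Example row; assumes coco.mp4 is available in the working directory.
        examples=[["coco.mp4", "airplane, dog, car"]],
        allow_flagging="never",
    )

    return demo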
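
The `classify` callback itself is not part of this file. As a rough sketch of the contract it would need to satisfy, assuming it receives the video file path and the raw text of the "Labels" box, the stub below parses the labels and returns a label-to-confidence mapping of the kind `gr.Label` can display. The `parse_labels` helper and the uniform placeholder scores are hypothetical; a real callback would run the video model instead.

```python
# Hypothetical stand-in for the `classify` callback passed to make_demo.
# It only illustrates the input/output contract; no model is involved.


def parse_labels(labels_text):
    """Split the comma-separated Textbox value into clean, non-empty labels."""
    return [label.strip() for label in labels_text.split(",") if label.strip()]


def classify(video_path, labels_text):
    """Return a {label: confidence} dict for the given video and labels."""
    labels = parse_labels(labels_text)
    if not labels:
        raise ValueError("Provide at least one class label.")
    # Placeholder scores: uniform confidence, since this stub ignores the video.
    score = 1.0 / len(labels)
    return {label: score for label in labels}


result = classify("coco.mp4", "airplane, dog, car")
print(result)
```

With a callback of this shape, `make_demo(classify).launch()` would start the interface locally.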
