
Commit d6ea460

authored
sync release branch with main (#3229)
* updates in documentation (#3223)
* update requirements in universal-sentence-encoder demo (#3222)
* fix rerank and embeddings demo for windows and gpu (#3235)
1 parent c7e8262 commit d6ea460

File tree

6 files changed: +9 -9 lines changed

demos/common/export_models/export_model.py

Lines changed: 2 additions & 2 deletions

@@ -189,7 +189,7 @@ def add_common_arguments(parser):
 "name": "{{model_name}}_embeddings_model",
 "base_path": "embeddings",
 "target_device": "{{target_device|default("CPU", true)}}",
-"plugin_config": { "NUM_STREAMS": {{num_streams|default("1", true)}} }
+"plugin_config": { "NUM_STREAMS": "{{num_streams|default(1, true)}}" }
 }
 }
 ]
@@ -208,7 +208,7 @@ def add_common_arguments(parser):
 "name": "{{model_name}}_rerank_model",
 "base_path": "rerank",
 "target_device": "{{target_device|default("CPU", true)}}",
-"plugin_config": { "NUM_STREAMS": {{num_streams|default("1", true)}} }
+"plugin_config": { "NUM_STREAMS": "{{num_streams|default(1, true)}}" }
 }
 }
 ]
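The quoting change above matters because plugin config values are serialized into a JSON model config, where OpenVINO expects them as strings. A minimal sketch of the difference, using hypothetical rendered outputs of that template line (assuming the user passed num_streams=2; these strings are illustrative, not taken from the commit):

```python
import json

# Hypothetical rendered outputs of the "plugin_config" template line,
# assuming num_streams=2 was supplied.
old_render = '{ "NUM_STREAMS": 2 }'    # old template: value rendered as a bare number
new_render = '{ "NUM_STREAMS": "2" }'  # fixed template: value rendered as a quoted string

# Both are valid JSON, but only the fixed form yields a string value,
# which is what the plugin config expects.
assert isinstance(json.loads(old_render)["NUM_STREAMS"], int)
assert isinstance(json.loads(new_render)["NUM_STREAMS"], str)
```

Moving the quotes outside the Jinja2 expression also makes the `default(1, true)` fallback safe for both numeric and string inputs.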

demos/continuous_batching/accuracy/README.md

Lines changed: 2 additions & 1 deletion

@@ -80,6 +80,7 @@ export OPENAI_COMPATIBLE_API_URL=http://localhost:8000/v3
 export OPENAI_COMPATIBLE_API_KEY="unused"
 git clone https://github.com/EvolvingLMMs-Lab/lmms-eval
 cd lmms-eval
+git checkout 4471ad311e620ed6cf3a0419d8ba6f18f8fb1cb3 # https://github.com/EvolvingLMMs-Lab/lmms-eval/issues/625
 pip install -e . --extra-index-url "https://download.pytorch.org/whl/cpu"
 python -m lmms_eval \
 --model openai_compatible \
@@ -101,7 +102,7 @@ openai_compatible (model_version=OpenGVLab/InternVL2_5-8B,max_retries=1), gen_kw
 |--------|-------|------|-----:|--------------------|---|--------:|---|------|
 |mme |Yaml |none | 0|mme_cognition_score |↑ | 600.3571|± | N/A|
 |mme |Yaml |none | 0|mme_perception_score|↑ |1618.2984|± | N/A|
-|mmmu_val| 0|none | 0|mmmu_acc |↑ | 0.5100|± | N/A|
+|mmmu_val| 0|none | 0|mmmu_acc |↑ | 0.5322|± | N/A|

 ```

demos/continuous_batching/vlm/README.md

Lines changed: 1 addition & 0 deletions

@@ -46,6 +46,7 @@ python export_model.py text_generation --source_model OpenGVLab/InternVL2_5-8B -
 > **Note:** Change the `--weight-format` to quantize the model to `int8` or `int4` precision to reduce memory consumption and improve performance.
 
 > **Note:** You can change the model used in the demo out of any topology [tested](https://github.com/openvinotoolkit/openvino.genai/blob/master/SUPPORTED_MODELS.md#visual-language-models) with OpenVINO.
+Be aware that QwenVL models executed on GPU might experience execution errors with very high resolution images. In case of such behavior, it is recommended to reduce the parameter `max_pixels` in `preprocessor_config.json`.
 
 You should have a model folder like below:
 ```
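The added note suggests lowering `max_pixels` in the exported model's `preprocessor_config.json`. A minimal sketch of that edit, assuming the key sits at the top level of the JSON file (the path, helper name, and default cap below are hypothetical, not from the commit):

```python
import json
from pathlib import Path

def lower_max_pixels(config_path, new_max_pixels=1003520):
    """Cap `max_pixels` in a preprocessor_config.json.

    Sketch only: assumes `max_pixels` is a top-level integer key,
    as in QwenVL-style preprocessor configs. Leaves the file
    untouched if the key is absent or already below the cap.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    if "max_pixels" in config:
        config["max_pixels"] = min(config["max_pixels"], new_max_pixels)
        path.write_text(json.dumps(config, indent=2))
    return config

# Usage (hypothetical model path):
# lower_max_pixels("models/OpenGVLab/InternVL2_5-8B/preprocessor_config.json")
```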

demos/embeddings/README.md

Lines changed: 1 addition & 2 deletions

@@ -126,8 +126,7 @@ content-type: application/json
 
 :::{dropdown} **Request embeddings with cURL**
 ```bash
-curl http://localhost:8000/v3/embeddings \
-  -H "Content-Type: application/json" -d '{ "model": "Alibaba-NLP/gte-large-en-v1.5", "input": "hello world"}' | jq .
+curl http://localhost:8000/v3/embeddings -H "Content-Type: application/json" -d "{ \"model\": \"Alibaba-NLP/gte-large-en-v1.5\", \"input\": \"hello world\"}"
 ```
 ```json
 {
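The rewritten cURL line uses escaped double quotes so it works in Windows shells, where single-quoted JSON does not. The same request body can be produced without any manual escaping; a small sketch (the variable names are illustrative, the model name and payload shape come from the diff above):

```python
import json

# Build the same request body as the cURL example; json.dumps emits
# the quoting that must otherwise be escaped by hand on Windows.
payload = json.dumps({
    "model": "Alibaba-NLP/gte-large-en-v1.5",
    "input": "hello world",
})
print(payload)  # -> {"model": "Alibaba-NLP/gte-large-en-v1.5", "input": "hello world"}

# The body can then be posted with any HTTP client, e.g. (requires a
# running model server at this address):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v3/embeddings",
#     data=payload.encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```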

demos/rerank/README.md

Lines changed: 1 addition & 2 deletions

@@ -106,8 +106,7 @@ content-type: application/json
 :::{dropdown} **Requesting rerank score with cURL**
 
 ```bash
-curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" \
-  -d '{ "model": "BAAI/bge-reranker-large", "query": "welcome", "documents":["good morning","farewell"]}' | jq .
+curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" -d "{ \"model\": \"BAAI/bge-reranker-large\", \"query\": \"welcome\", \"documents\":[\"good morning\",\"farewell\"]}"
 ```
 ```json
 {
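As with the embeddings demo, the rerank payload's escaped quoting can be generated instead of hand-written; a sketch using the model, query, and documents from the diff above:

```python
import json

# Build the rerank request body shown in the cURL example; the nested
# list of documents is the part most error-prone to escape by hand.
payload = json.dumps({
    "model": "BAAI/bge-reranker-large",
    "query": "welcome",
    "documents": ["good morning", "farewell"],
})
print(payload)
```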
Lines changed: 2 additions & 2 deletions

@@ -1,2 +1,2 @@
-tensorflow-serving-api==2.11.0
-numpy<2.0.0
+tensorflow-serving-api==2.18.1
+tensorflow==2.18.1
