Commit d305e96

Fix optical_character_recognition and horizontal_text_detection by changing models (#3206) (#3219)
1 parent 3ad4399 commit d305e96

6 files changed, 25 additions and 27 deletions

demos/horizontal_text_detection/python/Makefile

Lines changed: 3 additions & 3 deletions
@@ -27,9 +27,9 @@ setup_repository:
 	curl https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/horizontal-text-detection-0001/FP32/horizontal-text-detection-0001.bin -o workspace/horizontal-text-detection-0001/1/horizontal-text-detection-0001.bin
 	curl https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/horizontal-text-detection-0001/FP32/horizontal-text-detection-0001.xml -o workspace/horizontal-text-detection-0001/1/horizontal-text-detection-0001.xml
 	# Download text recognition model
-	mkdir -p workspace/text-recognition-0014/1
-	curl https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/text-recognition-0014/FP32/text-recognition-0014.bin -o workspace/text-recognition-0014/1/text-recognition-0014.bin
-	curl https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/text-recognition-0014/FP32/text-recognition-0014.xml -o workspace/text-recognition-0014/1/text-recognition-0014.xml
+	mkdir -p workspace/text-recognition-0012/1
+	curl https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1/text-recognition-0012/FP32/text-recognition-0012.bin -o workspace/text-recognition-0012/1/text-recognition-0012.bin
+	curl https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1/text-recognition-0012/FP32/text-recognition-0012.xml -o workspace/text-recognition-0012/1/text-recognition-0012.xml
 ifeq ($(BUILD_CUSTOM_NODE),true)
 # Build custom node
 	cd ../../../src/custom_nodes && \
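
The download lines in this hunk follow a regular pattern, so the model swap comes down to three path components: the release (`2022.1` to `2023.0`), the `models_bin` index (`2` to `1`), and the model name. A minimal Python sketch of that pattern is below. `model_url` and `workspace_path` are hypothetical helpers written for illustration; they are not part of the repository.

```python
# Base URL taken from the curl commands in the Makefile hunk.
BASE = "https://storage.openvinotoolkit.org/repositories/open_model_zoo"


def model_url(release, bin_index, model, ext, precision="FP32"):
    """Build the storage URL for one model file (hypothetical helper)."""
    return f"{BASE}/{release}/models_bin/{bin_index}/{model}/{precision}/{model}.{ext}"


def workspace_path(model, ext, version=1):
    """Build the local target path used by the Makefile (hypothetical helper)."""
    return f"workspace/{model}/{version}/{model}.{ext}"


# Print the source and destination for both files of the new model.
for ext in ("bin", "xml"):
    print(model_url("2023.0", 1, "text-recognition-0012", ext))
    print("  ->", workspace_path("text-recognition-0012", ext))
```

Passing `"2022.1"`, `2`, and `"text-recognition-0014"` instead reproduces the removed lines.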

demos/horizontal_text_detection/python/README.md

Lines changed: 8 additions & 8 deletions
@@ -70,15 +70,15 @@ ThreadID: 3; Current FPS: 30.30; Average FPS: 25.73; Average latency:
 > **NOTE**: Video source is cropped to 704x704 resolution to match model input size.
 
 ## Recognize Detected Text with OCR Pipeline
-Optical Character Recognition (OCR) pipeline based on [horizontal text detection](https://github.com/openvinotoolkit/open_model_zoo/blob/releases/2023/0/models/intel/horizontal-text-detection-0001/README.md) model, [text recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2022.1.0/models/intel/text-recognition-0014)
+Optical Character Recognition (OCR) pipeline based on [horizontal text detection](https://github.com/openvinotoolkit/open_model_zoo/blob/releases/2023/0/models/intel/horizontal-text-detection-0001/README.md) model, [text recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2023.0.0/models/intel/text-recognition-0012)
 combined with a custom node implementation can be used with the same python script used before. OCR pipeline provides location of detected text boxes on the image and additionally recognized text for each box.
 
 ![horizontal text detection using OCR pipeline](horizontal-text-detection-ocr.gif)
 
 ### Prepare workspace to run the demo
 
 To successfully deploy OCR pipeline you need to have a workspace that contains:
-- [horizontal text detection](https://github.com/openvinotoolkit/open_model_zoo/blob/releases/2022/1/models/intel/horizontal-text-detection-0001/README.md) and [text recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2022.1.0/models/intel/text-recognition-0014) models
+- [horizontal text detection](https://github.com/openvinotoolkit/open_model_zoo/blob/releases/2022/1/models/intel/horizontal-text-detection-0001/README.md) and [text recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2023.0.0/models/intel/text-recognition-0012) models
 - Custom node for image processing
 - Configuration file
 
@@ -108,10 +108,10 @@ workspace/
 │   └── 1
 │       ├── horizontal-text-detection-0001.bin
 │       └── horizontal-text-detection-0001.xml
-└── text-recognition-0014
+└── text-recognition-0012
     └── 1
-        ├── text-recognition-0014.bin
-        └── text-recognition-0014.xml
+        ├── text-recognition-0012.bin
+        └── text-recognition-0012.xml
 
 ```
 
@@ -134,10 +134,10 @@ workspace/
 │       └── horizontal-text-detection-0001.xml
 ├── lib
 │   └── libcustom_node_horizontal_ocr.so
-└── text-recognition-0014
+└── text-recognition-0012
     └── 1
-        ├── text-recognition-0014.bin
-        └── text-recognition-0014.xml
+        ├── text-recognition-0012.bin
+        └── text-recognition-0012.xml
 
 ```
 ## Deploying OVMS
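
Because the README trees now name `text-recognition-0012`, a workspace prepared before this commit will no longer match them. A small Python sketch for checking a workspace against the model files listed in the trees is below. `EXPECTED` and `missing_files` are hypothetical names, and the list covers only the model files visible in these hunks, not the configuration file or the custom node library.

```python
from pathlib import Path

# Model files shown in the updated workspace trees, relative to workspace/.
EXPECTED = [
    "horizontal-text-detection-0001/1/horizontal-text-detection-0001.bin",
    "horizontal-text-detection-0001/1/horizontal-text-detection-0001.xml",
    "text-recognition-0012/1/text-recognition-0012.bin",
    "text-recognition-0012/1/text-recognition-0012.xml",
]


def missing_files(workspace, expected=EXPECTED):
    """Return the expected relative paths that are not regular files."""
    root = Path(workspace)
    return [rel for rel in expected if not (root / rel).is_file()]


# Report anything missing from ./workspace (prints nothing if complete).
for rel in missing_files("workspace"):
    print("missing:", rel)
```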

demos/horizontal_text_detection/python/config.json

Lines changed: 4 additions & 5 deletions
@@ -13,8 +13,7 @@
         {
             "config": {
                 "name": "text-recognition",
-                "layout": "NHWC:NCHW",
-                "base_path": "/workspace/text-recognition-0014"
+                "base_path": "/workspace/text-recognition-0012"
             }
         }
     ],
@@ -51,7 +50,7 @@
                         "original_image_width": "704",
                         "original_image_height": "704",
                         "original_image_layout": "NHWC",
-                        "target_image_width": "128",
+                        "target_image_width": "120",
                         "target_image_height": "32",
                         "target_image_layout": "NHWC",
                         "convert_to_gray_scale": "true",
@@ -79,11 +78,11 @@
                     "model_name": "text-recognition",
                     "type": "DL model",
                     "inputs": [
-                        {"imgs": {"node_name": "extract_node",
+                        {"Placeholder": {"node_name": "extract_node",
                                   "data_item": "text_images"}}
                     ],
                     "outputs": [
-                        {"data_item": "logits",
+                        {"data_item": "shadow/LSTMLayers/transpose_time_major",
                          "alias": "texts"}
                     ]
                 }
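
The three config changes belong together: the DL node has to address the new model's tensors by name, and the custom node has to emit crops of the size that model accepts. The sketch below checks a node fragment against an assumed model interface. The tensor names come from this diff; the `(1, 32, 120, 1)` NHWC input shape is an assumption about `text-recognition-0012` that should be verified against the model description, and `check_node` is a hypothetical helper. If the model does take NHWC input directly, that would also explain why the `"layout": "NHWC:NCHW"` override was dropped, though the commit does not say so.

```python
# Assumed interface of text-recognition-0012 (names from the diff, shape assumed).
MODEL_INTERFACE = {
    "input_name": "Placeholder",
    "input_shape_nhwc": (1, 32, 120, 1),
    "output_name": "shadow/LSTMLayers/transpose_time_major",
}

# Fragments copied from the updated config.json.
extract_params = {
    "target_image_width": "120",
    "target_image_height": "32",
    "target_image_layout": "NHWC",
    "convert_to_gray_scale": "true",
}
recognition_node = {
    "model_name": "text-recognition",
    "type": "DL model",
    "inputs": [{"Placeholder": {"node_name": "extract_node",
                                "data_item": "text_images"}}],
    "outputs": [{"data_item": "shadow/LSTMLayers/transpose_time_major",
                 "alias": "texts"}],
}


def check_node(node, params, interface):
    """Return a list of mismatches between the config and the interface."""
    problems = []
    input_names = [name for entry in node["inputs"] for name in entry]
    if interface["input_name"] not in input_names:
        problems.append(f"input {interface['input_name']!r} is not mapped")
    output_names = [entry["data_item"] for entry in node["outputs"]]
    if interface["output_name"] not in output_names:
        problems.append(f"output {interface['output_name']!r} is not mapped")
    _, height, width, channels = interface["input_shape_nhwc"]
    if params["target_image_layout"] != "NHWC":
        problems.append("custom node does not emit NHWC crops")
    target = (int(params["target_image_height"]), int(params["target_image_width"]))
    if target != (height, width):
        problems.append(f"crop size {target} != model size {(height, width)}")
    emitted_channels = 1 if params["convert_to_gray_scale"] == "true" else 3
    if emitted_channels != channels:
        problems.append("channel count does not match the model input")
    return problems


print(check_node(recognition_node, extract_params, MODEL_INTERFACE) or "config matches")
```

Feeding the pre-commit values (`imgs`, `logits`, width `128`) into the same check reports three mismatches, one per changed line.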

demos/optical_character_recognition/python/README.md

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 # Optical Character Recognition with Directed Acyclic Graph {#ovms_demo_optical_character_recognition}
 
 This document demonstrates how to create and use an Optical Character Recognition (OCR) pipeline based on [east-resnet50](https://github.com/argman/EAST) text detection model,
-[text-recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2022.1.0/models/intel/text-recognition-0014) combined with a custom node implementation.
+[text-recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2023.0.0/models/intel/text-recognition-0012) combined with a custom node implementation.
 
 Using such a pipeline, a single request to OVMS can perform a complex set of operations with a response containing
 recognized characters for all detected text boxes.
@@ -91,9 +91,9 @@ Converted east-resnet50 model will have the following interface:
 - Output name: `feature_fusion/concat_3` ; shape: `[1 256 480 5]` ; precision: `FP32`; layout: `N...`
 
 ### Text-recognition model
-Download [text-recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2022.1.0/models/intel/text-recognition-0014) model and store it in `${PWD}/text-recognition/1` folder.
+Download [text-recognition](https://github.com/openvinotoolkit/open_model_zoo/tree/2023.0.0/models/intel/text-recognition-0012) model and store it in `${PWD}/text-recognition/1` folder.
 ```bash
-curl -L --create-dirs https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/text-recognition-0014/FP32/text-recognition-0014.bin -o text-recognition/1/model.bin https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/text-recognition-0014/FP32/text-recognition-0014.xml -o text-recognition/1/model.xml
+curl -L --create-dirs https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1/text-recognition-0012/FP32/text-recognition-0012.bin -o text-recognition/1/model.bin https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1/text-recognition-0012/FP32/text-recognition-0012.xml -o text-recognition/1/model.xml
 chmod -R 755 text-recognition
 ```
 
@@ -192,7 +192,7 @@ openvino
 pipeline
 2021
 intel
-rotations
+rotation
 Output: name[text_images]
 numpy => shape[(9, 1, 32, 128, 1)] data[float32]
 Output: name[text_coordinates]

demos/optical_character_recognition/python/config.json

Lines changed: 4 additions & 5 deletions
@@ -12,7 +12,6 @@
         {
             "config": {
                 "name": "text-recognition",
-                "layout": "NHWC:NCHW",
                 "base_path": "/OCR/text-recognition"
             }
         }
@@ -52,12 +51,12 @@
                         "original_image_width": "1920",
                         "original_image_height": "1024",
                         "original_image_layout": "NHWC",
-                        "target_image_width": "128",
+                        "target_image_width": "120",
                         "target_image_height": "32",
                         "target_image_layout": "NHWC",
                         "convert_to_gray_scale": "true",
                         "confidence_threshold": "0.9",
-                        "overlap_threshold": "0.2",
+                        "overlap_threshold": "0.1",
                         "max_output_batch": "100",
                         "box_width_adjustment": "0.1",
                         "box_height_adjustment": "0.0",
@@ -86,11 +85,11 @@
                     "model_name": "text-recognition",
                     "type": "DL model",
                     "inputs": [
-                        {"imgs": {"node_name": "extract_node",
+                        {"Placeholder": {"node_name": "extract_node",
                                   "data_item": "text_images"}}
                     ],
                     "outputs": [
-                        {"data_item": "logits",
+                        {"data_item": "shadow/LSTMLayers/transpose_time_major",
                          "alias": "texts"}
                     ]
                 }
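
This file also lowers `overlap_threshold` from `0.2` to `0.1`. The diff does not show how the custom node consumes that parameter. Assuming it acts as an intersection-over-union cutoff in non-maximum suppression, which is a common arrangement for EAST-style detectors but is not confirmed here, a lower value discards more of the overlapping boxes. The sketch below uses axis-aligned boxes for simplicity; `iou` and `nms` are illustrative helpers, not the custom node's code.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, overlap_threshold):
    """Greedy NMS: keep a box unless it overlaps a better-scored kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= overlap_threshold for k in kept):
            kept.append(i)
    return kept


# The first two boxes overlap with an IoU of about 0.11; the third is separate.
boxes = [(0, 0, 100, 30), (80, 0, 180, 30), (300, 0, 400, 30)]
scores = [0.99, 0.95, 0.97]
print(nms(boxes, scores, 0.2))
print(nms(boxes, scores, 0.1))
```

With these made-up boxes, the `0.2` threshold keeps all three, while `0.1` drops the lower-scored box of the overlapping pair.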

demos/optical_character_recognition/python/optical_character_recognition.py

Lines changed: 2 additions & 2 deletions
@@ -80,7 +80,7 @@ def decode(text):
 def text_recognition_output_to_text(output_nd):
     for i in range(output_nd.shape[0]):
         data = output_nd[i]
-        alphabet = '#1234567890abcdefghijklmnopqrstuvwxyz'
+        alphabet = '0123456789abcdefghijklmnopqrstuvwxyz#'
         preds = data.argmax(2)
         word = ''
         for i in range(preds.shape[0]):
@@ -124,4 +124,4 @@ def text_recognition_output_to_text(output_nd):
     if name == args['text_images_output_name'] and len(args['text_images_save_path']) > 0:
         save_text_images_as_jpgs(output_nd, name, args['text_images_save_path'])
     if name == args['texts_output_name']:
-        text_recognition_output_to_text(output_nd)
+        text_recognition_output_to_text(output_nd)
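
The alphabet string maps each class index returned by `argmax` to a character, so its order has to match the output classes of the model in use. The sketch below shows the effect with a greedy CTC-style decode. It assumes that `#` is the blank symbol and that repeated indices are collapsed before blanks are dropped; the diff shows neither detail, and `greedy_ctc_decode` is a hypothetical helper, not the script's own `decode`.

```python
NEW_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyz#'  # from the added line
OLD_ALPHABET = '#1234567890abcdefghijklmnopqrstuvwxyz'  # from the removed line


def greedy_ctc_decode(indices, alphabet, blank='#'):
    """Collapse repeated indices, then drop the blank symbol."""
    chars = []
    previous = None
    for index in indices:
        symbol = alphabet[index]
        if index != previous and symbol != blank:
            chars.append(symbol)
        previous = index
    return ''.join(chars)


# Per-time-step class indices, as argmax over the class axis would give.
indices = [36, 10, 10, 36, 11, 36, 36, 1, 1, 2]
print(greedy_ctc_decode(indices, NEW_ALPHABET))
print(greedy_ctc_decode(indices, OLD_ALPHABET))
```

For this made-up index sequence the new alphabet yields `ab12`, while the old one yields `z0zaz12`, because index 36 is read as `z` instead of the blank and every other index shifts by one position.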
