Commit fd5b4e1

[Feat] Support textline_orientation for chatocr and unify naming of text line orientation (#15337)
* Support textline_orientation for chatocr and unify naming of textline orientation
* Unify description
* Update documentation
* Fix serving doc
1 parent 79a15b8 commit fd5b4e1

20 files changed, +153 -115 lines changed

docs/version3.x/module_usage/text_line_orientation_classification.en.md renamed to docs/version3.x/module_usage/textline_orientation_classification.en.md

Lines changed: 2 additions & 2 deletions
@@ -92,8 +92,8 @@ The text line orientation classification module primarily distinguishes the orie
 You can quickly experience the functionality with a single command:

 ```bash
-paddleocr text_line_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg
-```
+paddleocr textline_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg
+```

 You can also integrate the text line orientation classification model into your project. Run the following code after downloading the [example image](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg) to your local machine.

docs/version3.x/module_usage/text_line_orientation_classification.md renamed to docs/version3.x/module_usage/textline_orientation_classification.md

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ comments: true
 使用一行命令即可快速体验:

 ```bash
-paddleocr text_line_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg
+paddleocr textline_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg
 ```

 您也可以将文本行方向分类模块中的模型推理集成到您的项目中。运行以下代码前,请您下载[示例图片](https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg)到本地。

docs/version3.x/pipeline_usage/OCR.en.md

Lines changed: 9 additions & 9 deletions
@@ -16,7 +16,7 @@ The general OCR pipeline is used to solve text recognition tasks by extracting t

 - [Document Image Orientation Classification Module](../module_usage/doc_img_orientation_classification.md) (Optional)
 - [Text Image Unwarping Module](../module_usage/text_image_unwarping.md) (Optional)
-- [Text Line Orientation Classification Module](../module_usage/text_line_orientation_classification.md) (Optional)
+- [Text Line Orientation Classification Module](../module_usage/textline_orientation_classification.md) (Optional)
 - [Text Detection Module](../module_usage/text_detection.md)
 - [Text Recognition Module](../module_usage/text_recognition.md)

@@ -592,19 +592,19 @@ paddleocr ocr -i ./general_ocr_002.png --ocr_version PP-OCRv4
 <td></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_name</code></td>
+<td><code>textline_orientation_model_name</code></td>
 <td>Name of the text line orientation model. If not set, the pipeline default model will be used.</td>
 <td><code>str</code></td>
 <td></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_dir</code></td>
+<td><code>textline_orientation_model_dir</code></td>
 <td>Directory path of the text line orientation model. If not set, the official model will be downloaded.</td>
 <td><code>str</code></td>
 <td></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_batch_size</code></td>
+<td><code>textline_orientation_batch_size</code></td>
 <td>Batch size for the text line orientation model. If not set, the default batch size will be <code>1</code>.</td>
 <td><code>int</code></td>
 <td></td>
@@ -794,13 +794,13 @@ Any floating-point number greater than <code>0</code>. If not set, the pipeline'
 </tr>
 <tr>
 <td><code>cls_model_dir</code></td>
-<td>Deprecated. Please refer <code>text_line_orientation_model_dir</code> , they cannot be specified simultaneously with the new parameters.</td>
+<td>Deprecated. Please refer <code>textline_orientation_model_dir</code> , they cannot be specified simultaneously with the new parameters.</td>
 <td><code>str</code></td>
 <td></td>
 </tr>
 <tr>
 <td><code>cls_batch_num</code></td>
-<td>Deprecated. Please refer <code>text_line_orientation_batch_size</code> , they cannot be specified simultaneously with the new parameters.</td>
+<td>Deprecated. Please refer <code>textline_orientation_batch_size</code> , they cannot be specified simultaneously with the new parameters.</td>
 <td><code>int</code></td>
 <td></td>
 </tr>
@@ -975,19 +975,19 @@ In the above Python script, the following steps are performed:
 <td><code>None</code></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_name</code></td>
+<td><code>textline_orientation_model_name</code></td>
 <td>Name of the text line orientation model. If set to <code>None</code>, the pipeline's default model will be used.</td>
 <td><code>str</code></td>
 <td><code>None</code></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_dir</code></td>
+<td><code>textline_orientation_model_dir</code></td>
 <td>Directory path of the text line orientation model. If set to <code>None</code>, the official model will be downloaded.</td>
 <td><code>str</code></td>
 <td><code>None</code></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_batch_size</code></td>
+<td><code>textline_orientation_batch_size</code></td>
 <td>Batch size for the text line orientation model. If set to <code>None</code>, the default batch size will be <code>1</code>.</td>
 <td><code>int</code></td>
 <td><code>None</code></td>
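
For callers whose configuration still uses the `text_line_orientation_*` spelling or the deprecated `cls_*` names listed in the OCR pipeline tables, the rename can be sketched as a small key-migration step. This is a minimal illustration and not part of PaddleOCR: `migrate_textline_kwargs` is a hypothetical helper, and its conflict check only mirrors the documented rule that deprecated and new parameters cannot be specified simultaneously.

```python
from typing import Any, Dict

# Spelling used in the docs before this commit, and the unified spelling after it.
_OLD_PREFIX = "text_line_orientation_"
_NEW_PREFIX = "textline_orientation_"

# Deprecated names and the parameters the docs say to use instead.
_DEPRECATED = {
    "cls_model_dir": "textline_orientation_model_dir",
    "cls_batch_num": "textline_orientation_batch_size",
}


def migrate_textline_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with old names rewritten (hypothetical helper)."""
    migrated: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key.startswith(_OLD_PREFIX):
            new_key = _NEW_PREFIX + key[len(_OLD_PREFIX):]
        else:
            new_key = _DEPRECATED.get(key, key)
        # Two input keys mapping to the same new name means an old and a new
        # parameter were both given, which the docs disallow.
        if new_key in migrated:
            raise ValueError(f"{key!r} conflicts with another setting for {new_key!r}")
        migrated[new_key] = value
    return migrated
```

Keys that are neither old-style nor deprecated pass through unchanged, so the helper can be applied to a whole configuration dictionary before it is handed to the pipeline.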

docs/version3.x/pipeline_usage/OCR.md

Lines changed: 9 additions & 9 deletions
@@ -16,7 +16,7 @@ OCR(光学字符识别,Optical Character Recognition)是一种将图像中

 - [文档图像方向分类模块](../module_usage/doc_img_orientation_classification.md) (可选)
 - [文本图像矫正模块](../module_usage/text_image_unwarping.md) (可选)
-- [文本行方向分类模块](../module_usage/text_line_orientation_classification.md) (可选)
+- [文本行方向分类模块](../module_usage/textline_orientation_classification.md) (可选)
 - [文本检测模块](../module_usage/text_detection.md)
 - [文本识别模块](../module_usage/text_recognition.md)

@@ -592,19 +592,19 @@ paddleocr ocr -i ./general_ocr_002.png --ocr_version PP-OCRv4
 <td></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_name</code></td>
+<td><code>textline_orientation_model_name</code></td>
 <td>文本行方向模型的名称。如果不设置,将会使用产线默认模型。</td>
 <td><code>str</code></td>
 <td></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_dir</code></td>
+<td><code>textline_orientation_model_dir</code></td>
 <td>文本行方向模型的目录路径。如果不设置,将会下载官方模型。</td>
 <td><code>str</code></td>
 <td></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_batch_size</code></td>
+<td><code>textline_orientation_batch_size</code></td>
 <td>文本行方向模型的批处理大小。如果不设置,将默认设置批处理大小为<code>1</code>。</td>
 <td><code>int</code></td>
 <td></td>
@@ -792,13 +792,13 @@ paddleocr ocr -i ./general_ocr_002.png --ocr_version PP-OCRv4
 </tr>
 <tr>
 <td><code>cls_model_dir</code></td>
-<td>已废弃,请参考<code>text_line_orientation_model_dir</code>,且与新的参数不能同时指定。</td>
+<td>已废弃,请参考<code>textline_orientation_model_dir</code>,且与新的参数不能同时指定。</td>
 <td><code>str</code></td>
 <td></td>
 </tr>
 <tr>
 <td><code>cls_batch_num</code></td>
-<td>已废弃,请参考<code>text_line_orientation_batch_size</code>,且与新的参数不能同时指定。</td>
+<td>已废弃,请参考<code>textline_orientation_batch_size</code>,且与新的参数不能同时指定。</td>
 <td><code>int</code></td>
 <td></td>
 </tr>
@@ -973,19 +973,19 @@ for res in result:
 <td><code>None</code></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_name</code></td>
+<td><code>textline_orientation_model_name</code></td>
 <td>文本行方向模型的名称。如果设置为<code>None</code>,将会使用产线默认模型。</td>
 <td><code>str</code></td>
 <td><code>None</code></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_model_dir</code></td>
+<td><code>textline_orientation_model_dir</code></td>
 <td>文本行方向模型的目录路径。如果设置为<code>None</code>,将会下载官方模型。</td>
 <td><code>str</code></td>
 <td><code>None</code></td>
 </tr>
 <tr>
-<td><code>text_line_orientation_batch_size</code></td>
+<td><code>textline_orientation_batch_size</code></td>
 <td>文本行方向模型的批处理大小。如果设置为<code>None</code>,将默认设置批处理大小为<code>1</code>。</td>
 <td><code>int</code></td>
 <td><code>None</code></td>

docs/version3.x/pipeline_usage/PP-ChatOCRv4.en.md

Lines changed: 19 additions & 1 deletion
@@ -664,6 +664,12 @@ The name of the document orientation classification model. If not set, the defau
 <td></td>
 </tr>
 <tr>
+<td><code>use_textline_orientation</code></td>
+<td>Whether to load and use the text line orientation classification module. If not set, the parameter value initialized by the pipeline will be used by default, initialized as <code>True</code>.</td>
+<td><code>bool</code></td>
+<td></td>
+</tr>
+<tr>
 <td><code>use_seal_recognition</code></td>
 <td>Whether to load and use the seal recognition sub-pipeline. If not set, the parameter's value initialized during pipeline setup will be used, defaulting to <code>True</code>.</td>
 <td><code>bool</code></td>
@@ -1106,6 +1112,12 @@ The relevant parameter descriptions are as follows:
 <td><code>None</code></td>
 </tr>
 <tr>
+<td><code>use_textline_orientation</code></td>
+<td>Whether to load and use the text line orientation classification function. If set to<code>None</code>, the value initialized by the pipeline for this parameter will be used by default (initialized to <code>True</code>).</td>
+<td><code>bool</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>use_seal_recognition</code></td>
 <td>Whether to load and use the seal recognition sub-pipeline. If set to<code>None</code>, the value initialized by the pipeline for this parameter will be used by default (initialized to <code>True</code>).</td>
 <td><code>bool</code></td>
@@ -1418,7 +1430,13 @@ The relevant parameter descriptions are as follows:
 </tr>
 <tr>
 <td><code>use_doc_unwarping</code></td>
-<td>Whether to use the text image correction module during inference.</td>
+<td>Whether to use the document image unwarping module during inference.</td>
+<td><code>bool</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
+<td><code>use_textline_orientation</code></td>
+<td>Whether to use the text line orientation classification module during inference.</td>
 <td><code>bool</code></td>
 <td><code>None</code></td>
 </tr>
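
The `None` default documented for `use_textline_orientation` amounts to a two-level fallback: an explicit value passed at call time wins, and otherwise the value fixed at pipeline initialization applies. A minimal sketch of that rule, assuming a hypothetical helper `resolve_flag` (not a PaddleOCR function) and the initialized value `True` stated in the parameter tables:

```python
from typing import Optional


def resolve_flag(call_value: Optional[bool], initialized_value: bool = True) -> bool:
    """Pick the effective switch value (hypothetical helper for illustration).

    call_value: what the caller passed for this call; None means "not set".
    initialized_value: what the pipeline was initialized with (documented as True).
    """
    # Compare against None explicitly so that an explicit False is respected.
    return initialized_value if call_value is None else call_value
```

The explicit `is None` test matters here: a plain truthiness check such as `call_value or initialized_value` would silently turn an explicit `False` back into `True`.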

docs/version3.x/pipeline_usage/PP-ChatOCRv4.md

Lines changed: 23 additions & 5 deletions
@@ -20,7 +20,7 @@ PP-ChatOCRv4 产线中包含<b>版面区域检测模块</b>、<b>表格结构识
 - [表格结构识别模块](../module_usage/table_structure_recognition.md)(可选)
 - [文本检测模块](../module_usage/text_detection.md)
 - [文本识别模块](../module_usage/text_recognition.md)
-- [文本行方向分类模块](../module_usage/text_line_orientation_classification.md)(可选)
+- [文本行方向分类模块](../module_usage/textline_orientation_classification.md)(可选)
 - [公式识别模块](../module_usage/formula_recognition.md)(可选)
 - [印章文本检测模块](../module_usage/seal_text_detection.md)(可选)

@@ -905,13 +905,19 @@ paddleocr pp_chatocrv4_doc -i vehicle_certificate-1.png -k 驾驶室准乘人数
 </tr>
 <tr>
 <td><code>use_doc_orientation_classify</code></td>
-<td>是否加载并使用文档方向分类功能。如果不设置,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
+<td>是否加载并使用文档方向分类模块。如果不设置,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
 <td><code>bool</code></td>
 <td></td>
 </tr>
 <tr>
 <td><code>use_doc_unwarping</code></td>
-<td>是否加载并使用文档去扭曲功能。如果不设置,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
+<td>是否加载并使用文档去扭曲模块。如果不设置,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
+<td><code>bool</code></td>
+<td></td>
+</tr>
+<tr>
+<td><code>use_textline_orientation</code></td>
+<td>是否加载并使用文本行方向分类模块。如果不设置,初始化为<code>True</code>。</td>
 <td><code>bool</code></td>
 <td></td>
 </tr>
@@ -1325,13 +1331,19 @@ PP-ChatOCRv4 预测的流程、API说明、产出说明如下:
 </tr>
 <tr>
 <td><code>use_doc_orientation_classify</code></td>
-<td>是否加载并使用文档方向分类功能。如果设置为<code>None</code>,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
+<td>是否加载并使用文档方向分类模块。如果设置为<code>None</code>,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
 <td><code>bool</code></td>
 <td><code>None</code></td>
 </tr>
 <tr>
 <td><code>use_doc_unwarping</code></td>
-<td>是否加载并使用文档去扭曲功能。如果设置为<code>None</code>,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
+<td>是否加载并使用文档去扭曲模块。如果设置为<code>None</code>,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
+<td><code>bool</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
+<td><code>use_textline_orientation</code></td>
+<td>是否加载并使用文本行方向分类模块. 如果设置为<code>None</code>,将默认使用产线初始化的该参数值,初始化为<code>True</code>。</td>
 <td><code>bool</code></td>
 <td><code>None</code></td>
 </tr>
@@ -1655,6 +1667,12 @@ PP-ChatOCRv4 预测的流程、API说明、产出说明如下:
 <td><code>None</code></td>
 </tr>
 <tr>
+<td><code>use_textline_orientation</code></td>
+<td>是否加载并使用文本行方向分类模块。</td>
+<td><code>bool</code></td>
+<td><code>None</code></td>
+</tr>
+<tr>
 <td><code>use_seal_recognition</code></td>
 <td>是否在推理时使用印章识别子产线。</td>
 <td><code>bool</code></td>

docs/version3.x/pipeline_usage/PP-StructureV3.md

Lines changed: 2 additions & 2 deletions
@@ -1382,7 +1382,7 @@ paddleocr pp_structurev3 -i ./pp_structure_v3_demo.png --device gpu
 </tr>
 <tr>
 <td><code>use_textline_orientation</code></td>
-<td>是否加载并使用文本行方向分类模块。 如果不设置, default is <code>True</code>。</td>
+<td>是否加载并使用文本行方向分类模块。如果不设置,初始化为<code>True</code>。</td>
 <td><code>bool</code></td>
 <td></td>
 </tr>
@@ -1987,7 +1987,7 @@ for item in markdown_images:
 </tr>
 <tr>
 <td><code>use_textline_orientation</code></td>
-<td>是否加载并使用文本行方向分类模块. 如果设置为<code>None</code>, default is <code>True</code>.</td>
+<td>是否加载并使用文本行方向分类模块如果设置为<code>None</code>,将默认使用产线初始化的该参数值,初始化为<code>True</code></td>
 <td><code>bool</code></td>
 <td><code>None</code></td>
 </tr>

docs/version3.x/pipeline_usage/table_recognition_v2.en.md

Lines changed: 0 additions & 24 deletions
@@ -1872,30 +1872,6 @@ Below is the API reference for basic service-oriented deployment and examples of
 <td>No</td>
 </tr>
 <tr>
-<td><code>layoutThreshold</code></td>
-<td><code>number</code> | <code>null</code></td>
-<td>Please refer to the <code>layout_threshold</code> parameter description in the <code>predict</code> method of the model object.</td>
-<td>No</td>
-</tr>
-<tr>
-<td><code>layoutNms</code></td>
-<td><code>boolean</code> | <code>null</code></td>
-<td>Please refer to the <code>layout_nms</code> parameter description in the <code>predict</code> method of the model object.</td>
-<td>No</td>
-</tr>
-<tr>
-<td><code>layoutUnclipRatio</code></td>
-<td><code>number</code> | <code>array</code> | <code>null</code></td>
-<td>Please refer to the <code>layout_unclip_ratio</code> parameter description in the <code>predict</code> method of the model object.</td>
-<td>No</td>
-</tr>
-<tr>
-<td><code>layoutMergeBboxesMode</code></td>
-<td><code>string</code> | <code>null</code></td>
-<td>Please refer to the <code>layout_merge_bboxes_mode</code> parameter description in the <code>predict</code> method of the model object.</td>
-<td>No</td>
-</tr>
-<tr>
 <td><code>textDetLimitSideLen</code></td>
 <td><code>integer</code> | <code>null</code></td>
 <td>Please refer to the <code>text_det_limit_side_len</code> parameter description in the <code>predict</code> method of the model object.</td>

docs/version3.x/pipeline_usage/table_recognition_v2.md

Lines changed: 0 additions & 24 deletions
@@ -1875,30 +1875,6 @@ for res in output:
 <td>否</td>
 </tr>
 <tr>
-<td><code>layoutThreshold</code></td>
-<td><code>number</code> | <code>null</code></td>
-<td>请参阅产线对象中 <code>predict</code> 方法的 <code>layout_threshold</code> 参数相关说明。</td>
-<td>否</td>
-</tr>
-<tr>
-<td><code>layoutNms</code></td>
-<td><code>boolean</code> | <code>null</code></td>
-<td>请参阅产线对象中 <code>predict</code> 方法的 <code>layout_nms</code> 参数相关说明。</td>
-<td>否</td>
-</tr>
-<tr>
-<td><code>layoutUnclipRatio</code></td>
-<td><code>number</code> | <code>array</code> | <code>null</code></td>
-<td>请参阅产线对象中 <code>predict</code> 方法的 <code>layout_unclip_ratio</code> 参数相关说明。</td>
-<td>否</td>
-</tr>
-<tr>
-<td><code>layoutMergeBboxesMode</code></td>
-<td><code>string</code> | <code>null</code></td>
-<td>请参阅产线对象中 <code>predict</code> 方法的 <code>layout_merge_bboxes_mode</code> 参数相关说明。</td>
-<td>否</td>
-</tr>
-<tr>
 <td><code>textDetLimitSideLen</code></td>
 <td><code>integer</code> | <code>null</code></td>
 <td>请参阅产线对象中 <code>predict</code> 方法的 <code>text_det_limit_side_len</code> 参数相关说明。</td>

mkdocs.yml

Lines changed: 1 addition & 1 deletion
@@ -290,7 +290,7 @@ nav:
 - 表格结构识别模块: version3.x/module_usage/table_structure_recognition.md
 - 文本检测模块: version3.x/module_usage/text_detection.md
 - 文本图像矫正模块: version3.x/module_usage/text_image_unwarping.md
-- 文本行方向分类模块: version3.x/module_usage/text_line_orientation_classification.md
+- 文本行方向分类模块: version3.x/module_usage/textline_orientation_classification.md
 - 文本识别模块: version3.x/module_usage/text_recognition.md
 - 产线列表:
 - 产线概述: version3.x/pipeline_usage/pipeline_overview.md
