From 9bb0f2149a9eb78d5e0d222dba8df7cbec2419f2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E5=AD=A6=E5=8D=BF?= <64625668+leo-q8@users.noreply.github.com> Date: Wed, 2 Jul 2025 15:58:02 +0800 Subject: [PATCH] Doc refine (#15907) * support ppocrv5 minor lang docs * fixed bugs * fixed bugs * refine docs * refine docs * fixed bugs --- .../PP-OCRv5/PP-OCRv5_multi_languages.en.md | 94 ++++++++++++++----- .../PP-OCRv5/PP-OCRv5_multi_languages.md | 63 ++++++++++--- .../module_usage/text_recognition.en.md | 4 +- .../module_usage/text_recognition.md | 4 +- docs/version3.x/pipeline_usage/OCR.en.md | 4 +- docs/version3.x/pipeline_usage/OCR.md | 4 +- mkdocs.yml | 4 +- 7 files changed, 129 insertions(+), 48 deletions(-) diff --git a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.en.md b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.en.md index ad09d42ce6..e206d7f51a 100644 --- a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.en.md +++ b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.en.md @@ -4,21 +4,42 @@ comments: true # 1. Introduction to PP-OCRv5 Multilingual Text Recognition -PP-OCRv5 is the latest generation of the PP-OCR series text recognition solutions, focusing on text recognition tasks across multiple scenarios and languages. By default, the recognition model supports accurate recognition of five mainstream text types: Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese. In addition, PP-OCRv5 provides multilingual recognition capabilities covering 37 languages, including Korean, Spanish, French, Portuguese, German, Italian, Russian, and more (see [Section 4](#4-supported-languages-and-abbreviations) for the full list of supported languages and abbreviations). Compared to the previous PP-OCRv3 version, PP-OCRv5 achieves more than a 30% improvement in recognition accuracy for multilingual tasks. +[PP-OCRv5](./PP-OCRv5.md) is the latest generation text recognition solution in the PP-OCR series, focusing on multi-scenario and multilingual text recognition tasks. In terms of supported text types, the default configuration of the recognition model can accurately identify five major types: Simplified Chinese, Pinyin, Traditional Chinese, English, and Japanese. Additionally, PP-OCRv5 offers multilingual text recognition capabilities covering 37 languages, including Korean, Spanish, French, Portuguese, German, Italian, Russian, and more (for a full list of supported languages and abbreviations, see [Section 4](#4-supported-languages-and-abbreviations)). Compared to the previous PP-OCRv3 version, PP-OCRv5 achieves over a 30% improvement in accuracy for multilingual text recognition. -![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/japan_2_res.jpg) +
+ French recognition result +
+ French Recognition Result +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/french_0_res.jpg) +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/german_0_res.png) +
+ German recognition result +
+ German Recognition Result +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/korean_1_res.jpg) +
+ +
+ Korean recognition result +
+ Korean Recognition Result +
+ +
+ +
+ Russian recognition result +
+ Russian Recognition Result +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/ru_0.jpeg) ## 2. Quick Start -You can use the `--lang` parameter in the command line to specify the text recognition model for your target language when running the general OCR pipeline: +You can specify the language for text recognition by using the `--lang` parameter when running the general OCR pipeline in the command line: ```bash # Use the `--lang` parameter to specify the French recognition model @@ -30,7 +51,7 @@ paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_im --save_path ./output \ --device gpu:0 ``` -For explanations of other command line parameters, please refer to the [command line usage](../../pipeline_usage/OCR.en.md#21-command-line) of the general OCR pipeline. After execution, results will be printed to the terminal: +For explanations of the other command-line parameters, please refer to the [Command Line Usage](../../pipeline_usage/OCR.md#21-command-line-usage) section of the general OCR pipeline documentation. After running, the results will be displayed in the terminal: ```bash {'res': {'input_path': '/root/.paddlex/predict_input/general_ocr_french01.png', 'page_index': None, 'model_settings': {'use_doc_preprocessor': True, 'use_textline_orientation': False}, 'doc_preprocessor_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_orientation_classify': False, 'use_doc_unwarping': False}, 'angle': -1}, 'dt_polys': array([[[119, 23], @@ -54,20 +75,21 @@ For explanations of other command line parameters, please refer to the [command [108, ..., 562]], dtype=int16)}} ``` -If you specify `save_path`, the visualization results will be saved in the `save_path` directory. An example visualization is shown below: +If you specify `save_path`, the visualization results will be saved to the specified path. An example of the visualized result is shown below: -You can also use Python code to specify the recognition model for your target language using the `lang` parameter when initializing the general OCR pipeline: + +You can also use Python code to specify the recognition model for a particular language when initializing the general OCR pipeline via the `lang` parameter: ```python from paddleocr import PaddleOCR ocr = PaddleOCR( - lang="fr", # Specify the French recognition model via the lang parameter - use_doc_orientation_classify=False, # Disable document orientation classification - use_doc_unwarping=False, # Disable text image unwarping - use_textline_orientation=False, # Disable textline orientation classification + lang="fr", # Specify French recognition model with the lang parameter + use_doc_orientation_classify=False, # Disable document orientation classification model + use_doc_unwarping=False, # Disable text image unwarping model + use_textline_orientation=False, # Disable text line orientation classification model ) result = ocr.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_french01.png") for res in result: @@ -75,31 +97,42 @@ for res in result: res.save_to_img("output") res.save_to_json("output") ``` -For more details on the `PaddleOCR` class parameters, refer to the [Python script integration](../../pipeline_usage/OCR.en.md#22-python-script-integration) of the general OCR pipeline. +For more details on the `PaddleOCR` class parameters, please refer to the [Python Scripting Integration](../../pipeline_usage/OCR.md#22-python-scripting-integration) section of the general OCR pipeline documentation. -## 3. Benchmark Comparison -| Model | Korean Dataset Accuracy (%) | | Model | Latin Script Languages Dataset Accuracy (%) | | Model | East Slavic Languages Dataset Accuracy (%) | -|--|--|--|--|--|--|--|--| -| korean_PP-OCRv5_mobile_rec | 88.0 | | latin_PP-OCRv5_mobile_rec | 84.7 | | eslav_PP-OCRv5_mobile_rec | 85.8 | -| korean_PP-OCRv3_mobile_rec | 23.0 | | latin_PP-OCRv3_mobile_rec | 37.9 | | cyrillic_PP-OCRv3_mobile_rec| 50.2 | +## 3. Performance Comparison + +| Model | Download Link | Korean Dataset Accuracy (%) | +|-|-|-| +| korean_PP-OCRv5_mobile_rec |Inference Model/Pretrained Model | 88.0| +| korean_PP-OCRv3_mobile_rec | Inference Model/Pretrained Model | 23.0 | + +| Model | Download Link | Latin Script Language Dataset Accuracy (%) | +|-|-|-| +| latin_PP-OCRv5_mobile_rec | Inference Model/Pretrained Model | 84.7 | +| latin_PP-OCRv3_mobile_rec | Inference Model/Pretrained Model | 37.9 | + +| Model | Download Link | East Slavic Language Dataset Accuracy (%) | +|-|-|-| +| eslav_PP-OCRv5_mobile_rec |Inference Model/Pretrained Model | 81.6 | +| cyrillic_PP-OCRv3_mobile_rec | Inference Model/Pretrained Model | 50.2 | **Notes:** - - Korean Dataset: PP-OCRv5's latest dataset containing 5,007 Korean text images. - - Latin Script Languages Dataset: The latest PP-OCRv5 recognition dataset, containing 3,111 text images in Latin script languages. - - East Slavic Languages Dataset: PP-OCRv5's latest dataset containing a total of 7,031 Russian, Belarusian, and Ukrainian text images. + - Korean Dataset: The latest PP-OCRv5 dataset containing 5,007 Korean text images. + - Latin Script Language Dataset: The latest PP-OCRv5 dataset containing 3,111 images of Latin script languages. + - East Slavic Language Dataset: The latest PP-OCRv5 dataset containing a total of 7,031 text images in Russian, Belarusian, and Ukrainian. ## 4. Supported Languages and Abbreviations | Language | Description | Abbreviation | | Language | Description | Abbreviation | | --- | --- | --- | ---|--- | --- | --- | | Chinese | Chinese & English | ch | | Hungarian | Hungarian | hu | -| English | English | en | | Serbian (Latin) | Serbian(latin) | rslatin | +| English | English | en | | Serbian (latin) | Serbian (latin) | rs_latin | | French | French | fr | | Indonesian | Indonesian | id | | German | German | de | | Occitan | Occitan | oc | | Japanese | Japanese | japan | | Icelandic | Icelandic | is | | Korean | Korean | korean | | Lithuanian | Lithuanian | lt | -| Chinese Traditional | Chinese Traditional | chinese_cht | | Maori | Maori | mi | +| Traditional Chinese | Chinese Traditional | chinese_cht | | Maori | Maori | mi | | Afrikaans | Afrikaans | af | | Malay | Malay | ms | | Italian | Italian | it | | Dutch | Dutch | nl | | Spanish | Spanish | es | | Norwegian | Norwegian | no | @@ -113,4 +146,13 @@ For more details on the `PaddleOCR` class parameters, refer to the [Python scrip | Croatian | Croatian | hr | | Turkish | Turkish | tr | | Uzbek | Uzbek | uz | | Latin | Latin | la | | Russian | Russian | ru | | Belarusian | Belarusian | be | -| Ukrainian | Ukranian | uk | | | | | +| Ukrainian | Ukrainian | uk | | | | | + + +## 5. Models and Their Supported Languages + +| Model | Supported Languages | +|-|-| +| korean_PP-OCRv5_mobile_rec | Korean | +| latin_PP-OCRv5_mobile_rec | English, French, German, Afrikaans, Italian, Spanish, Bosnian, Portuguese, Czech, Welsh, Danish, Estonian, Irish, Croatian, Uzbek, Hungarian, Serbian (Latin), Indonesian, Occitan, Icelandic, Lithuanian, Maori, Malay, Dutch, Norwegian, Polish, Slovak, Slovenian, Albanian, Swedish, Swahili, Tagalog, Turkish, Latin | +| eslav_PP-OCRv5_mobile_rec | Russian, Belarusian, Ukrainian | diff --git a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.md b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.md index 429b4f97a7..65471f3909 100644 --- a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.md +++ b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.md @@ -2,21 +2,41 @@ comments: true --- -# 一、PP-OCRv5多语种文本识别介绍 +# 一、PP-OCRv5多语种文字识别介绍 -PP-OCRv5 是 PP-OCR 系列的最新一代文字识别解决方案,专注于多场景、多语种的文字识别任务。在文字类型支持方面,默认配置的识别模型可准确识别简体中文、中文拼音、繁体中文、英文和日文这五大主流文字类型。同时,PP-OCRv5还提供了覆盖37种语言的多语种识别能力,包括韩文、西班牙文、法文、葡萄牙文、德文、意大利文、俄罗斯文等(具体支持语种及缩写详见[第四节](#四-支持语种及缩写))。相较于前代 PP-OCRv3 版本,PP-OCRv5 在多语言识别准确率上实现了超过30%的提升。 +[PP-OCRv5](./PP-OCRv5.md) 是 PP-OCR 系列的最新一代文字识别解决方案,专注于多场景、多语种的文字识别任务。在文字类型支持方面,默认配置的识别模型可准确识别简体中文、中文拼音、繁体中文、英文和日文这五大主流文字类型。同时,PP-OCRv5还提供了覆盖37种语言的多语种文字识别能力,包括韩文、西班牙文、法文、葡萄牙文、德文、意大利文、俄罗斯文等(具体支持语种及缩写详见[第四节](#四-支持语种及缩写))。相较于前代 PP-OCRv3 版本,PP-OCRv5 在多语言文字识别准确率上实现了超过30%的提升。 +
+ 法文识别结 +
+ 法文识别结果 +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/japan_2_res.jpg) +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/french_0_res.jpg) +
+ 德文识别结 +
+ 德文识别结果 +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/german_0_res.png) +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/korean_1_res.jpg) +
+ 韩文识别结果 +
+ 韩文识别结果 +
+ +
+ +
+ 俄文识别结果 +
+ 俄文识别结果 +
-![img](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/ocr/ru_0.jpeg) ## 二、快速使用 @@ -83,10 +103,20 @@ for res in result: ## 三、指标对比 -| 模型 |韩语数据集 精度 (%)| | 模型 | 拉丁字母语言数据集 精度 (%)| | 模型| 东斯拉夫语言数据集 精度 (%) | -|--|--|--|--|--|--|--|--| -| korean_PP-OCRv5_mobile_rec | 88.0 | | latin_PP-OCRv5_mobile_rec | 84.7 | | eslav_PP-OCRv5_mobile_rec | 85.8 | -| korean_PP-OCRv3_mobile_rec | 23.0 | | latin_PP-OCRv3_mobile_rec | 37.9 | | cyrillic_PP-OCRv3_mobile_rec| 50.2 | +| 模型 | 模型下载链接 | 韩语数据集 精度 (%) | +|-|-|-| +| korean_PP-OCRv5_mobile_rec |推理模型/训练模型 | 88.0| +| korean_PP-OCRv3_mobile_rec | 推理模型/训练模型 | 23.0 | + +| 模型 | 模型下载链接 |拉丁字母语言数据集 精度 (%) | +|-|-|-| +| latin_PP-OCRv5_mobile_rec | 推理模型/训练模型 | 84.7 | +| latin_PP-OCRv3_mobile_rec | 推理模型/训练模型 | 37.9 | + +| 模型 | 模型下载链接 | 东斯拉夫语言数据集 精度 (%) | +|-|-|-| +| eslav_PP-OCRv5_mobile_rec |推理模型/训练模型 | 81.6 | +| cyrillic_PP-OCRv3_mobile_rec | 推理模型/训练模型 | 50.2 | **注:** - 韩语数据集:PP-OCRv5 最新构建的包含了 5007 张韩语文本图片的识别数据集。 @@ -98,7 +128,7 @@ for res in result: | 语种 | 描述 | 缩写 | | 语种 | 描述 | 缩写 | | --- | --- | --- | ---|--- | --- | --- | | 中文 | Chinese & English | ch | | 匈牙利文 | Hungarian | hu | -| 英文 | English | en | | 塞尔维亚文(latin) | Serbian(latin) | rslatin | +| 英文 | English | en | | 塞尔维亚文(latin) | Serbian(latin) | rs_latin | | 法文 | French | fr | | 印度尼西亚文 | Indonesian | id | | 德文 | German | de | | 欧西坦文 | Occitan | oc | | 日文 | Japanese | japan | | 冰岛文 | Icelandic | is | @@ -118,3 +148,12 @@ for res in result: | 乌兹别克文 | Uzbek | uz | | 拉丁文 | Latin | la | | 俄罗斯文 | Russian | ru | | 白俄罗斯文 | Belarusian | be | | 乌克兰文 | Ukranian | uk | | | | | + + +## 五、模型及其支持的语种 + +| 模型 | 支持语种 | +|-|-| +| korean_PP-OCRv5_mobile_rec | 韩文 | +| latin_PP-OCRv5_mobile_rec |英文、法文、德文、南非荷兰文、意大利文、西班牙文、波斯尼亚文、葡萄牙文、捷克文、威尔士文、丹麦文、爱沙尼亚文、爱尔兰文、克罗地亚文、乌兹别克文、匈牙利文、塞尔维亚文(latin)、印度尼西亚文、欧西坦文、冰岛文、立陶宛文、毛利文、马来文、荷兰文、挪威文、波兰文、斯洛伐克文、斯洛文尼亚文、阿尔巴尼亚文、瑞典文、西瓦希里文、塔加洛文、土耳其文、拉丁文| +| eslav_PP-OCRv5_mobile_rec | 俄罗斯文、白俄罗斯文、乌克兰文 | diff --git a/docs/version3.x/module_usage/text_recognition.en.md b/docs/version3.x/module_usage/text_recognition.en.md index 1d281e95b1..ec50f45e13 100644 --- a/docs/version3.x/module_usage/text_recognition.en.md +++ b/docs/version3.x/module_usage/text_recognition.en.md @@ -259,7 +259,7 @@ en_PP-OCRv3_mobile_rec_infer.tar">Inference Model/Inference Model/Pre-trained Model -90.45 +88.0 5.43 / 1.46 21.20 / 5.32 14 @@ -279,7 +279,7 @@ latin_PP-OCRv5_mobile_rec_infer.tar">Inference Model/Inference Model/Pre-trained Model -85.8 +81.6 5.43 / 1.46 21.20 / 5.32 14 diff --git a/docs/version3.x/module_usage/text_recognition.md b/docs/version3.x/module_usage/text_recognition.md index 3cdc7001d5..7015a70d52 100644 --- a/docs/version3.x/module_usage/text_recognition.md +++ b/docs/version3.x/module_usage/text_recognition.md @@ -262,7 +262,7 @@ en_PP-OCRv3_mobile_rec_infer.tar">推理模型/推理模型/训练模型 -90.45 +88.0 5.43 / 1.46 21.20 / 5.32 14 @@ -282,7 +282,7 @@ latin_PP-OCRv5_mobile_rec_infer.tar">推理模型/推理模型/训练模型 -85.8 +81.6 5.43 / 1.46 21.20 / 5.32 14 diff --git a/docs/version3.x/pipeline_usage/OCR.en.md b/docs/version3.x/pipeline_usage/OCR.en.md index 0970e73c96..ad415e6441 100644 --- a/docs/version3.x/pipeline_usage/OCR.en.md +++ b/docs/version3.x/pipeline_usage/OCR.en.md @@ -420,7 +420,7 @@ en_PP-OCRv3_mobile_rec_infer.tar">Inference Model/Inference Model/Pre-trained Model -90.45 +88.0 5.43 / 1.46 21.20 / 5.32 14 @@ -440,7 +440,7 @@ latin_PP-OCRv5_mobile_rec_infer.tar">Inference Model/Inference Model/Pre-trained Model -85.8 +81.6 5.43 / 1.46 21.20 / 5.32 14 diff --git a/docs/version3.x/pipeline_usage/OCR.md b/docs/version3.x/pipeline_usage/OCR.md index 669ac634c3..c87057edd9 100644 --- a/docs/version3.x/pipeline_usage/OCR.md +++ b/docs/version3.x/pipeline_usage/OCR.md @@ -421,7 +421,7 @@ en_PP-OCRv3_mobile_rec_infer.tar">推理模型/推理模型/训练模型 -90.45 +88.0 5.43 / 1.46 21.20 / 5.32 14 @@ -441,7 +441,7 @@ latin_PP-OCRv5_mobile_rec_infer.tar">推理模型/推理模型/训练模型 -85.8 +81.6 5.43 / 1.46 21.20 / 5.32 14 diff --git a/mkdocs.yml b/mkdocs.yml index 9e237574dc..bf59533a4a 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -103,7 +103,7 @@ plugins: PP-OCRv5: PP-OCRv5 使用教程: Usage Tutorial PP-OCRv5简介: PP-OCRv5 Introduction - PP-OCRv5多语种文本识别简介: PP-OCRv5 Multilingual Text Recognition Introduction + PP-OCRv5多语种文字识别: PP-OCRv5 Multilingual Text Recognition PP-StructureV3: PP-StructureV3 PP-StructureV3简介: PP-StructureV3 Introduction PP-ChatOCRv4: PP-ChatOCRv4 @@ -268,7 +268,7 @@ nav: - PP-OCRv5: - 使用教程: version3.x/pipeline_usage/OCR.md - PP-OCRv5简介: version3.x/algorithm/PP-OCRv5/PP-OCRv5.md - - PP-OCRv5多语种文本识别简介: version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.md + - PP-OCRv5多语种文字识别: version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.md - PP-StructureV3: - 使用教程: version3.x/pipeline_usage/PP-StructureV3.md - PP-StructureV3简介: version3.x/algorithm/PP-StructureV3/PP-StructureV3.md