diff --git a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.en.md b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.en.md
new file mode 100644
index 0000000000..a139cee474
--- /dev/null
+++ b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.en.md
@@ -0,0 +1,203 @@
+# Introduction to PP-OCRv5
+
+**PP-OCRv5** is the latest generation of the PP-OCR text recognition solution, targeting recognition across multiple scenarios and multiple text types. It supports five mainstream text types: Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese. It also upgrades recognition in challenging scenarios such as complex Chinese and English handwriting, vertical text, and uncommon characters. On internal, complex multi-scenario evaluation sets, PP-OCRv5 achieves a 13-percentage-point end-to-end improvement over PP-OCRv4.
+
+ +
+
+# Key Metrics
+
+### 1. Text Detection Metrics
+
+| Model | Handwritten Chinese | Handwritten English | Printed Chinese | Printed English | Traditional Chinese | Ancient Text | Japanese | General Scenario | Pinyin | Rotation | Distortion | Artistic Text | Average |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| PP-OCRv5_server_det | 0.803 | 0.841 | 0.945 | 0.917 | 0.815 | 0.676 | 0.772 | 0.797 | 0.671 | 0.8 | 0.876 | 0.673 | 0.827 |
+| PP-OCRv4_server_det | 0.706 | 0.249 | 0.888 | 0.690 | 0.759 | 0.473 | 0.685 | 0.715 | 0.542 | 0.366 | 0.775 | 0.583 | 0.662 |
+| PP-OCRv5_mobile_det | 0.744 | 0.777 | 0.905 | 0.910 | 0.823 | 0.581 | 0.727 | 0.721 | 0.575 | 0.647 | 0.827 | 0.525 | 0.770 |
+| PP-OCRv4_mobile_det | 0.583 | 0.369 | 0.872 | 0.773 | 0.663 | 0.231 | 0.634 | 0.710 | 0.430 | 0.299 | 0.715 | 0.549 | 0.624 |
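
The per-scenario gains behind the detection table can be checked with a few lines of arithmetic. The sketch below is illustrative only: it copies the server-model rows from the table above, and the names `SCENARIOS`, `gains_in_points`, and `top_gains` are hypothetical helpers, not part of PaddleOCR.

```python
# Server-model detection scores copied from the table above
# (the "Average" column is left out on purpose).
SCENARIOS = [
    "Handwritten Chinese", "Handwritten English", "Printed Chinese",
    "Printed English", "Traditional Chinese", "Ancient Text", "Japanese",
    "General Scenario", "Pinyin", "Rotation", "Distortion", "Artistic Text",
]
V5_SERVER_DET = [0.803, 0.841, 0.945, 0.917, 0.815, 0.676,
                 0.772, 0.797, 0.671, 0.8, 0.876, 0.673]
V4_SERVER_DET = [0.706, 0.249, 0.888, 0.690, 0.759, 0.473,
                 0.685, 0.715, 0.542, 0.366, 0.775, 0.583]


def gains_in_points(new, old, names=SCENARIOS):
    """Map each scenario to its gain in percentage points (difference x 100)."""
    return {name: round((n - o) * 100, 1) for name, n, o in zip(names, new, old)}


def top_gains(gains, k=3):
    """Return the k (scenario, gain) pairs with the largest gains, largest first."""
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)[:k]


server_gains = gains_in_points(V5_SERVER_DET, V4_SERVER_DET)
for name, points in top_gains(server_gains):
    print(f"{name}: +{points} points")
```

For the server models every scenario shows a positive gain, and the three largest are Handwritten English, Rotation, and Printed English. Swapping in the mobile rows gives the same comparison for the lightweight models.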
+
+Compared to PP-OCRv4, PP-OCRv5 improves significantly in almost all detection scenarios, especially handwriting, ancient text, and Japanese. The one exception in the table above is Artistic Text for the mobile model (0.525 vs. 0.549).
+
+### 2. Text Recognition Metrics
+
+ +
+
+| Model | Handwritten Chinese | Handwritten English | Printed Chinese | Printed English | Traditional Chinese | Ancient Text | Japanese | Confusable Characters | General Scenario | Pinyin | Vertical Text | Artistic Text | Weighted Average |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| PP-OCRv5_server_rec | 0.5807 | 0.5806 | 0.9013 | 0.8679 | 0.7472 | 0.6039 | 0.7372 | 0.5946 | 0.8384 | 0.7435 | 0.9314 | 0.6397 | 0.8401 |
+| PP-OCRv4_server_rec | 0.3626 | 0.2661 | 0.8486 | 0.6677 | 0.4097 | 0.3080 | 0.4623 | 0.5028 | 0.8362 | 0.2694 | 0.5455 | 0.5892 | 0.5735 |
+| PP-OCRv5_mobile_rec | 0.4166 | 0.4944 | 0.8605 | 0.8753 | 0.7199 | 0.5786 | 0.7577 | 0.5570 | 0.7703 | 0.7248 | 0.8089 | 0.5398 | 0.8015 |
+| PP-OCRv4_mobile_rec | 0.2980 | 0.2550 | 0.8398 | 0.6598 | 0.3218 | 0.2593 | 0.4724 | 0.4599 | 0.8106 | 0.2593 | 0.5924 | 0.5555 | 0.5301 |
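
When choosing between the mobile recognition models, it can help to see which evaluation categories moved in which direction. The sketch below is an illustration under stated assumptions: the values are copied from the mobile rows of the table above (without the "Weighted Average" column), and `split_by_direction` is a hypothetical helper, not a PaddleOCR function.

```python
# Mobile-model recognition scores copied from the table above
# (the "Weighted Average" column is left out on purpose).
CATEGORIES = [
    "Handwritten Chinese", "Handwritten English", "Printed Chinese",
    "Printed English", "Traditional Chinese", "Ancient Text", "Japanese",
    "Confusable Characters", "General Scenario", "Pinyin", "Vertical Text",
    "Artistic Text",
]
V5_MOBILE_REC = [0.4166, 0.4944, 0.8605, 0.8753, 0.7199, 0.5786,
                 0.7577, 0.5570, 0.7703, 0.7248, 0.8089, 0.5398]
V4_MOBILE_REC = [0.2980, 0.2550, 0.8398, 0.6598, 0.3218, 0.2593,
                 0.4724, 0.4599, 0.8106, 0.2593, 0.5924, 0.5555]


def split_by_direction(new, old, names=CATEGORIES):
    """Split category names into (improved, not_improved) lists.

    A category counts as improved only if the new score is strictly higher.
    """
    improved, not_improved = [], []
    for name, n, o in zip(names, new, old):
        (improved if n > o else not_improved).append(name)
    return improved, not_improved


improved, not_improved = split_by_direction(V5_MOBILE_REC, V4_MOBILE_REC)
print("Improved:", ", ".join(improved))
print("Not improved:", ", ".join(not_improved))
```

On these numbers the mobile model improves in ten of the twelve categories; General Scenario and Artistic Text score slightly lower than in PP-OCRv4_mobile_rec.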
+
+A single model covers multiple languages and text types, with recognition accuracy significantly ahead of previous-generation products and mainstream open-source solutions (weighted average of 0.8401 vs. 0.5735 for the server models and 0.8015 vs. 0.5301 for the mobile models, relative to PP-OCRv4).
+
+# PP-OCRv5 Demo Examples
+
+# Deployment and Secondary Development
+* **Multiple System Support**: Compatible with mainstream operating systems including Windows, Linux, and Mac.
+* **Multiple Hardware Support**: Besides NVIDIA GPUs, inference and deployment are also supported on Intel CPUs, Kunlun chips, Ascend, and other new hardware.
+* **High-Performance Inference Plugin**: Combining PP-OCRv5 with the high-performance inference plugin is recommended to further improve inference speed. See [High-Performance Inference Guide](../../deployment/high_performance_inference.md) for details.
+* **Service Deployment**: Supports highly stable service deployment solutions. See [Service Deployment Guide](../../deployment/serving.md) for details.
+* **Secondary Development Capability**: Supports custom dataset training, dictionary extension, and model fine-tuning. For example, to add Korean recognition, extend the dictionary and fine-tune the model; the result integrates seamlessly into existing production lines. See [Text Recognition Module Usage Tutorial](../../module_usage/text_recognition.md) for details.
diff --git a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.md b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.md
new file mode 100644
index 0000000000..03649a2e8f
--- /dev/null
+++ b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.md
@@ -0,0 +1,207 @@
+# 1. Introduction to PP-OCRv5
+**PP-OCRv5** is the latest generation of the PP-OCR text recognition solution, targeting recognition across multiple scenarios and multiple text types. It supports five mainstream text types: Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese. It also upgrades recognition in challenging scenarios such as complex Chinese and English handwriting, vertical text, and uncommon characters. On internal, complex multi-scenario evaluation sets, PP-OCRv5 achieves a 13-percentage-point end-to-end improvement over PP-OCRv4.
+
+ +
+
+
+# 2. Key Metrics
+### 1. Text Detection Metrics
+
+| Model | Handwritten Chinese | Handwritten English | Printed Chinese | Printed English | Traditional Chinese | Ancient Text | Japanese | General Scenario | Pinyin | Rotation | Distortion | Artistic Text | Average |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| PP-OCRv5_server_det | 0.803 | 0.841 | 0.945 | 0.917 | 0.815 | 0.676 | 0.772 | 0.797 | 0.671 | 0.8 | 0.876 | 0.673 | 0.827 |
+| PP-OCRv4_server_det | 0.706 | 0.249 | 0.888 | 0.690 | 0.759 | 0.473 | 0.685 | 0.715 | 0.542 | 0.366 | 0.775 | 0.583 | 0.662 |
+| PP-OCRv5_mobile_det | 0.744 | 0.777 | 0.905 | 0.910 | 0.823 | 0.581 | 0.727 | 0.721 | 0.575 | 0.647 | 0.827 | 0.525 | 0.770 |
+| PP-OCRv4_mobile_det | 0.583 | 0.369 | 0.872 | 0.773 | 0.663 | 0.231 | 0.634 | 0.710 | 0.430 | 0.299 | 0.715 | 0.549 | 0.624 |
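+
+Compared to PP-OCRv4, PP-OCRv5 improves significantly in almost all detection scenarios, especially handwriting, ancient text, and Japanese. The one exception in the table above is Artistic Text for the mobile model (0.525 vs. 0.549).
+
+### 2. Text Recognition Metrics
+
+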
+ +
+
+
+| Model | Handwritten Chinese | Handwritten English | Printed Chinese | Printed English | Traditional Chinese | Ancient Text | Japanese | Confusable Characters | General Scenario | Pinyin | Vertical Text | Artistic Text | Weighted Average |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| PP-OCRv5_server_rec | 0.5807 | 0.5806 | 0.9013 | 0.8679 | 0.7472 | 0.6039 | 0.7372 | 0.5946 | 0.8384 | 0.7435 | 0.9314 | 0.6397 | 0.8401 |
+| PP-OCRv4_server_rec | 0.3626 | 0.2661 | 0.8486 | 0.6677 | 0.4097 | 0.3080 | 0.4623 | 0.5028 | 0.8362 | 0.2694 | 0.5455 | 0.5892 | 0.5735 |
+| PP-OCRv5_mobile_rec | 0.4166 | 0.4944 | 0.8605 | 0.8753 | 0.7199 | 0.5786 | 0.7577 | 0.5570 | 0.7703 | 0.7248 | 0.8089 | 0.5398 | 0.8015 |
+| PP-OCRv4_mobile_rec | 0.2980 | 0.2550 | 0.8398 | 0.6598 | 0.3218 | 0.2593 | 0.4724 | 0.4599 | 0.8106 | 0.2593 | 0.5924 | 0.5555 | 0.5301 |
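+
+A single model covers multiple languages and text types, with recognition accuracy far ahead of previous-generation products and mainstream open-source solutions.
+
+
+# 3. PP-OCRv5 Demo Examples
+
+
+
+# 4. Deployment and Secondary Development
+* **Multiple System Support**: Compatible with mainstream operating systems including Windows, Linux, and Mac.
+* **Multiple Hardware Support**: Besides NVIDIA GPUs, inference and deployment are also supported on Intel CPUs, Kunlun chips, Ascend, and other new hardware.
+* **High-Performance Inference Plugin**: Combining PP-OCRv5 with the high-performance inference plugin is recommended to further improve inference speed. See [High-Performance Inference Guide](../../deployment/high_performance_inference.md) for details.
+* **Service Deployment**: Supports highly stable service deployment solutions. See [Service Deployment Guide](../../deployment/serving.md) for details.
+* **Secondary Development Capability**: Supports custom dataset training, dictionary extension, and model fine-tuning. For example, to add Korean recognition, extend the dictionary and fine-tune the model; the result integrates seamlessly into existing production lines. See [Text Recognition Module Usage Tutorial](../../module_usage/text_recognition.md) for details.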