diff --git a/README.md b/README.md
index 9cb2af6141..c0e53e872a 100644
--- a/README.md
+++ b/README.md
@@ -65,8 +65,57 @@ In addition to providing an outstanding model library, PaddleOCR 3.0 also offers
 ## 📣 Recent updates
+### 🔥🔥2025.08.21: Release of PaddleOCR 3.2.0, includes:
-#### **2025.06.29: Release of PaddleOCR 3.1.0**, includes:
+
+- **Significant Model Additions:**
+    - Introduced training, inference, and deployment for PP-OCRv5 recognition models in English, Thai, and Greek. **The PP-OCRv5 English model delivers an 11% improvement in English scenarios compared to the main PP-OCRv5 model, with the Thai and Greek recognition models achieving accuracies of 82.68% and 89.28%, respectively.**
+
+- **Deployment Capability Upgrades:**
+    - **Full support for PaddlePaddle framework versions 3.1.0 and 3.1.1.**
+    - **Comprehensive upgrade of the PP-OCRv5 C++ local deployment solution, now supporting both Linux and Windows, with feature parity and identical accuracy to the Python implementation.**
+    - **High-performance inference now supports CUDA 12, and inference can be performed using either the Paddle Inference or ONNX Runtime backends.**
+    - **The high-stability service-oriented deployment solution is now fully open-sourced, allowing users to customize Docker images and SDKs as required.**
+    - The high-stability service-oriented deployment solution also supports invocation via manually constructed HTTP requests, enabling client-side code development in any programming language.
+
+- **Benchmark Support:**
+    - **All production lines now support fine-grained benchmarking, enabling measurement of end-to-end inference time as well as per-layer and per-module latency data to assist with performance analysis.**
+    - **Documentation has been updated to include key metrics for commonly used configurations on mainstream hardware, such as inference latency and memory usage, providing deployment references for users.**
+
+- **Bug Fixes:**
+    - Resolved the issue of failed log saving during model training.
+    - Upgraded the data augmentation component for formula models for compatibility with newer versions of the albumentations dependency, and fixed deadlock warnings when using the tokenizers package in multi-process scenarios.
+    - Fixed inconsistencies in switch behaviors (e.g., `use_chart_parsing`) in the PP-StructureV3 configuration files compared to other pipelines.
+
+- **Other Enhancements:**
+    - **Separated core and optional dependencies. Only minimal core dependencies are required for basic text recognition; additional dependencies for document parsing and information extraction can be installed as needed.**
+    - **Enabled support for NVIDIA RTX 50 series graphics cards on Windows; users can refer to the [installation guide](docs/version3.x/installation.en.md) for the corresponding PaddlePaddle framework versions.**
+    - **PP-OCR series models now support returning single-character coordinates.**
+    - Added AIStudio, ModelScope, and other model download sources, allowing users to specify the source for model downloads.
+    - Added support for chart-to-table conversion via the PP-Chart2Table module.
+    - Optimized documentation descriptions to improve usability.
+
+
+
+2025.08.15: PaddleOCR 3.1.1 Released
+
+- **Bug Fixes:**
+    - Added the missing methods `save_vector`, `save_visual_info_list`, `load_vector`, and `load_visual_info_list` in the `PP-ChatOCRv4` class.
+    - Added the missing parameters `glossary` and `llm_request_interval` to the `translate` method in the `PPDocTranslation` class.
+
+- **Documentation Improvements:**
+    - Added a demo to the MCP documentation.
+    - Added information about the PaddlePaddle and PaddleOCR version used for performance metrics testing in the documentation.
+    - Fixed errors and omissions in the production line document translation.
+
+- **Others:**
+    - Changed the MCP server dependency to use the pure Python library `puremagic` instead of `python-magic` to reduce installation issues.
+    - Retested PP-OCRv5 performance metrics with PaddleOCR version 3.1.0 and updated the documentation.
+
+
+
+
+2025.06.29: PaddleOCR 3.1.0 Released
 - **Key Models and Pipelines:**
     - **Added PP-OCRv5 Multilingual Text Recognition Model**, which supports the training and inference process for text recognition models in 37 languages, including French, Spanish, Portuguese, Russian, Korean, etc. **Average accuracy improved by over 30%.** [Details](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
@@ -81,6 +130,8 @@ In addition to providing an outstanding model library, PaddleOCR 3.0 also offers
 - **Documentation Optimization:** Improved the descriptions in some user guides for a smoother reading experience.
+
+
 2025.06.26: PaddleOCR 3.0.3 Released
 - Bug Fix: Resolved the issue where the `enable_mkldnn` parameter was not effective, restoring the default behavior of using MKL-DNN for CPU inference.
diff --git a/docs/update/update.en.md b/docs/update/update.en.md
index 8bb0f27319..5f9e234a11 100644
--- a/docs/update/update.en.md
+++ b/docs/update/update.en.md
@@ -7,6 +7,37 @@ hide:
 ### Recently Update
+#### 🔥🔥**2025.08.21: Release of PaddleOCR 3.2.0**, includes:
+
+
+- **Significant Model Additions:**
+    - Introduced training, inference, and deployment for PP-OCRv5 recognition models in English, Thai, and Greek. **The PP-OCRv5 English model delivers an 11% improvement in English scenarios compared to the main PP-OCRv5 model, with the Thai and Greek recognition models achieving accuracies of 82.68% and 89.28%, respectively.**
+
+- **Deployment Capability Upgrades:**
+    - **Full support for PaddlePaddle framework versions 3.1.0 and 3.1.1.**
+    - **Comprehensive upgrade of the PP-OCRv5 C++ local deployment solution, now supporting both Linux and Windows, with feature parity and identical accuracy to the Python implementation.**
+    - **High-performance inference now supports CUDA 12, and inference can be performed using either the Paddle Inference or ONNX Runtime backends.**
+    - **The high-stability service-oriented deployment solution is now fully open-sourced, allowing users to customize Docker images and SDKs as required.**
+    - The high-stability service-oriented deployment solution also supports invocation via manually constructed HTTP requests, enabling client-side code development in any programming language.
+
+- **Benchmark Support:**
+    - **All production lines now support fine-grained benchmarking, enabling measurement of end-to-end inference time as well as per-layer and per-module latency data to assist with performance analysis.**
+    - **Documentation has been updated to include key metrics for commonly used configurations on mainstream hardware, such as inference latency and memory usage, providing deployment references for users.**
+
+- **Bug Fixes:**
+    - Resolved the issue of failed log saving during model training.
+    - Upgraded the data augmentation component for formula models for compatibility with newer versions of the albumentations dependency, and fixed deadlock warnings when using the tokenizers package in multi-process scenarios.
+    - Fixed inconsistencies in switch behaviors (e.g., `use_chart_parsing`) in the PP-StructureV3 configuration files compared to other pipelines.
+
+- **Other Enhancements:**
+    - **Separated core and optional dependencies. Only minimal core dependencies are required for basic text recognition; additional dependencies for document parsing and information extraction can be installed as needed.**
+    - **Enabled support for NVIDIA RTX 50 series graphics cards on Windows; users can refer to the [installation guide](../version3.x/installation.en.md) for the corresponding PaddlePaddle framework versions.**
+    - **PP-OCR series models now support returning single-character coordinates.**
+    - Added AIStudio, ModelScope, and other model download sources, allowing users to specify the source for model downloads.
+    - Added support for chart-to-table conversion via the PP-Chart2Table module.
+    - Optimized documentation descriptions to improve usability.
+
+
 #### **2025.08.15: Release of PaddleOCR 3.1.1**, includes:
 - **Bug Fixes:**
diff --git a/docs/update/update.md b/docs/update/update.md
index d879a6f8aa..04d9c5b2af 100644
--- a/docs/update/update.md
+++ b/docs/update/update.md
@@ -7,6 +7,35 @@ hide:
 ### 更新
+#### 2025.08.21: **PaddleOCR 3.2.0** 发布,新增能力如下:
+
+- **重要模型新增:**
+    - 新增 PP-OCRv5 英文、泰文、希腊文识别模型的训练、推理、部署。**其中 PP-OCRv5 英文模型较 PP-OCRv5 主模型在英文场景提升 11%,泰文识别模型精度 82.68%,希腊文识别模型精度 89.28%。**
+
+- **部署能力升级:**
+    - **全面支持飞桨框架 3.1.0 和 3.1.1 版本。**
+    - **全面升级 PP-OCRv5 C++ 本地部署方案,支持 Linux、Windows,功能及精度效果与 Python 方案保持一致。**
+    - **高性能推理支持 CUDA 12,可使用 Paddle Inference、ONNX Runtime 后端推理。**
+    - **高稳定性服务化部署方案全面开源,支持用户根据需求对 Docker 镜像和 SDK 进行定制化修改。**
+    - 高稳定性服务化部署方案支持通过手动构造HTTP请求的方式调用,该方式允许客户端代码使用任意编程语言编写。
+
+- **Benchmark支持**:
+    - **全部产线支持产线细粒度 benchmark,能够测量产线端到端推理时间以及逐层、逐模块的耗时数据,可用于辅助产线性能分析。**
+    - **文档中补充各产线常用配置在主流硬件上的关键指标,包括推理耗时和内存占用等,为用户部署提供参考。**
+
+- **Bug修复:**
+    - 修复模型训练时训练日志保存失败的问题。
+    - 对公式模型的数据增强部分进行了版本兼容性升级,以适应新版本的 albumentations 依赖,并修复了在多进程使用 tokenizers 依赖包时出现的死锁警告。
+    - 修复 PP-StructureV3 配置文件中的 `use_chart_parsing` 等开关行为与其他产线不统一的问题。
+
+- **其他升级:**
+    - **分离必要依赖与可选依赖。使用基础文字识别功能时,仅需安装少量核心依赖;若需文档解析、信息抽取等功能,用户可按需选择安装额外依赖。**
+    - **支持 Windows 用户使用英伟达 50 系显卡,可根据 [安装文档](../docs/version3.x/installation.md) 安装对应版本的 paddle 框架。**
+    - **PP-OCR 系列模型支持返回单文字坐标。**
+    - 模型新增 AIStudio、ModelScope 等下载源。可指定相关下载源下载对应的模型。
+    - 支持图表转表PP-Chart2Table单功能模块推理能力。
+    - 优化部分使用文档中的描述,提升易用性。
+
 #### 2025.08.15: **PaddleOCR 3.1.1** 发布,新增能力如下:
 - **bug修复:**
diff --git a/readme/README_ar.md b/readme/README_ar.md
index 80f92b7a76..91e031977f 100644
--- a/readme/README_ar.md
+++ b/readme/README_ar.md
@@ -50,6 +50,84 @@
 ## 📣 آخر التحديثات
+

2025.08.21: إصدار PaddleOCR 3.2.0، يتضمن:

+

2025.08.15: إصدار PaddleOCR 3.1.1، يتضمن: