From f35b9a0c0c32a8c09e7f7c83c5e190793cbfabbe Mon Sep 17 00:00:00 2001 From: zhang-prog <69562787+zhang-prog@users.noreply.github.com> Date: Mon, 19 May 2025 23:18:11 +0800 Subject: [PATCH] update docs (#15181) * update docs * update v3 model_list * update quick_start and PP-StructureV3 --- docs/version2.x/legacy/index.en.md | 30 +- docs/version2.x/legacy/index.md | 6 +- docs/version2.x/ppocr/installation.en.md | 85 ++ docs/version2.x/ppocr/installation.md | 74 +- .../high_performance_inference.en.md | 94 ++ .../deployment/obtaining_onnx_models.en.md | 48 + .../deployment/on_device_deployment.en.md | 3 + docs/version3.x/deployment/serving.en.md | 92 ++ docs/version3.x/deployment/serving.md | 4 +- docs/version3.x/installation.md | 50 + docs/version3.x/logging.en.md | 22 + docs/version3.x/model_list.md | 922 ++++++++++++++++++ docs/version3.x/paddleocr_and_paddlex.en.md | 69 ++ .../pipeline_usage/PP-StructureV3.md | 10 +- docs/version3.x/quick_start.md | 175 ++++ 15 files changed, 1638 insertions(+), 46 deletions(-) create mode 100644 docs/version2.x/ppocr/installation.en.md create mode 100644 docs/version3.x/deployment/high_performance_inference.en.md create mode 100644 docs/version3.x/deployment/obtaining_onnx_models.en.md create mode 100644 docs/version3.x/deployment/on_device_deployment.en.md create mode 100644 docs/version3.x/deployment/serving.en.md create mode 100644 docs/version3.x/installation.md create mode 100644 docs/version3.x/logging.en.md create mode 100644 docs/version3.x/model_list.md create mode 100644 docs/version3.x/paddleocr_and_paddlex.en.md create mode 100644 docs/version3.x/quick_start.md diff --git a/docs/version2.x/legacy/index.en.md b/docs/version2.x/legacy/index.en.md index f8f6e79c01..643ae600c2 100644 --- a/docs/version2.x/legacy/index.en.md +++ b/docs/version2.x/legacy/index.en.md @@ -1,24 +1,30 @@ --- -comments: true typora-copy-images-to: images +comments: true hide: - toc --- -# PP-OCR Deployment +# Legacy Features -## Paddle Deployment Introduction +## Overview -Paddle provides a variety of deployment schemes to meet the deployment requirements of different scenarios. Please choose according to the actual situation: +This section introduces the features and models related to the PaddleOCR 2.x branch. Due to the upgrades in the 3.x branch, some models and features are no longer compatible with the older branch. Therefore, users who need to use or refer to the features of the older branch can refer to this part of the documentation. -![img](./images/deployment_en.jpg) +## Models Supported by PaddleOCR 2.x Branch: -PP-OCR has supported multi deployment schemes. Click the link to get the specific tutorial. 
+* [Model List](model_list.md) -- [Python Inference](./python_infer.en.md) -- [C++ Inference](./cpp_infer.en.md) -- [Serving (Python/C++)](./paddle_server.en.md) -- [Paddle-Lite (ARM CPU/OpenCL ARM GPU)](./lite.en.md) -- [Paddle2ONNX](./paddle2onnx.en.md) +## Features Supported by PaddleOCR 2.x Branch: -If you need the deployment tutorial of academic algorithm models other than PP-OCR, please directly enter the main page of corresponding algorithms, [entrance](../../algorithm/overview.en.md)。 +* [Inference with Python Prediction Engine](python_infer.en.md) +* [Inference with C++ Prediction Engine](cpp_infer.en.md) +* [Compilation Guide for Visual Studio 2019 Community CMake](windows_vs2019_build.en.md) +* [Service-Oriented Deployment](paddle_server.en.md) +* [Android Deployment](android_demo.en.md) +* [Jetson Deployment](Jetson_infer.en.md) +* [Edge Device Deployment](lite.en.md) +* [Web Frontend Deployment](paddle_js.en.md) +* [Paddle2ONNX Model Conversion and Prediction](paddle2onnx.en.md) +* [PaddlePaddle Cloud Deployment Tool](paddle_cloud.en.md) +* [Benchmark](benchmark.en.md) diff --git a/docs/version2.x/legacy/index.md b/docs/version2.x/legacy/index.md index 7952d9f5c4..2dd679ea08 100644 --- a/docs/version2.x/legacy/index.md +++ b/docs/version2.x/legacy/index.md @@ -9,13 +9,13 @@ hide: ## 概述 -本节介绍了 PaddleOCR 2.x 版本的相关功能和模型。由于最新版本的升级,部分模型和功能与旧版本不再兼容。因此,需要使用或参考旧版本特性的用户可以参考这部分文档。 +本节介绍了 PaddleOCR 2.x 分支的相关功能和模型。由于 3.x 分支的升级,部分模型和功能与旧分支不再兼容。因此,需要使用或参考旧分支特性的用户可以参考这部分文档。 -## PaddleOCR 2.x 版本支持的模型: +## PaddleOCR 2.x 分支支持的模型: * [模型列表](model_list.md) -## PaddleOCR 2.x 版本支持的功能: +## PaddleOCR 2.x 分支支持的功能: * [基于Python预测引擎推理](python_infer.md) * [基于C++预测引擎推理](cpp_infer.md) diff --git a/docs/version2.x/ppocr/installation.en.md b/docs/version2.x/ppocr/installation.en.md new file mode 100644 index 0000000000..244e88dffa --- /dev/null +++ b/docs/version2.x/ppocr/installation.en.md @@ -0,0 +1,85 @@ +--- +comments: true +--- + +## Quick Installation + +After testing, PaddleOCR can run on glibc 2.23. You can also test other glibc versions or install glibc 2.23 for the best compatibility. + +PaddleOCR working environment: + +- PaddlePaddle > 2.0.0 +- Python 3 +- glibc 2.23 + +It is recommended to use the docker provided by us to run PaddleOCR. Please refer to the docker tutorial [link](https://www.runoob.com/docker/docker-tutorial.html/). + +*If you want to directly run the prediction code on Mac or Windows, you can start from step 2.* + +### 1. (Recommended) Prepare a docker environment + +For the first time you use this docker image, it will be downloaded automatically. Please be patient. + +```bash linenums="1" +# Switch to the working directory +cd /home/Projects +# You need to create a docker container for the first run, and do not need to run the current command when you run it again +# Create a docker container named ppocr and map the current directory to the /paddle directory of the container + +#If using CPU, use docker instead of nvidia-docker to create docker +sudo docker run --name ppocr -v $PWD:/paddle --network=host -it paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82 /bin/bash +``` + +With CUDA10, please run the following command to create a container. 
+It is recommended to set a shared memory size of at least 32G through the --shm-size parameter:
+
+```bash linenums="1"
+sudo nvidia-docker run --name ppocr -v $PWD:/paddle --shm-size=64G --network=host -it paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82 /bin/bash
+```
+
+You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to get the image that fits your machine.
+
+```bash linenums="1"
+# ctrl+P+Q to exit docker; to re-enter docker, use the following command:
+sudo docker container exec -it ppocr /bin/bash
+```
+
+### 2. Install PaddlePaddle 2.0
+
+```bash linenums="1"
+pip3 install --upgrade pip
+
+# If you have CUDA 9 or CUDA 10 installed on your machine, please run the following command to install
+python3 -m pip install paddlepaddle-gpu==2.0.0 -i https://mirror.baidu.com/pypi/simple
+
+# If your machine is CPU-only, please run the following command to install
+python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
+```
+
+For more software version requirements, please refer to the instructions in the [Installation Document](https://www.paddlepaddle.org.cn/install/quick).
+
+### 3. Clone the PaddleOCR repo
+
+```bash linenums="1"
+# Recommended
+git clone https://github.com/PaddlePaddle/PaddleOCR
+
+# If you cannot pull successfully due to network problems, you can switch to the mirror hosted on Gitee:
+
+git clone https://gitee.com/paddlepaddle/PaddleOCR
+
+# Note: The mirror on Gitee may not stay in sync with the latest updates to the project on GitHub; there may be a delay of 3-5 days. Please try GitHub first.
+```
+
+### 4. Install third-party libraries
+
+```bash linenums="1"
+cd PaddleOCR
+pip3 install -r requirements.txt
+```
+
+If you get the error `OSError: [WinError 126] The specified module could not be found` when installing Shapely on Windows, please try downloading the Shapely wheel file from [http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).
+
+Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)
diff --git a/docs/version2.x/ppocr/installation.md b/docs/version2.x/ppocr/installation.md
index 74a3eeef65..30e35795fb 100644
--- a/docs/version2.x/ppocr/installation.md
+++ b/docs/version2.x/ppocr/installation.md
@@ -2,49 +2,75 @@
 comments: true
 ---
 
-# 安装
+## 快速安装
 
-# 1. 安装飞桨框架
+经测试,PaddleOCR 可在 glibc 2.23 上运行,您也可以测试其他 glibc 版本或安装 glibc 2.23。
+PaddleOCR 工作环境:
 
-请参考 [飞桨官网](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) 安装 `3.0` 及以上版本的飞桨框架。**推荐使用飞桨官方 Docker 镜像。**
+- PaddlePaddle 2.0.0
+- python3
+- glibc 2.23
+- cuDNN 7.6+ (GPU)
 
-# 2. 安装 PaddleOCR
+建议使用我们提供的docker运行PaddleOCR,有关docker、nvidia-docker的使用请参考[链接](https://www.runoob.com/docker/docker-tutorial.html/)。
 
-如果只希望使用 PaddleOCR 的推理功能,请参考 [安装推理包](#21-安装推理包);如果希望进行模型训练、导出等,请参考 [安装训练依赖](#22-安装训练依赖)。在同一环境中安装推理包和训练依赖是允许的,无需进行环境隔离。
+*如您希望使用 mac 或 windows 直接运行预测代码,可以从第 2 步开始执行。*
 
-## 2.1 安装推理包
+### 1. (建议)准备docker环境
 
-从 PyPI 安装最新版本 PaddleOCR 推理包:
+第一次使用该镜像时会自动下载,请耐心等待。
 
-```bash
-python -m pip install paddleocr
+```bash linenums="1"
+# 切换到工作目录下
+cd /home/Projects
+# 首次运行需创建一个docker容器,再次运行时不需要运行当前命令
+# 创建一个名字为ppocr的docker容器,并将当前目录映射到容器的/paddle目录下
+
+# 如果您希望在CPU环境下使用docker,请使用docker而不是nvidia-docker创建容器
+sudo docker run --name ppocr -v $PWD:/paddle --network=host -it paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82 /bin/bash
+
+# 如果使用CUDA10,请运行以下命令创建容器,设置docker容器共享内存shm-size为64G,建议设置32G以上
+sudo nvidia-docker run --name ppocr -v $PWD:/paddle --shm-size=64G --network=host -it paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82 /bin/bash
+
+# ctrl+P+Q可退出docker容器,重新进入docker容器使用如下命令
+sudo docker container exec -it ppocr /bin/bash
 ```
 
-或者从源码安装(默认为开发分支):
+您也可以访问 [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) 获取与您机器适配的镜像。
+
+### 2. 安装PaddlePaddle 2.0
 
-```bash
-python -m pip install "git+https://github.com/PaddlePaddle/PaddleOCR.git"
+```bash linenums="1"
+pip3 install --upgrade pip
+
+# 如果您的机器安装的是CUDA9或CUDA10,请运行以下命令安装
+python3 -m pip install paddlepaddle-gpu==2.0.0 -i https://mirror.baidu.com/pypi/simple
+
+# 如果您的机器是CPU,请运行以下命令安装
+python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
 ```
 
-## 2.2 安装训练依赖
+更多的版本需求,请参照[安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
 
-要进行模型训练、导出等,需要首先将仓库克隆到本地:
-
-```bash
-# 推荐方式
+### 3. 克隆PaddleOCR repo代码
+
+```bash linenums="1"
+#【推荐】
 git clone https://github.com/PaddlePaddle/PaddleOCR
 
-# (可选)切换到指定分支
-git checkout release/3.0
+# 如果因为网络问题无法pull成功,也可选择使用码云上的托管:
 
-# 如果因为网络问题无法克隆成功,也可选择使用码云上的仓库:
 git clone https://gitee.com/paddlepaddle/PaddleOCR
 
-# 注:码云托管代码可能无法实时同步本 GitHub 项目更新,存在3~5天延时,请优先使用推荐方式。
+# 注:码云托管代码可能无法实时同步本GitHub项目更新,存在3~5天延时,请优先使用推荐方式。
 ```
 
-执行如下命令安装依赖:
+### 4. 安装第三方库
 
-```bash
-python -m pip install -r requirements.txt
+```bash linenums="1"
+cd PaddleOCR
+pip3 install -r requirements.txt
 ```
+
+注意:Windows环境下,建议从[这里](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely)下载shapely安装包完成安装,直接通过pip安装的shapely库可能出现`[WinError 126] 找不到指定模块`的问题。
diff --git a/docs/version3.x/deployment/high_performance_inference.en.md b/docs/version3.x/deployment/high_performance_inference.en.md
new file mode 100644
index 0000000000..4d93121431
--- /dev/null
+++ b/docs/version3.x/deployment/high_performance_inference.en.md
@@ -0,0 +1,94 @@
+# High-Performance Inference
+
+In real-world production environments, many applications have stringent performance requirements for deployment strategies, particularly regarding response speed, to ensure efficient system operation and a smooth user experience. PaddleOCR provides high-performance inference capabilities, allowing users to enhance model inference speed with a single click without worrying about complex configurations or underlying details. Specifically, PaddleOCR's high-performance inference functionality can:
+
+- Automatically select an appropriate inference backend (e.g., Paddle Inference, OpenVINO, ONNX Runtime, TensorRT) based on prior knowledge and configure acceleration strategies (e.g., increasing the number of inference threads, setting FP16 precision inference);
+- Automatically convert PaddlePaddle static graph models to ONNX format as needed to leverage better inference backends for acceleration.
+
+This document primarily introduces the installation and usage methods for high-performance inference.
+
+## 1. Prerequisites
+
+### 1.1 Install High-Performance Inference Dependencies
+
+Install the dependencies required for high-performance inference using the PaddleOCR CLI:
+
+```bash
+paddleocr install_hpi_deps {device_type}
+```
+
+The supported device types are:
+
+- `cpu`: For CPU-only inference. Currently supports Linux systems, x86-64 architecture processors, and Python 3.8-3.12.
+- `gpu`: For inference using either the CPU or an NVIDIA GPU. Currently supports Linux systems, x86-64 architecture processors, and Python 3.8-3.12. Refer to the next subsection for detailed instructions.
+
+Only one type of device dependency should exist in the same environment. For Windows systems, it is currently recommended to install within a Docker container or [WSL](https://learn.microsoft.com/en-us/windows/wsl/install) environment.
+
+**It is recommended to use the official PaddlePaddle Docker image to install high-performance inference dependencies.** The corresponding images for each device type are as follows:
+
+- `cpu`: `ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.0.0`
+- `gpu`:
+  - CUDA 11.8: `ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.0.0-gpu-cuda11.8-cudnn8.9-trt8.6`
+
+### 1.2 Detailed GPU Environment Instructions
+
+First, ensure that the environment has the required CUDA and cuDNN versions installed. Currently, PaddleOCR only supports CUDA 11.8 + cuDNN 8.9. Below are the installation instructions for CUDA 11.8 and cuDNN 8.9:
+
+- [Install CUDA 11.8](https://developer.nvidia.com/cuda-11-8-0-download-archive)
+- [Install cuDNN 8.9](https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-890/install-guide/index.html)
+
+If using the official PaddlePaddle image, the CUDA and cuDNN versions in the image already meet the requirements, and no additional installation is needed.
+
+If installing PaddlePaddle via pip, the relevant Python packages for CUDA and cuDNN will typically be installed automatically. In this case, **you still need to install the non-Python-specific CUDA and cuDNN libraries.** It is also recommended to install CUDA and cuDNN versions that match the versions of the Python packages in your environment, to avoid issues caused by coexisting library versions. You can check the versions of the CUDA and cuDNN-related Python packages with the following commands:
+
+```bash
+# CUDA-related Python package versions
+pip list | grep nvidia-cuda
+# cuDNN-related Python package versions
+pip list | grep nvidia-cudnn
+```
+
+Second, ensure that the environment has the required TensorRT version installed. Currently, PaddleOCR only supports TensorRT 8.6.1.6. If using the official PaddlePaddle image, you can install the TensorRT wheel package with the following command:
+
+```bash
+python -m pip install /usr/local/TensorRT-*/python/tensorrt-*-cp310-none-linux_x86_64.whl
+```
+
+For other environments, refer to the [TensorRT documentation](https://docs.nvidia.com/deeplearning/tensorrt/archives/index.html) to install TensorRT. Here is an example:
+
+```bash
+# Download the TensorRT tar file
+wget https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/tars/TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
+# Extract the TensorRT tar file
+tar xvf TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
+# Install the TensorRT wheel package
+python -m pip install TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp310-none-linux_x86_64.whl
+# Add the absolute path of the TensorRT `lib` directory to LD_LIBRARY_PATH
+export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$(pwd)/TensorRT-8.6.1.6/lib"
+```
+
+## 2. Executing High-Performance Inference
+
+For the PaddleOCR CLI, specify `--enable_hpi` as `True` to execute high-performance inference. For example:
+
+```bash
+paddleocr ocr --enable_hpi True ...
+```
+
+For the PaddleOCR Python API, set `enable_hpi` to `True` when initializing the pipeline or module object to enable high-performance inference when calling the inference method. For example:
+
+```python
+from paddleocr import PaddleOCR
+pipeline = PaddleOCR(enable_hpi=True)
+result = pipeline.predict(...)
+```
+
+## 3. Notes
+
+1. For some models, the first execution of high-performance inference may take longer while the inference engine is built. Relevant information about the inference engine is cached in the model directory after the first build, and subsequent initializations can reuse the cached content to improve speed.
+
+2. Currently, some models may not achieve inference acceleration, for reasons such as the model not being in static graph format or containing unsupported operators.
+
+3. During high-performance inference, PaddleOCR automatically handles the conversion of model formats and selects the optimal inference backend whenever possible. Additionally, PaddleOCR supports users specifying ONNX models. For information on converting PaddlePaddle static graph models to ONNX format, refer to [Obtaining ONNX Models](./obtaining_onnx_models.en.md).
+
+4. The high-performance inference capabilities of PaddleOCR rely on PaddleX and its high-performance inference plugins. By passing in a custom PaddleX pipeline configuration file, you can configure the inference backend and other related settings. Please refer to [Using PaddleX Pipeline Configuration Files](../paddleocr_and_paddlex.en.md#3-Using-PaddleX-Pipeline-Configuration-Files) and the [PaddleX High-Performance Inference Guide](https://paddlepaddle.github.io/PaddleX/3.0/en/pipeline_deploy/high_performance_inference.html#22) to learn how to adjust the high-performance inference configurations.
diff --git a/docs/version3.x/deployment/obtaining_onnx_models.en.md b/docs/version3.x/deployment/obtaining_onnx_models.en.md
new file mode 100644
index 0000000000..25ca0002eb
--- /dev/null
+++ b/docs/version3.x/deployment/obtaining_onnx_models.en.md
@@ -0,0 +1,48 @@
+# Obtaining ONNX Models
+
+PaddleOCR provides a rich collection of pre-trained models, all stored in PaddlePaddle's static graph format. To use these models in ONNX format during deployment, you can convert them using the Paddle2ONNX plugin provided by PaddleX. For more information about PaddleX and its relationship with PaddleOCR, refer to [Differences and Connections Between PaddleOCR and PaddleX](../paddleocr_and_paddlex.en.md#1-Differences-and-Connections-Between-PaddleOCR-and-PaddleX).
+
+First, install the Paddle2ONNX plugin for PaddleX using the following command via the PaddleX CLI:
+
+```bash
+paddlex --install paddle2onnx
+```
+
+Then, execute the following command to complete the model conversion:
+
+```bash
+# Use the paddle2onnx feature: specify the directory containing the Paddle model,
+# the output directory for the converted ONNX model, and the ONNX opset version to use.
+# (Comments are placed above the command because bash does not allow trailing
+# comments after line-continuation backslashes.)
+paddlex \
+    --paddle2onnx \
+    --paddle_model_dir /your/paddle_model/dir \
+    --onnx_model_dir /your/onnx_model/output/dir \
+    --opset_version 7
+```
+
+The parameters are described as follows:
+
+<table>
+<thead>
+<tr>
ParameterTypeDescription
paddle_model_dirstrThe directory containing the Paddle model.
onnx_model_dirstrThe output directory for the ONNX model. It can be the same as the Paddle model directory. Defaults to onnx.
opset_versionintThe ONNX opset version to use. If conversion fails with a lower opset version, a higher version will be automatically selected for conversion. Defaults to 7.
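+
+After conversion, you can quickly sanity-check the exported model before deploying it. The following snippet is a minimal sketch rather than part of the converter's documented behavior: it assumes the `onnxruntime` package is installed and that the output directory contains a file named `inference.onnx` (the actual file name may differ for your model):
+
+```python
+# Minimal sanity check for a converted ONNX model.
+# The path and file name below are illustrative assumptions.
+# Requires: python -m pip install onnxruntime
+import onnxruntime as ort
+
+session = ort.InferenceSession("/your/onnx_model/output/dir/inference.onnx")
+
+# Print the input signature to confirm that the graph loads correctly.
+for inp in session.get_inputs():
+    print(inp.name, inp.shape, inp.type)
+```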
diff --git a/docs/version3.x/deployment/on_device_deployment.en.md b/docs/version3.x/deployment/on_device_deployment.en.md new file mode 100644 index 0000000000..d9cce60d3d --- /dev/null +++ b/docs/version3.x/deployment/on_device_deployment.en.md @@ -0,0 +1,3 @@ +# Edge Deployment + +The PaddleOCR model can be deployed on edge devices using the [PaddleX Edge Deployment Solution](https://paddlepaddle.github.io/PaddleX/3.0/en/pipeline_deploy/edge_deploy.html). For more information about PaddleX and its relationship with PaddleOCR, please refer to [Differences and Connections between PaddleOCR and PaddleX](../paddleocr_and_paddlex.en.md#1-Differences-and-Connections-Between-PaddleOCR-and-PaddleX). diff --git a/docs/version3.x/deployment/serving.en.md b/docs/version3.x/deployment/serving.en.md new file mode 100644 index 0000000000..9949c73ae3 --- /dev/null +++ b/docs/version3.x/deployment/serving.en.md @@ -0,0 +1,92 @@ +# Serving Deployment + +Serving deployment is a common deployment method in real-world production environments. By encapsulating inference capabilities as services, clients can access these services via network requests to obtain inference results. PaddleOCR recommends using [PaddleX](https://github.com/PaddlePaddle/PaddleX) for serving deployment. Please refer to [Differences and Connections between PaddleOCR and PaddleX](../paddleocr_and_paddlex.en.md#1-Differences-and-Connections-Between-PaddleOCR-and-PaddleX) to understand the relationship between PaddleOCR and PaddleX. + +PaddleX provides the following serving deployment solutions: + +- **Basic Serving Deployment**: An easy-to-use serving deployment solution with low development costs. +- **High-Stability Serving Deployment**: Built based on [NVIDIA Triton Inference Server](https://developer.nvidia.com/triton-inference-server). Compared to the basic serving deployment, this solution offers higher stability and allows users to adjust configurations to optimize performance. + +**It is recommended to first use the basic serving deployment solution for quick validation**, and then evaluate whether to try more complex solutions based on actual needs. + +## 1. Basic Serving Deployment + +### 1.1 Install Dependencies + +Run the following command to install the PaddleX serving deployment plugin via PaddleX CLI: + +```bash +paddlex --install serving +``` + +### 1.2 Run the Server + +Run the server via PaddleX CLI: + +```bash +paddlex --serve --pipeline {PaddleX pipeline registration name or pipeline configuration file path} [{other command-line options}] +``` + +Take the general OCR pipeline as an example: + +```bash +paddlex --serve --pipeline OCR +``` + +You should see information similar to the following: + +```text +INFO: Started server process [63108] +INFO: Waiting for application startup. +INFO: Application startup complete. +INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit) +``` + +To adjust configurations (such as model path, batch size, deployment device, etc.), specify `--pipeline` as a custom configuration file. Refer to [PaddleOCR and PaddleX](../paddleocr_and_paddlex.en.md) for the mapping between PaddleOCR pipelines and PaddleX pipeline registration names, as well as how to obtain and modify PaddleX pipeline configuration files. + +The command-line options related to serving deployment are as follows: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
NameDescription
--pipelinePaddleX pipeline registration name or pipeline configuration file path.
--deviceDeployment device for the pipeline. Defaults to cpu (if GPU is unavailable) or gpu (if GPU is available).
--hostHostname or IP address to which the server is bound. Defaults to 0.0.0.0.
--portPort number on which the server listens. Defaults to 8080.
--use_hpipIf specified, uses high-performance inference.
--hpi_configHigh-performance inference configuration. Refer to the PaddleX High-Performance Inference Guide for more information.
+ +### 1.3 Invoke the Service + +The "Development Integration/Deployment" section in the PaddleOCR pipeline tutorial provides API references and multi-language invocation examples for the service. + +## 2. High-Stability Serving Deployment + +Please refer to the [PaddleX Serving Deployment Guide](https://paddlepaddle.github.io/PaddleX/3.0/en/pipeline_deploy/serving.html#2). More information about PaddleX pipeline configuration files can be found in [Using PaddleX Pipeline Configuration Files](../paddleocr_and_paddlex.en.md#3-using-paddlex-pipeline-configuration-files). + +It should be noted that, due to the lack of fine-grained optimization and other reasons, the current high-stability serving deployment solution provided by PaddleOCR may not match the performance of the 2.x version based on PaddleServing. However, this new solution fully supports the PaddlePaddle 3.0 framework. We will continue to optimize it and consider introducing more performant deployment solutions in the future. diff --git a/docs/version3.x/deployment/serving.md b/docs/version3.x/deployment/serving.md index 91a0902a8b..06ee5f7cd3 100644 --- a/docs/version3.x/deployment/serving.md +++ b/docs/version3.x/deployment/serving.md @@ -42,7 +42,7 @@ INFO: Application startup complete. INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit) ``` -如需调整配置(如模型路径、batch size、部署设备等),可指定 `--pipeline` 为自定义配置文件。请参考 [PaddleOCR 与 PaddleX](../advanced/paddleocr_and_paddlex.md) 了解 PaddleOCR 产线与 PaddleX 产线注册名的对应关系,以及 PaddleX 产线配置文件的获取与修改方式。 +如需调整配置(如模型路径、batch size、部署设备等),可指定 `--pipeline` 为自定义配置文件。请参考 [PaddleOCR 与 PaddleX](../paddleocr_and_paddlex.md) 了解 PaddleOCR 产线与 PaddleX 产线注册名的对应关系,以及 PaddleX 产线配置文件的获取与修改方式。 与服务化部署相关的命令行选项如下: @@ -84,7 +84,7 @@ INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit) ### 1.3 调用服务 -PaddleX 产线使用教程中的 “开发集成/部署” 部分提供了服务的 API 参考与多语言调用示例。在 [PaddleX模型产线使用概览](https://paddlepaddle.github.io/PaddleX/3.0/pipeline_usage/pipeline_develop_guide.html) 中可以找到各产线的使用教程。 +PaddleOCR 产线使用教程中的 “开发集成/部署” 部分提供了服务的 API 参考与多语言调用示例。 ## 2. 高稳定性服务化部署 diff --git a/docs/version3.x/installation.md b/docs/version3.x/installation.md new file mode 100644 index 0000000000..74a3eeef65 --- /dev/null +++ b/docs/version3.x/installation.md @@ -0,0 +1,50 @@ +--- +comments: true +--- + +# 安装 + +# 1. 安装飞桨框架 + +请参考 [飞桨官网](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) 安装 `3.0` 及以上版本的飞桨框架。**推荐使用飞桨官方 Docker 镜像。** + +# 2. 安装 PaddleOCR + +如果只希望使用 PaddleOCR 的推理功能,请参考 [安装推理包](#21-安装推理包);如果希望进行模型训练、导出等,请参考 [安装训练依赖](#22-安装训练依赖)。在同一环境中安装推理包和训练依赖是允许的,无需进行环境隔离。 + +## 2.1 安装推理包 + +从 PyPI 安装最新版本 PaddleOCR 推理包: + +```bash +python -m pip install paddleocr +``` + +或者从源码安装(默认为开发分支): + +```bash +python -m pip install "git+https://github.com/PaddlePaddle/PaddleOCR.git" +``` + +## 2.2 安装训练依赖 + +要进行模型训练、导出等,需要首先将仓库克隆到本地: + +```bash +# 推荐方式 +git clone https://github.com/PaddlePaddle/PaddleOCR + +# (可选)切换到指定分支 +git checkout release/3.0 + +# 如果因为网络问题无法克隆成功,也可选择使用码云上的仓库: +git clone https://gitee.com/paddlepaddle/PaddleOCR + +# 注:码云托管代码可能无法实时同步本 GitHub 项目更新,存在3~5天延时,请优先使用推荐方式。 +``` + +执行如下命令安装依赖: + +```bash +python -m pip install -r requirements.txt +``` diff --git a/docs/version3.x/logging.en.md b/docs/version3.x/logging.en.md new file mode 100644 index 0000000000..7892732686 --- /dev/null +++ b/docs/version3.x/logging.en.md @@ -0,0 +1,22 @@ +# Logging + +This document mainly introduces how to configure the logging system for the PaddleOCR inference package. 
It's important to note that PaddleOCR's inference package uses a different logging system than the training scripts, and this document does not cover the configuration of the logging system used in the training scripts. + +PaddleOCR has built a centralized logging system based on Python's [`logging` standard library](https://docs.python.org/3/library/logging.html#module-logging). In other words, PaddleOCR uses a single logger, which can be accessed and configured via `paddleocr.logger`. + +By default, the logging level in PaddleOCR is set to `ERROR`, meaning that log messages will only be output if their level is `ERROR` or higher (e.g., `CRITICAL`). PaddleOCR also configures a `StreamHandler` for this logger, which outputs logs to the standard error stream, and sets the logger's `propagate` attribute to `False` to prevent log messages from being passed to its parent logger. + +If you wish to disable PaddleOCR's automatic logging configuration behavior, you can set the environment variable `DISABLE_AUTO_LOGGING_CONFIG` to `1`. In this case, PaddleOCR will not perform any additional configuration of the logger. + +For more flexible customization of logging behavior, refer to the relevant documentation of the `logging` standard library. Below is an example of writing logs to a file: + +```python +import logging +from paddleocr import logger + +# Write logs to the file `paddleocr.log` +fh = logging.FileHandler("paddleocr.log") +logger.addHandler(fh) +``` + +Please note that other libraries that PaddleOCR depends on (such as [PaddleX](./paddleocr_and_paddlex.en.md)) have their own independent logging systems, and the above configuration will not affect the log output of these libraries. diff --git a/docs/version3.x/model_list.md b/docs/version3.x/model_list.md new file mode 100644 index 0000000000..9b3971ae23 --- /dev/null +++ b/docs/version3.x/model_list.md @@ -0,0 +1,922 @@ +--- +comments: true +--- + +# PaddleOCR模型列表(CPU/GPU) + +PaddleOCR 内置了多条产线,每条产线都包含了若干模块,每个模块包含若干模型,具体使用哪些模型,您可以根据下边的 benchmark 数据来选择。如您更考虑模型精度,请选择精度较高的模型,如您更考虑模型推理速度,请选择推理速度较快的模型,如您更考虑模型存储大小,请选择存储大小较小的模型。 + +## [文本检测模块](../module_usage/tutorials/ocr_modules/text_detection.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型检测Hmean(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml 文件模型下载链接
PP-OCRv5_server_det-- / -- / -101PP-OCRv5_server_det.yaml推理模型/训练模型
PP-OCRv5_mobile_det-- / -- / -20PP-OCRv5_mobile_det.yaml推理模型/训练模型
PP-OCRv4_server_det82.5683.34 / 80.91442.58 / 442.58109PP-OCRv4_server_det.yaml推理模型/训练模型
PP-OCRv4_mobile_det77.358.79 / 3.1351.00 / 28.584.7PP-OCRv4_mobile_det.yaml推理模型/训练模型
PP-OCRv3_mobile_det78.688.44 / 2.9127.87 / 27.872.1PP-OCRv3_mobile_det.yaml推理模型/训练模型
PP-OCRv3_server_det80.1165.41 / 13.67305.07 / 305.07102.1PP-OCRv3_server_det.yaml推理模型/训练模型
+注:以上精度指标的评估集是 PaddleOCR 自建的中英文数据集,覆盖街景、网图、文档、手写多个场景,其中文本识别包含 593 张图片。 + +## [印章文本检测模块](../module_usage/tutorials/ocr_modules/seal_text_detection.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型名称检测Hmean(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小yaml 文件模型下载链接
PP-OCRv4_mobile_seal_det96.477.82 / 3.0948.28 / 23.974.7MPP-OCRv4_mobile_seal_det.yaml推理模型/训练模型
PP-OCRv4_server_seal_det98.2174.75 / 67.72382.55 / 382.55108.3 MPP-OCRv4_server_seal_det.yaml推理模型/训练模型
+注:以上精度指标的评估集是 PaddleX 自建的印章数据集,包含500印章图像。 + +## [文本识别模块](../module_usage/tutorials/ocr_modules/text_recognition.md) + +* 中文识别模型 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型识别 Avg Accuracy(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml 文件模型下载链接
PP-OCRv5_server_rec-- / -- / -206 MPP-OCRv5_server_rec.yaml推理模型/训练模型
PP-OCRv5_mobile_rec-- / -- / -137 MPP-OCRv5_mobile_rec.yaml推理模型/训练模型
PP-OCRv4_server_rec_doc81.536.65 / 2.3832.92 / 32.9274.7 MPP-OCRv4_server_rec_doc.yaml推理模型/训练模型
PP-OCRv4_mobile_rec78.744.82 / 1.2016.74 / 4.6410.6 MPP-OCRv4_mobile_rec.yaml推理模型/训练模型
PP-OCRv4_server_rec 80.61 6.58 / 2.4333.17 / 33.1771.2 MPP-OCRv4_server_rec.yaml推理模型/训练模型
PP-OCRv3_mobile_rec72.965.87 / 1.199.07 / 4.289.2 MPP-OCRv3_mobile_rec.yaml推理模型/训练模型
+

注:以上精度指标的评估集是 PaddleOCR 自建的中文数据集,覆盖街景、网图、文档、手写多个场景,其中文本识别包含 8367 张图片。

+ + + + + + + + + + + + + + + + + + + +
模型识别 Avg Accuracy(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml 文件模型下载链接
ch_SVTRv2_rec68.818.08 / 2.7450.17 / 42.5073.9 Mch_SVTRv2_rec.yaml推理模型/训练模型
+

注:以上精度指标的评估集是 PaddleOCR算法模型挑战赛 - 赛题一:OCR端到端识别任务A榜。

+ + + + + + + + + + + + + + + + + + + +
模型识别 Avg Accuracy(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml 文件模型下载链接
ch_RepSVTR_rec65.075.93 / 1.6220.73 / 7.3222.1 Mch_RepSVTR_rec.yaml推理模型/训练模型
+

注:以上精度指标的评估集是 PaddleOCR算法模型挑战赛 - 赛题一:OCR端到端识别任务B榜。

+ +* 英文识别模型 + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型识别 Avg Accuracy(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml 文件模型下载链接
en_PP-OCRv4_mobile_rec 70.394.81 / 0.7516.10 / 5.316.8 Men_PP-OCRv4_mobile_rec.yaml推理模型/训练模型
en_PP-OCRv3_mobile_rec70.695.44 / 0.758.65 / 5.577.8 M en_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
+ +

注:以上精度指标的评估集是 PaddleX 自建的英文数据集。

+ +* 多语言识别模型 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型识别 Avg Accuracy(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml 文件模型下载链接
korean_PP-OCRv3_mobile_rec60.215.40 / 0.979.11 / 4.058.6 Mkorean_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
japan_PP-OCRv3_mobile_rec45.695.70 / 1.028.48 / 4.078.8 M japan_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
chinese_cht_PP-OCRv3_mobile_rec82.065.90 / 1.289.28 / 4.349.7 M chinese_cht_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
te_PP-OCRv3_mobile_rec95.885.42 / 0.828.10 / 6.917.8 M te_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
ka_PP-OCRv3_mobile_rec96.965.25 / 0.799.09 / 3.868.0 M ka_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
ta_PP-OCRv3_mobile_rec76.835.23 / 0.7510.13 / 4.308.0 M ta_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
latin_PP-OCRv3_mobile_rec76.935.20 / 0.798.83 / 7.157.8 Mlatin_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
arabic_PP-OCRv3_mobile_rec73.555.35 / 0.798.80 / 4.567.8 Marabic_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
cyrillic_PP-OCRv3_mobile_rec94.285.23 / 0.768.89 / 3.887.9 M cyrillic_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
devanagari_PP-OCRv3_mobile_rec96.445.22 / 0.798.56 / 4.067.9 Mdevanagari_PP-OCRv3_mobile_rec.yaml推理模型/训练模型
+

注:以上精度指标的评估集是 PaddleX 自建的多语种数据集。

+ +## [公式识别模块](../module_usage/tutorials/ocr_modules/formula_recognition.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型En-BLEU(%)Zh-BLEU(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小 (M)yaml 文件模型下载链接
UniMERNet85.9143.502266.96/--/-1.53 GUniMERNet.yaml推理模型/训练模型
PP-FormulaNet-S87.0045.71202.25/--/-224 MPP-FormulaNet-S.yaml推理模型/训练模型
PP-FormulaNet-L90.3645.781976.52/--/-695 MPP-FormulaNet-L.yaml推理模型/训练模型
PP-FormulaNet_plus-S88.7153.32191.69/--/-248 MPP-FormulaNet_plus-S.yaml推理模型/训练模型
PP-FormulaNet_plus-M91.4589.761301.56/--/-592 MPP-FormulaNet_plus-M.yaml推理模型/训练模型
PP-FormulaNet_plus-L92.2290.641745.25/--/-698 MPP-FormulaNet_plus-L.yaml推理模型/训练模型
LaTeX_OCR_rec74.5539.961244.61/--/-99 MLaTeX_OCR_rec.yaml推理模型/训练模型
+注:以上精度指标测量自 PaddleX 内部自建公式识别测试集。LaTeX_OCR_rec在LaTeX-OCR公式识别测试集的BLEU score为 0.8821。 + +## [表格结构识别模块](../module_usage/tutorials/ocr_modules/table_structure_recognition.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型精度(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小 (M)yaml 文件模型下载链接
SLANet59.52103.08 / 103.08197.99 / 197.996.9 MSLANet.yaml推理模型/训练模型
SLANet_plus63.69140.29 / 140.29195.39 / 195.396.9 MSLANet_plus.yaml推理模型/训练模型
SLANeXt_wired69.65------SLANeXt_wired.yaml推理模型/训练模型
SLANeXt_wirelessSLANeXt_wireless.yaml推理模型/训练模型
+注:以上精度指标测量自 PaddleX 内部自建高难度中文表格识别数据集。 + + +## [表格单元格检测模块](../module_usage/tutorials/ocr_modules/table_cells_detection.md) + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小 (M)yaml 文件模型下载链接
RT-DETR-L_wired_table_cell_det82.735.00 / 10.45495.51 / 495.51124MRT-DETR-L_wired_table_cell_det.yaml推理模型/训练模型
RT-DETR-L_wireless_table_cell_detRT-DETR-L_wireless_table_cell_det.yaml推理模型/训练模型
+

注:以上精度指标测量自 PaddleX 内部自建表格单元格检测数据集。

+ +## [表格分类模块](../module_usage/tutorials/ocr_modules/table_classification.md) + + + + + + + + + + + + + + + + + + + + +
模型Top1 Acc(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小 (M)yaml文件模型下载链接
PP-LCNet_x1_0_table_cls94.22.35 / 0.474.03 / 1.356.6MPP-LCNet_x1_0_table_cls.yaml推理模型/训练模型
+

注:以上精度指标测量自 PaddleX 内部自建表格分类数据集。

+ +## [文本图像矫正模块](../module_usage/tutorials/ocr_modules/text_image_unwarping.md) + + + + + + + + + + + + + + + + + + + + + +
模型名称MS-SSIM (%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小yaml 文件模型下载链接
UVDoc54.4016.27 / 7.76176.97 / 80.6030.3 MUVDoc.yaml推理模型/训练模型
+注:以上精度指标测量自 PaddleX自建的图像矫正数据集 + +## [版面区域检测模块](../module_usage/tutorials/ocr_modules/layout_detection.md) + +* 版面检测模型,包含20个常见的类别:文档标题、段落标题、文本、页码、摘要、目录、参考文献、脚注、页眉、页脚、算法、公式、公式编号、图像、表格、图和表标题(图标题、表格标题和图表标题)、印章、图表、侧栏文本和参考文献内容 + + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(0.5)(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PP-DocLayout_plus-L83.234.6244 / 10.3945510.57 / - 126.01 PP-DocLayout_plus-L.yaml推理模型/训练模型
+ +注:以上精度指标的评估集是自建的版面区域检测数据集,包含中英文论文、杂志、报纸、研报、PPT、试卷、课本等 1300 张文档类型图片。 + +* 文档图像版面子模块检测,包含1个 版面区域 类别,能检测多栏的报纸、杂志的每个子文章的文本区域: + + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(0.5)(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PP-DocBlockLayout95.934.6244 / 10.3945510.57 / - 123 MPP-DocBlockLayout.yaml推理模型/训练模型
+ +注:以上精度指标的评估集是自建的版面子区域检测数据集,包含中英文论文、杂志、报纸、研报、PPT、试卷、课本等 1000 张文档类型图片。 + + +* 版面检测模型,包含23个常见的类别:文档标题、段落标题、文本、页码、摘要、目录、参考文献、脚注、页眉、页脚、算法、公式、公式编号、图像、图表标题、表格、表格标题、印章、图表标题、图表、页眉图像、页脚图像、侧栏文本 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(0.5)(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PP-DocLayout-L90.434.6244 / 10.3945510.57 / - 123.76 PP-DocLayout-L.yaml推理模型/训练模型
PP-DocLayout-M75.213.3259 / 4.868544.0680 / 44.068022.578PP-DocLayout-M.yaml推理模型/训练模型
PP-DocLayout-S70.98.3008 / 2.379410.0623 / 9.92964.834PP-DocLayout-S.yaml推理模型/训练模型
+ +注:以上精度指标的评估集是自建的版面区域检测数据集,包含中英文论文、杂志和研报等常见的 500 张文档类型图片。 + +* 表格版面检测模型 + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(0.5)(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PicoDet_layout_1x_table97.58.02 / 3.0923.70 / 20.417.4 MPicoDet_layout_1x_table.yaml推理模型/训练模型
+注:以上精度指标的评估集是 PaddleOCR 自建的版面表格区域检测数据集,包含中英文 7835 张带有表格的论文文档类型图片。 + +* 3类版面检测模型,包含表格、图像、印章 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(0.5)(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PicoDet-S_layout_3cls88.28.99 / 2.2216.11 / 8.734.8PicoDet-S_layout_3cls.yaml推理模型/训练模型
PicoDet-L_layout_3cls89.013.05 / 4.5041.30 / 41.3022.6PicoDet-L_layout_3cls.yaml推理模型/训练模型
RT-DETR-H_layout_3cls95.8114.93 / 27.71947.56 / 947.56470.1RT-DETR-H_layout_3cls.yaml推理模型/训练模型
+注:以上精度指标的评估集是 PaddleOCR 自建的版面区域检测数据集,包含中英文论文、杂志和研报等常见的 1154 张文档类型图片。 + +* 5类英文文档区域检测模型,包含文字、标题、表格、图片以及列表 + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(0.5)(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PicoDet_layout_1x97.89.03 / 3.1025.82 / 20.707.4PicoDet_layout_1x.yaml推理模型/训练模型
+注:以上精度指标的评估集是 [PubLayNet](https://developer.ibm.com/exchanges/data/all/publaynet/) 的评估数据集,包含英文文档的 11245 张图片。 + +* 17类区域检测模型,包含17个版面常见类别,分别是:段落标题、图片、文本、数字、摘要、内容、图表标题、公式、表格、表格标题、参考文献、文档标题、脚注、页眉、算法、页脚、印章 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型mAP(0.5)(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PicoDet-S_layout_17cls87.49.11 / 2.1215.42 / 9.124.8PicoDet-S_layout_17cls.yaml推理模型/训练模型
PicoDet-L_layout_17cls89.013.50 / 4.6943.32 / 43.3222.6PicoDet-L_layout_17cls.yaml推理模型/训练模型
RT-DETR-H_layout_17cls98.3115.29 / 104.09995.27 / 995.27470.2RT-DETR-H_layout_17cls.yaml推理模型/训练模型
+注:以上精度指标的评估集是 PaddleOCR 自建的版面区域检测数据集,包含中英文论文、杂志和研报等常见的 892 张文档类型图片。 + +## [文档图像方向分类模块](../module_usage/tutorials/ocr_modules/doc_img_orientation_classification.md) + + + + + + + + + + + + + + + + + + + + + + + + +
模型Top-1 Acc(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PP-LCNet_x1_0_doc_ori99.062.31 / 0.433.37 / 1.277PP-LCNet_x1_0_doc_ori.yaml推理模型/训练模型
+注:以上精度指标的评估集是自建的数据集,覆盖证件和文档等多个场景,包含 1000 张图片。 + + +## [文本行方向分类模块](../module_usage/tutorials/ocr_modules/doc_img_orientation_classification.md) + + + + + + + + + + + + + + + + + + + + + + + + +
模型Top-1 Acc(%)GPU推理耗时(ms)
[常规模式 / 高性能模式]
CPU推理耗时(ms)
[常规模式 / 高性能模式]
模型存储大小(M)yaml文件模型下载链接
PP-LCNet_x1_0_doc_ori99.062.31 / 0.433.37 / 1.277PP-LCNet_x0_25_textline_ori.yaml推理模型/训练模型
+ +注:以上精度指标的评估集是自建的数据集,覆盖证件和文档等多个场景,包含 1000 张图片。 + +## [文档类视觉语言模型模块](../module_usage/tutorials/vlm_modules/doc_vlm.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
模型模型参数尺寸(B)模型存储大小(GB)yaml文件模型下载链接
PP-DocBee-2B24.2PP-DocBee-2B.yaml推理模型
PP-DocBee-7B715.8PP-DocBee-7B.yaml推理模型
PP-DocBee2-3B37.6推理模型
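+
+选定模型后,可以在创建产线对象时通过模型名称指定要使用的模型。以下是一个最小示意(其中的模型名称与图片路径仅为示例,请根据上表和实际需求替换):
+
+```python
+from paddleocr import PaddleOCR
+
+# 按上表选择文本检测与文本识别模型(此处名称仅为示例)
+pipeline = PaddleOCR(
+    text_detection_model_name="PP-OCRv5_mobile_det",
+    text_recognition_model_name="PP-OCRv5_mobile_rec",
+)
+
+# 对示例图片执行 OCR(路径仅为示意)
+result = pipeline.predict("example.png")
+```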
diff --git a/docs/version3.x/paddleocr_and_paddlex.en.md b/docs/version3.x/paddleocr_and_paddlex.en.md new file mode 100644 index 0000000000..0bad2f200c --- /dev/null +++ b/docs/version3.x/paddleocr_and_paddlex.en.md @@ -0,0 +1,69 @@ +# PaddleOCR and PaddleX + +[PaddleX](https://github.com/PaddlePaddle/PaddleX) is a low-code development tool built on the PaddlePaddle framework. It integrates numerous out-of-the-box pre-trained models, supports the full-pipeline development from model training to inference, and is compatible with various mainstream hardware both domestically and internationally, empowering AI developers to efficiently deploy solutions in industrial practices. + +PaddleOCR leverages PaddleX for inference deployment, enabling seamless collaboration between the two in this regard. When installing PaddleOCR, PaddleX is also installed as a dependency. Additionally, PaddleOCR and PaddleX maintain consistency in pipeline naming conventions. For quick experience, users typically do not need to understand the specific concepts of PaddleX when using basic configurations. However, knowledge of PaddleX can be beneficial in advanced configuration scenarios, service deployment, and other use cases. + +This document introduces the relationship between PaddleOCR and PaddleX and explains how to use these two tools collaboratively. + +## 1. Differences and Connections Between PaddleOCR and PaddleX + +PaddleOCR and PaddleX have distinct focuses and functionalities: PaddleOCR specializes in OCR-related tasks, while PaddleX covers a wide range of task types, including time-series forecasting, face recognition, and more. Furthermore, PaddleX provides rich infrastructure with underlying capabilities for multi-model combined inference, enabling the integration of different models in a unified and flexible manner and supporting the construction of complex model pipelines. + +PaddleOCR fully reuses the capabilities of PaddleX in the inference deployment phase, including: + +- PaddleOCR primarily relies on PaddleX for underlying capabilities such as model inference, pre- and post-processing, and multi-model combination. +- The high-performance inference capabilities of PaddleOCR are achieved through PaddleX's Paddle2ONNX plugin and high-performance inference plugins. +- The service deployment solutions of PaddleOCR are based on PaddleX's implementations. + +## 2. Correspondence Between PaddleOCR Pipelines and PaddleX Pipeline Registration Names + +| PaddleOCR Pipeline | PaddleX Pipeline Registration Name | +| --- | --- | +| General OCR | `OCR` | +| General Layout Analysis v3 | `PP-StructureV3` | +| Document Scenario Information Extraction v4 | `PP-ChatOCRv4-doc` | +| General Table Recognition v2 | `table_recognition_v2` | +| Formula Recognition | `formula_recognition` | +| Seal Text Recognition | `seal_recognition` | +| Document Image Preprocessing | `doc_preprocessor` | +| Document Understanding | `doc_understanding` | + +## 3. Using PaddleX Pipeline Configuration Files + +During the inference deployment phase, PaddleOCR supports exporting and loading PaddleX pipeline configuration files. Users can deeply configure inference deployment-related parameters by editing these configuration files. + +### 3.1 Exporting Pipeline Configuration Files + +You can call the `export_paddlex_config_to_yaml` method of the PaddleOCR pipeline object to export the current pipeline configuration to a YAML file. 
Here is an example: + +```python +from paddleocr import PaddleOCR + +pipeline = PaddleOCR() +pipeline.export_paddlex_config_to_yaml("ocr_config.yaml") +``` + +The above code will generate a pipeline configuration file named `ocr_config.yaml` in the working directory. + +### 3.2 Editing Pipeline Configuration Files + +The exported PaddleX pipeline configuration file not only includes parameters supported by PaddleOCR's CLI and Python API but also allows for more advanced configurations. Please refer to the corresponding pipeline usage tutorials in [PaddleX Pipeline Usage Overview](https://paddlepaddle.github.io/PaddleX/3.0/en/pipeline_usage/pipeline_develop_guide.html) for detailed instructions on adjusting various configurations according to your needs. + +### 3.3 Loading Pipeline Configuration Files in CLI + +By specifying the path to the PaddleX pipeline configuration file using the `--paddlex_config` parameter, PaddleOCR will read its contents as the default configuration for the pipeline. Here is an example: + +```bash +paddleocr ocr --paddlex_config ocr_config.yaml ... +``` + +### 3.4 Loading Pipeline Configuration Files in Python API + +When initializing the pipeline object, you can pass the path to the PaddleX pipeline configuration file or a configuration dictionary through the `paddlex_config` parameter, and PaddleOCR will use it as the default configuration. Here is an example: + +```python +from paddleocr import PaddleOCR + +pipeline = PaddleOCR(paddlex_config="ocr_config.yaml") +``` diff --git a/docs/version3.x/pipeline_usage/PP-StructureV3.md b/docs/version3.x/pipeline_usage/PP-StructureV3.md index ae6d278802..720918a18e 100644 --- a/docs/version3.x/pipeline_usage/PP-StructureV3.md +++ b/docs/version3.x/pipeline_usage/PP-StructureV3.md @@ -736,19 +736,19 @@ devanagari_PP-OCRv3_mobile_rec_infer.tar">推理模型/