PaddleOCR/docs/version3.x/pipeline_usage/doc_preprocessor.md

---
comments: true
---

# 文档图像预处理产线使用教程

## 1. 文档图像预处理产线介绍

文档图像预处理产线集成了文档方向分类和形变矫正两大功能。文档方向分类可自动识别文档的四个方向（0°、90°、180°、270°），确保文档以正确的方向进行后续处理。文本图像矫正模型则用于修正文档拍摄或扫描过程中的几何扭曲，恢复文档的原始形状和比例。适用于数字化文档管理、OCR类任务前处理、以及任何需要提高文档图像质量的场景。通过自动化的方向校正与形变矫正，该模块显著提升了文档处理的准确性和效率，为用户提供更为可靠的图像分析基础。本产线同时提供了灵活的服务化部署方式，支持在多种硬件上使用多种编程语言调用。不仅如此，本产线也提供了二次开发的能力，您可以基于本产线在您自己的数据集上训练调优，训练后的模型也可以无缝集成。

<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/doc_preprocessor/02.jpg">

<b>通用文档图像预处理产线中包含以下2个模块。每个模块均可独立进行训练和推理，并包含多个模型。有关详细信息，请点击相应模块以查看文档。</b>

- [文档图像方向分类模块](../module_usage/doc_img_orientation_classification.md)（可选）
- [文本图像矫正模块](../module_usage/text_image_unwarping.md)（可选）

在本产线中，您可以根据下方的基准测试数据选择使用的模型。

<details>
<summary> <b>文档图像方向分类模块（可选）：</b></summary>
<table>
<thead>
<tr>
<th>模型</th><th>模型下载链接</th>
<th>Top-1 Acc（%）</th>
<th>GPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
<th>CPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
<th>模型存储大小（MB）</th>
<th>介绍</th>
</tr>
</thead>
<tbody>
<tr>
<td>PP-LCNet_x1_0_doc_ori</td>
<td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-LCNet_x1_0_doc_ori_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_doc_ori_pretrained.pdparams">训练模型</a></td>
<td>99.06</td>
<td>2.62 / 0.59</td>
<td>3.24 / 1.19</td>
<td>7</td>
<td>基于PP-LCNet_x1_0的文档图像分类模型，含有四个类别，即0度，90度，180度，270度</td>
</tr>
</tbody>
</table>
</details>

<details>
<summary> <b>文本图像矫正模块（可选）：</b></summary>
<table>
<thead>
<tr>
<th>模型</th><th>模型下载链接</th>
<th>CER </th>
<th>GPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
<th>CPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
<th>模型存储大小（MB）</th>
<th>介绍</th>
</tr>
</thead>
<tbody>
<tr>
<td>UVDoc</td>
<td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/UVDoc_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/UVDoc_pretrained.pdparams">训练模型</a></td>
<td>0.179</td>
<td>19.05 / 19.05</td>
<td>- / 869.82</td>
<td>30.3</td>
<td>高精度文本图像矫正模型</td>
</tr>
</tbody>
</table>
</details>

<details>
<summary> <b>测试环境说明：</b></summary>

  <ul>
      <li><b>性能测试环境</b>
          <ul>
                      <li><strong>测试数据集：
             </strong>
                <ul>
                  <li>文档图像方向分类模型：自建的内部数据集，覆盖证件和文档等多个场景，包含 1000 张图片。</li>
                  <li> 文本图像矫正模型：<a href="https://www3.cs.stonybrook.edu/~cvl/docunet.html">DocUNet。</a></li>
                </ul>
             </li>
              <li><strong>硬件配置：</strong>
                  <ul>
                      <li>GPU：NVIDIA Tesla T4</li>
                      <li>CPU：Intel Xeon Gold 6271C @ 2.60GHz</li>
                  </ul>
              </li>
              <li><strong>软件环境：</strong>
                  <ul>
                      <li>Ubuntu 20.04 / CUDA 11.8 / cuDNN 8.9 / TensorRT 8.6.1.6</li>
                      <li>paddlepaddle 3.0.0 / paddleocr 3.0.3</li>
                  </ul>
              </li>
          </ul>
      </li>
      <li><b>推理模式说明</b></li>
  </ul>

<table border="1">
    <thead>
        <tr>
            <th>模式</th>
            <th>GPU配置</th>
            <th>CPU配置</th>
            <th>加速技术组合</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>常规模式</td>
            <td>FP32精度 / 无TRT加速</td>
            <td>FP32精度 / 8线程</td>
            <td>PaddleInference</td>
        </tr>
        <tr>
            <td>高性能模式</td>
            <td>选择先验精度类型和加速策略的最优组合</td>
            <td>FP32精度 / 8线程</td>
            <td>选择先验最优后端（Paddle/OpenVINO/TRT等）</td>
        </tr>
    </tbody>
</table>
</details>

## 2. 快速开始


在本地使用通用文档图像预处理产线前，请确保您已经按照[安装教程](../installation.md)完成了wheel包安装。安装完成后，可以在本地使用命令行体验或 Python 集成。

**请注意，如果在执行过程中遇到程序失去响应、程序异常退出、内存资源耗尽、推理速度极慢等问题，请尝试参考文档调整配置，例如关闭不需要使用的功能或使用更轻量的模型。**


### 2.1 命令行方式体验

一行命令即可快速体验 doc_preprocessor 产线效果：

```bash
paddleocr doc_preprocessor -i https://paddle-model-ecology.bj.bcebos.com/paddlex/demo_image/doc_test_rotated.jpg

# 通过 --use_doc_orientation_classify 指定是否使用文档方向分类模型
paddleocr doc_preprocessor -i ./doc_test_rotated.jpg --use_doc_orientation_classify True

# 通过 --use_doc_unwarping 指定是否使用文本图像矫正模块
paddleocr doc_preprocessor -i ./doc_test_rotated.jpg --use_doc_unwarping True

# 通过 --device 指定模型推理时使用 GPU
paddleocr doc_preprocessor -i ./doc_test_rotated.jpg --device gpu
```

<details><summary><b>命令行支持更多参数设置，点击展开以查看命令行参数的详细说明</b></summary>
<table>
<thead>
<tr>
<th>参数</th>
<th>参数说明</th>
<th>参数类型</th>
<th>默认值</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>input</code></td>
<td>待预测数据，必填。
如图像文件或者PDF文件的本地路径：<code>/root/data/img.jpg</code>；<b>如URL链接</b>，如图像文件或PDF文件的网络URL：<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/demo_image/doc_test_rotated.jpg">示例</a>；<b>如本地目录</b>，该目录下需包含待预测图像，如本地路径：<code>/root/data/</code>(当前不支持目录中包含PDF文件的预测，PDF文件需要指定到具体文件路径)。
</td>
<td><code>str</code></td>
<td></td>
</tr>
<tr>
<td><code>save_path</code></td>
<td>指定推理结果文件保存的路径。如果不设置，推理结果将不会保存到本地。</td>
<td><code>str</code></td>
<td></td>
</tr>
<tr>
<td><code>doc_orientation_classify_model_name</code></td>
<td>文档方向分类模型的名称。如果不设置，将会使用产线默认模型。</td>
<td><code>str</code></td>
<td></td>
</tr>
<tr>
<td><code>doc_orientation_classify_model_dir</code></td>
<td>文档方向分类模型的目录路径。如果不设置，将会下载官方模型。</td>
<td><code>str</code></td>
<td></td>
</tr>
<tr>
<td><code>doc_unwarping_model_name</code></td>
<td>文本图像矫正模型的名称。如果不设置，将会使用产线默认模型。</td>
<td><code>str</code></td>
<td></td>
</tr>
<tr>
<td><code>doc_unwarping_model_dir</code></td>
<td>文本图像矫正模型的目录路径。如果不设置，将会下载官方模型。</td>
<td><code>str</code></td>
<td></td>
</tr>
<tr>
<td><code>use_doc_orientation_classify</code></td>
<td>是否加载并使用文档方向分类模块。如果不设置，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
</td>
<td><code>bool</code></td>
<td></td>
</tr>
<tr>
<td><code>use_doc_unwarping</code></td>
<td>是否加载并使用文本图像矫正模块。如果不设置，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
</td>
<td><code>bool</code></td>
<td></td>
</tr>
<tr>
<td><code>device</code></td>
<td>用于推理的设备。支持指定具体卡号：
<ul>
<li><b>CPU</b>：如 <code>cpu</code> 表示使用 CPU 进行推理；</li>
<li><b>GPU</b>：如 <code>gpu:0</code> 表示使用第 1 块 GPU 进行推理；</li>
<li><b>NPU</b>：如 <code>npu:0</code> 表示使用第 1 块 NPU 进行推理；</li>
<li><b>XPU</b>：如 <code>xpu:0</code> 表示使用第 1 块 XPU 进行推理；</li>
<li><b>MLU</b>：如 <code>mlu:0</code> 表示使用第 1 块 MLU 进行推理；</li>
<li><b>DCU</b>：如 <code>dcu:0</code> 表示使用第 1 块 DCU 进行推理；</li>
</ul>如果不设置，将默认使用产线初始化的该参数值，初始化时，会优先使用本地的 GPU 0号设备，如果没有，则使用 CPU 设备。
</td>
<td><code>str</code></td>
<td></td>
</tr>
<tr>
<td><code>enable_hpi</code></td>
<td>是否启用高性能推理。</td>
<td><code>bool</code></td>
<td><code>False</code></td>
</tr>
<tr>
<td><code>use_tensorrt</code></td>
<td>是否启用 Paddle Inference 的 TensorRT 子图引擎。如果模型不支持通过 TensorRT 加速，即使设置了此标志，也不会使用加速。<br/>
对于 CUDA 11.8 版本的飞桨，兼容的 TensorRT 版本为 8.x（x>=6），建议安装 TensorRT 8.6.1.6。<br/>
对于 CUDA 12.6 版本的飞桨，兼容的 TensorRT 版本为 10.x（x>=5），建议安装 TensorRT 10.5.0.18。
</td>
<td><code>bool</code></td>
<td><code>False</code></td>
</tr>
<tr>
<td><code>precision</code></td>
<td>计算精度，如 fp32、fp16。</td>
<td><code>str</code></td>
<td><code>fp32</code></td>
</tr>
<tr>
<td><code>enable_mkldnn</code></td>
<td>是否启用 MKL-DNN 加速推理。如果 MKL-DNN 不可用或模型不支持通过 MKL-DNN 加速，即使设置了此标志，也不会使用加速。
</td>
<td><code>bool</code></td>
<td><code>True</code></td>
</tr>
<tr>
<td><code>mkldnn_cache_capacity</code></td>
<td>
MKL-DNN 缓存容量。
</td>
<td><code>int</code></td>
<td><code>10</code></td>
</tr>
<tr>
<td><code>cpu_threads</code></td>
<td>在 CPU 上进行推理时使用的线程数。</td>
<td><code>int</code></td>
<td><code>8</code></td>
</tr>
<tr>
<td><code>paddlex_config</code></td>
<td>PaddleX产线配置文件路径。</td>
<td><code>str</code></td>
<td></td>
</tr>
</tbody>
</table>
</details>
<br />

运行结果会被打印到终端上，默认配置的 doc_preprocessor 产线的运行结果如下：

```bash
{'res': {'input_path': '/root/.paddlex/predict_input/doc_test_rotated.jpg', 'page_index': None, 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 180}}
```

可视化结果保存在`save_path`下，可视化结果如下：

<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/doc_preprocessor/02.jpg"/>


### 2.2 Python脚本方式集成

命令行方式是为了快速体验查看效果，一般来说，在项目中，往往需要通过代码集成，您可以通过几行代码即可完成产线的快速推理，推理代码如下：

```python
from paddleocr import DocPreprocessor

pipeline = DocPreprocessor()
# docpp = DocPreprocessor(use_doc_orientation_classify=True) # 通过 use_doc_orientation_classify 指定是否使用文档方向分类模型
# docpp = DocPreprocessor(use_doc_unwarping=True) # 通过 use_doc_unwarping 指定是否使用文本图像矫正模块
# docpp = DocPreprocessor(device="gpu") # 通过 device 指定模型推理时使用 GPU
output = pipeline.predict("./doc_test_rotated.jpg")
for res in output:
    res.print() ## 打印预测的结构化输出
    res.save_to_img("./output/")
    res.save_to_json("./output/")
```

在上述 Python 脚本中，执行了如下几个步骤：

（1）通过 `DocPreprocessor()` 实例化 doc_preprocessor 产线对象：具体参数说明如下：

<table>
<thead>
<tr>
<th>参数</th>
<th>参数说明</th>
<th>参数类型</th>
<th>默认值</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>doc_orientation_classify_model_name</code></td>
<td>文档方向分类模型的名称。如果设置为<code>None</code>，将会使用产线默认模型。</td>
<td><code>str|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>doc_orientation_classify_model_dir</code></td>
<td>文档方向分类模型的目录路径。如果设置为<code>None</code>，将会下载官方模型。</td>
<td><code>str|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>doc_unwarping_model_name</code></td>
<td>文本图像矫正模型的名称。如果设置为<code>None</code>，将会使用产线默认模型。</td>
<td><code>str|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>doc_unwarping_model_dir</code></td>
<td>文本图像矫正模型的目录路径。如果设置为<code>None</code>，将会下载官方模型。</td>
<td><code>str|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>use_doc_orientation_classify</code></td>
<td>是否加载并使用文档方向分类模块。如果设置为<code>None</code>，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
</td>
<td><code>bool|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>use_doc_unwarping</code></td>
<td>是否加载并使用文本图像矫正模块。如果设置为<code>None</code>，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
</td>
<td><code>bool|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>device</code></td>
<td>用于推理的设备。支持指定具体卡号：
<ul>
<li><b>CPU</b>：如 <code>cpu</code> 表示使用 CPU 进行推理；</li>
<li><b>GPU</b>：如 <code>gpu:0</code> 表示使用第 1 块 GPU 进行推理；</li>
<li><b>NPU</b>：如 <code>npu:0</code> 表示使用第 1 块 NPU 进行推理；</li>
<li><b>XPU</b>：如 <code>xpu:0</code> 表示使用第 1 块 XPU 进行推理；</li>
<li><b>MLU</b>：如 <code>mlu:0</code> 表示使用第 1 块 MLU 进行推理；</li>
<li><b>DCU</b>：如 <code>dcu:0</code> 表示使用第 1 块 DCU 进行推理；</li>
<li><b>None</b>：如果设置为<code>None</code>，将默认使用产线初始化的该参数值，初始化时，会优先使用本地的 GPU 0号设备，如果没有，则使用 CPU 设备。</li>
</ul>
</td>
<td><code>str|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>enable_hpi</code></td>
<td>是否启用高性能推理。</td>
<td><code>bool</code></td>
<td><code>False</code></td>
</tr>
<tr>
<td><code>use_tensorrt</code></td>
<td>是否启用 Paddle Inference 的 TensorRT 子图引擎。如果模型不支持通过 TensorRT 加速，即使设置了此标志，也不会使用加速。<br/>
对于 CUDA 11.8 版本的飞桨，兼容的 TensorRT 版本为 8.x（x>=6），建议安装 TensorRT 8.6.1.6。<br/>
对于 CUDA 12.6 版本的飞桨，兼容的 TensorRT 版本为 10.x（x>=5），建议安装 TensorRT 10.5.0.18。
</td>
<td><code>bool</code></td>
<td><code>False</code></td>
</tr>
<tr>
<td><code>precision</code></td>
<td>计算精度，如 fp32、fp16。</td>
<td><code>str</code></td>
<td><code>"fp32"</code></td>
</tr>
<tr>
<td><code>enable_mkldnn</code></td>
<td>是否启用 MKL-DNN 加速推理。如果 MKL-DNN 不可用或模型不支持通过 MKL-DNN 加速，即使设置了此标志，也不会使用加速。
</td>
<td><code>bool</code></td>
<td><code>True</code></td>
</tr>
<tr>
<td><code>mkldnn_cache_capacity</code></td>
<td>
MKL-DNN 缓存容量。
</td>
<td><code>int</code></td>
<td><code>10</code></td>
</tr>
<tr>
<td><code>cpu_threads</code></td>
<td>在 CPU 上进行推理时使用的线程数。</td>
<td><code>int</code></td>
<td><code>8</code></td>
</tr>
<tr>
<td><code>paddlex_config</code></td>
<td>PaddleX产线配置文件路径。</td>
<td><code>str|None</code></td>
<td><code>None</code></td>
</tr>
</tbody>
</table>

（2）调用 doc_preprocessor 产线对象的 `predict()` 方法进行推理预测，该方法会返回一个结果列表。

另外，产线还提供了 `predict_iter()` 方法。两者在参数接受和结果返回方面是完全一致的，区别在于 `predict_iter()` 返回的是一个 `generator`，能够逐步处理和获取预测结果，适合处理大型数据集或希望节省内存的场景。可以根据实际需求选择使用这两种方法中的任意一种。

以下是 `predict()` 方法的参数及其说明：

<table>
<thead>
<tr>
<th>参数</th>
<th>参数说明</th>
<th>参数类型</th>
<th>默认值</th>
</tr>
</thead>
<tr>
<td><code>input</code></td>
<td>待预测数据，支持多种输入类型，必填。
<ul>
<li><b>Python Var</b>：如 <code>numpy.ndarray</code> 表示的图像数据；</li>
<li><b>str</b>：如图像文件或者PDF文件的本地路径：<code>/root/data/img.jpg</code>；<b>如URL链接</b>，如图像文件或PDF文件的网络URL：<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/demo_image/doc_test_rotated.jpg">示例</a>；<b>如本地目录</b>，该目录下需包含待预测图像，如本地路径：<code>/root/data/</code>(当前不支持目录中包含PDF文件的预测，PDF文件需要指定到具体文件路径)；</li>
<li><b>list</b>：列表元素需为上述类型数据，如<code>[numpy.ndarray, numpy.ndarray]</code>，<code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>，<code>["/root/data1", "/root/data2"]</code>。</li>
</ul>
</td>
<td><code>Python Var|str|list</code></td>
<td></td>
</tr>
<tr>
<td><code>use_doc_orientation_classify</code></td>
<td>是否在推理时使用文档方向分类模块。</td>
<td><code>bool|None</code></td>
<td><code>None</code></td>
</tr>
<tr>
<td><code>use_doc_unwarping</code></td>
<td>是否在推理时使用文本图像矫正模块。</td>
<td><code>bool|None</code></td>
<td><code>None</code></td>
</tr>

</table>

（3）对预测结果进行处理，每个样本的预测结果均为对应的Result对象，且支持打印、保存为图片、保存为`json`文件的操作:

<table>
<thead>
<tr>
<th>方法</th>
<th>方法说明</th>
<th>参数</th>
<th>参数类型</th>
<th>参数说明</th>
<th>默认值</th>
</tr>
</thead>
<tr>
<td rowspan="3"><code>print()</code></td>
<td rowspan="3">打印结果到终端</td>
<td><code>format_json</code></td>
<td><code>bool</code></td>
<td>是否对输出内容进行使用 <code>JSON</code> 缩进格式化</td>
<td><code>True</code></td>
</tr>
<tr>
<td><code>indent</code></td>
<td><code>int</code></td>
<td>指定缩进级别，以美化输出的 <code>JSON</code> 数据，使其更具可读性，仅当 <code>format_json</code> 为 <code>True</code> 时有效</td>
<td>4</td>
</tr>
<tr>
<td><code>ensure_ascii</code></td>
<td><code>bool</code></td>
<td>控制是否将非 <code>ASCII</code> 字符转义为 <code>Unicode</code>。设置为 <code>True</code> 时，所有非 <code>ASCII</code> 字符将被转义；<code>False</code> 则保留原始字符，仅当<code>format_json</code>为<code>True</code>时有效</td>
<td><code>False</code></td>
</tr>
<tr>
<td rowspan="3"><code>save_to_json()</code></td>
<td rowspan="3">将结果保存为json格式的文件</td>
<td><code>save_path</code></td>
<td><code>str</code></td>
<td>保存的文件路径，当为目录时，保存文件命名与输入文件类型命名一致</td>
<td>无</td>
</tr>
<tr>
<td><code>indent</code></td>
<td><code>int</code></td>
<td>指定缩进级别，以美化输出的 <code>JSON</code> 数据，使其更具可读性，仅当 <code>format_json</code> 为 <code>True</code> 时有效</td>
<td>4</td>
</tr>
<tr>
<td><code>ensure_ascii</code></td>
<td><code>bool</code></td>
<td>控制是否将非 <code>ASCII</code> 字符转义为 <code>Unicode</code>。设置为 <code>True</code> 时，所有非 <code>ASCII</code> 字符将被转义；<code>False</code> 则保留原始字符，仅当<code>format_json</code>为<code>True</code>时有效</td>
<td><code>False</code></td>
</tr>
<tr>
<td><code>save_to_img()</code></td>
<td>将结果保存为图像格式的文件</td>
<td><code>save_path</code></td>
<td><code>str</code></td>
<td>保存的文件路径，支持目录或文件路径</td>
<td>无</td>
</tr>
</table>

- 调用`print()` 方法会将结果打印到终端，打印到终端的内容解释如下：

    - `input_path`: `(str)` 待预测图像的输入路径

    - `page_index`: `(Union[int, None])` 如果输入是PDF文件，则表示当前是PDF的第几页，否则为 `None`

    - `model_settings`: `(Dict[str, bool])` 配置产线所需的模型参数

        - `use_doc_orientation_classify`: `(bool)` 控制是否启用文档方向分类模块
        - `use_doc_unwarping`: `(bool)` 控制是否启用文本图像矫正模块

    - `angle`: `(int)` 文档方向分类的预测结果。启用时取值为[0,90,180,270]；未启用时为-1

- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中，如果指定为目录，则保存的路径为`save_path/{your_img_basename}.json`，如果指定为文件，则直接保存到该文件中。由于json文件不支持保存numpy数组，因此会将其中的`numpy.array`类型转换为列表形式。
- 调用`save_to_img()` 方法会将可视化结果保存到指定的`save_path`中，如果指定为目录，则保存的路径为`save_path/{your_img_basename}_doc_preprocessor_res_img.{your_img_extension}`，如果指定为文件，则直接保存到该文件中。(产线通常包含较多结果图片，不建议直接指定为具体的文件路径，否则多张图会被覆盖，仅保留最后一张图)

* 此外，也支持通过属性获取带结果的可视化图像和预测结果，具体如下：

<table>
<thead>
<tr>
<th>属性</th>
<th>属性说明</th>
</tr>
</thead>
<tr>
<td rowspan="1"><code>json</code></td>
<td rowspan="1">获取预测的 <code>json</code> 格式的结果</td>
</tr>
<tr>
<td rowspan="2"><code>img</code></td>
<td rowspan="2">获取格式为 <code>dict</code> 的可视化图像</td>
</tr>
</table>

- `json` 属性获取的预测结果为dict类型的数据，相关内容与调用 `save_to_json()` 方法保存的内容一致。
- `img` 属性返回的预测结果是一个dict类型的数据。其中，键为 `preprocessed_img`，对应的值是 `Image.Image` 对象：用于显示 doc_preprocessor 结果的可视化图像。

## 3. 开发集成/部署

如果产线可以达到您对产线推理速度和精度的要求，您可以直接进行开发集成/部署。

若您需要将产线直接应用在您的Python项目中，可以参考 [2.2 Python脚本方式](#22-python脚本方式集成)中的示例代码。

此外，PaddleOCR 也提供了其他两种部署方式，详细说明如下：

🚀 高性能推理：在实际生产环境中，许多应用对部署策略的性能指标（尤其是响应速度）有着较严苛的标准，以确保系统的高效运行与用户体验的流畅性。为此，PaddleOCR 提供高性能推理功能，旨在对模型推理及前后处理进行深度性能优化，实现端到端流程的显著提速，详细的高性能推理流程请参考[高性能推理](../deployment/high_performance_inference.md)。

☁️ 服务化部署：服务化部署是实际生产环境中常见的一种部署形式。通过将推理功能封装为服务，客户端可以通过网络请求来访问这些服务，以获取推理结果。详细的产线服务化部署流程请参考[服务化部署](../deployment/serving.md)。

以下是基础服务化部署的API参考与多语言服务调用示例：

<details><summary>API参考</summary>
<p>对于服务提供的主要操作：</p>
<ul>
<li>HTTP请求方法为POST。</li>
<li>请求体和响应体均为JSON数据（JSON对象）。</li>
<li>当请求处理成功时，响应状态码为<code>200</code>，响应体的属性如下：</li>
</ul>
<table>
<thead>
<tr>
<th>名称</th>
<th>类型</th>
<th>含义</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>logId</code></td>
<td><code>string</code></td>
<td>请求的UUID。</td>
</tr>
<tr>
<td><code>errorCode</code></td>
<td><code>integer</code></td>
<td>错误码。固定为<code>0</code>。</td>
</tr>
<tr>
<td><code>errorMsg</code></td>
<td><code>string</code></td>
<td>错误说明。固定为<code>"Success"</code>。</td>
</tr>
<tr>
<td><code>result</code></td>
<td><code>object</code></td>
<td>操作结果。</td>
</tr>
</tbody>
</table>
<ul>
<li>当请求处理未成功时，响应体的属性如下：</li>
</ul>
<table>
<thead>
<tr>
<th>名称</th>
<th>类型</th>
<th>含义</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>logId</code></td>
<td><code>string</code></td>
<td>请求的UUID。</td>
</tr>
<tr>
<td><code>errorCode</code></td>
<td><code>integer</code></td>
<td>错误码。与响应状态码相同。</td>
</tr>
<tr>
<td><code>errorMsg</code></td>
<td><code>string</code></td>
<td>错误说明。</td>
</tr>
</tbody>
</table>
<p>服务提供的主要操作如下：</p>
<ul>
<li><b><code>infer</code></b></li>
</ul>
<p>获取图像文档图像预处理结果。</p>
<p><code>POST /document-preprocessing</code></p>
<ul>
<li>请求体的属性如下：</li>
</ul>
<table>
<thead>
<tr>
<th>名称</th>
<th>类型</th>
<th>含义</th>
<th>是否必填</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file</code></td>
<td><code>string</code></td>
<td>服务器可访问的图像文件或PDF文件的URL，或上述类型文件内容的Base64编码结果。默认对于超过10页的PDF文件，只有前10页的内容会被处理。<br /> 要解除页数限制，请在产线配置文件中添加以下配置：
<pre><code>Serving:
  extra:
    max_num_input_imgs: null
</code></pre>
</td>
<td>是</td>
</tr>
<tr>
<td><code>fileType</code></td>
<td><code>integer</code> | <code>null</code></td>
<td>文件类型。<code>0</code>表示PDF文件，<code>1</code>表示图像文件。若请求体无此属性，则将根据URL推断文件类型。</td>
<td>否</td>
</tr>
<tr>
<td><code>useDocOrientationClassify</code></td>
<td><code>boolean</code> | <code>null</code></td>
<td>请参阅产线对象中 <code>predict</code> 方法的 <code>use_doc_orientation_classify</code> 参数相关说明。</td>
<td>否</td>
</tr>
<tr>
<td><code>useDocUnwarping</code></td>
<td><code>boolean</code> | <code>null</code></td>
<td>请参阅产线对象中 <code>predict</code> 方法的 <code>use_doc_unwarping</code> 参数相关说明。</td>
<td>否</td>
</tr>
<tr>
<td><code>visualize</code></td>
<td><code>boolean</code> | <code>null</code></td>
<td>是否返回可视化结果图以及处理过程中的中间图像等。
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
<li>传入 <code>true</code>：返回图像。</li>
<li>传入 <code>false</code>：不返回图像。</li>
<li>若请求体中未提供该参数或传入 <code>null</code>：遵循产线配置文件<code>Serving.visualize</code> 的设置。</li>
</ul>
<br/>例如，在产线配置文件中添加如下字段：<br/>
<pre><code>Serving:
  visualize: False
</code></pre>
将默认不返回图像，通过请求体中的<code>visualize</code>参数可以覆盖默认行为。如果请求体和配置文件中均未设置（或请求体传入<code>null</code>、配置文件中未设置），则默认返回图像。
</td>
<td>否</td>
</tr>
</tbody>
</table>
<ul>
<li>请求处理成功时，响应体的<code>result</code>具有如下属性：</li>
</ul>
<table>
<thead>
<tr>
<th>名称</th>
<th>类型</th>
<th>含义</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>docPreprocessingResults</code></td>
<td><code>object</code></td>
<td>文档图像预处理结果。数组长度为1（对于图像输入）或实际处理的文档页数（对于PDF输入）。对于PDF输入，数组中的每个元素依次表示PDF文件中实际处理的每一页的结果。</td>
</tr>
<tr>
<td><code>dataInfo</code></td>
<td><code>object</code></td>
<td>输入数据信息。</td>
</tr>
</tbody>
</table>
<p><code>docPreprocessingResults</code>中的每个元素为一个<code>object</code>，具有如下属性：</p>
<table>
<thead>
<tr>
<th>名称</th>
<th>类型</th>
<th>含义</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>outputImage</code></td>
<td><code>string</code></td>
<td>经过预处理的图像。图像为PNG格式，使用Base64编码。</td>
</tr>
<tr>
<td><code>prunedResult</code></td>
<td><code>object</code></td>
<td>产线对象的 <code>predict</code> 方法生成结果的 JSON 表示中 <code>res</code> 字段的简化版本，其中去除了 <code>input_path</code> 和 <code>page_index</code> 字段。</td>
</tr>
<tr>
<td><code>docPreprocessingImage</code></td>
<td><code>string</code> ｜ <code>null</code></td>
<td>可视化结果图。图像为JPEG格式，使用Base64编码。</td>
</tr>
<tr>
<td><code>inputImage</code></td>
<td><code>string</code> ｜ <code>null</code></td>
<td>输入图像。图像为JPEG格式，使用Base64编码。</td>
</tr>
</tbody>
</table>
</details>
<details><summary>多语言调用服务示例</summary>
<details>
<summary>Python</summary>

<pre><code class="language-python">import base64
import requests

API_URL = "http://localhost:8080/document-preprocessing"
file_path = "./demo.jpg"

with open(file_path, "rb") as file:
    file_bytes = file.read()
    file_data = base64.b64encode(file_bytes).decode("ascii")

payload = {"file": file_data, "fileType": 1}

response = requests.post(API_URL, json=payload)

assert response.status_code == 200
result = response.json()["result"]
for i, res in enumerate(result["docPreprocessingResults"]):
    print(res["prunedResult"])
    output_img_path = f"out_{i}.png"
    with open(output_img_path, "wb") as f:
        f.write(base64.b64decode(res["outputImage"]))
    print(f"Output image saved at {output_img_path}")
</code></pre></details>

<details><summary>C++</summary>

<pre><code class="language-cpp">#include &lt;iostream&gt;
#include &lt;fstream&gt;
#include &lt;vector&gt;
#include &lt;string&gt;
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64

int main() {

    httplib::Client client("localhost", 8080);
    const std::string filePath = "./demo.jpg";
    std::ifstream file(filePath, std::ios::binary | std::ios::ate);
    if (!file) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return 1;
    }

    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        std::cerr << "Error reading file." << std::endl;
        return 1;
    }

    std::string bufferStr(buffer.data(), static_cast<size_t>(size));
    std::string encodedFile = base64::to_base64(bufferStr);

    nlohmann::json jsonObj;
    jsonObj["file"] = encodedFile;
    jsonObj["fileType"] = 1;

    auto response = client.Post("/document-preprocessing", jsonObj.dump(), "application/json");

    if (response && response->status == 200) {
        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
        auto result = jsonResponse["result"];

        if (!result.is_object() || !result["docPreprocessingResults"].is_array()) {
            std::cerr << "Unexpected response format." << std::endl;
            return 1;
        }

        for (size_t i = 0; i < result["docPreprocessingResults"].size(); ++i) {
            auto res = result["docPreprocessingResults"][i];

            if (res.contains("prunedResult")) {
                std::cout << "Preprocessed result: " << res["prunedResult"].dump() << std::endl;
            }

            if (res.contains("outputImage")) {
                std::string outputImgPath = "out_" + std::to_string(i) + ".png";
                std::string decodedImage = base64::from_base64(res["outputImage"].get<std::string>());

                std::ofstream outFile(outputImgPath, std::ios::binary);
                if (outFile.is_open()) {
                    outFile.write(decodedImage.c_str(), decodedImage.size());
                    outFile.close();
                    std::cout << "Saved image: " << outputImgPath << std::endl;
                } else {
                    std::cerr << "Failed to write image: " << outputImgPath << std::endl;
                }
            }
        }
    } else {
        std::cerr << "Request failed." << std::endl;
        if (response) {
            std::cerr << "HTTP status: " << response->status << std::endl;
            std::cerr << "Response body: " << response->body << std::endl;
        }
        return 1;
    }

    return 0;
}

</code></pre></details>

<details><summary>Java</summary>

<pre><code class="language-java">import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;

public class Main {
    public static void main(String[] args) throws IOException {
        String API_URL = "http://localhost:8080/document-preprocessing";
        String imagePath = "./demo.jpg";

        File file = new File(imagePath);
        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
        String base64Image = Base64.getEncoder().encodeToString(fileContent);

        ObjectMapper objectMapper = new ObjectMapper();
        ObjectNode payload = objectMapper.createObjectNode();
        payload.put("file", base64Image);
        payload.put("fileType", 1);

        OkHttpClient client = new OkHttpClient();
        MediaType JSON = MediaType.get("application/json; charset=utf-8");
        RequestBody body = RequestBody.create(JSON, payload.toString());

        Request request = new Request.Builder()
                .url(API_URL)
                .post(body)
                .build();

        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful()) {
                String responseBody = response.body().string();
                JsonNode root = objectMapper.readTree(responseBody);
                JsonNode result = root.get("result");

                JsonNode docPreprocessingResults = result.get("docPreprocessingResults");
                for (int i = 0; i < docPreprocessingResults.size(); i++) {
                    JsonNode item = docPreprocessingResults.get(i);
                    int finalI = i;

                    JsonNode prunedResult = item.get("prunedResult");
                    System.out.println("Pruned Result [" + i + "]: " + prunedResult.toString());

                    String outputImgBase64 = item.get("outputImage").asText();
                    byte[] outputImgBytes = Base64.getDecoder().decode(outputImgBase64);
                    String outputImgPath = "out_" + finalI + ".png";
                    try (FileOutputStream fos = new FileOutputStream(outputImgPath)) {
                        fos.write(outputImgBytes);
                        System.out.println("Saved output image: " + outputImgPath);
                    }

                    JsonNode inputImageNode = item.get("inputImage");
                    if (inputImageNode != null && !inputImageNode.isNull()) {
                        String inputImageBase64 = inputImageNode.asText();
                        byte[] inputImageBytes = Base64.getDecoder().decode(inputImageBase64);
                        String inputImgPath = "inputImage_" + i + ".jpg";
                        try (FileOutputStream fos = new FileOutputStream(inputImgPath)) {
                            fos.write(inputImageBytes);
                            System.out.println("Saved input image to: " + inputImgPath);
                        }
                    }
                }
            } else {
                System.err.println("Request failed with HTTP code: " + response.code());
            }
        }
    }
}
</code></pre></details>

<details><summary>Go</summary>

<pre><code class="language-go">package main

import (
    "bytes"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "os"
)

func main() {
    API_URL := "http://localhost:8080/document-preprocessing"
    filePath := "./demo.jpg"

    fileBytes, err := ioutil.ReadFile(filePath)
    if err != nil {
        fmt.Printf("Error reading file: %v\n", err)
        return
    }
    fileData := base64.StdEncoding.EncodeToString(fileBytes)

    payload := map[string]interface{}{
        "file":     fileData,
        "fileType": 1,
    }
    payloadBytes, err := json.Marshal(payload)
    if err != nil {
        fmt.Printf("Error marshaling payload: %v\n", err)
        return
    }

    client := &http.Client{}
    req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
    if err != nil {
        fmt.Printf("Error creating request: %v\n", err)
        return
    }
    req.Header.Set("Content-Type", "application/json")

    res, err := client.Do(req)
    if err != nil {
        fmt.Printf("Error sending request: %v\n", err)
        return
    }
    defer res.Body.Close()

    if res.StatusCode != http.StatusOK {
        fmt.Printf("Unexpected status code: %d\n", res.StatusCode)
        return
    }

    body, err := ioutil.ReadAll(res.Body)
    if err != nil {
        fmt.Printf("Error reading response body: %v\n", err)
        return
    }

    type DocPreprocessingResult struct {
        PrunedResult         map[string]interface{} `json:"prunedResult"`
        OutputImage          string                 `json:"outputImage"`
        DocPreprocessingImage *string               `json:"docPreprocessingImage"`
        InputImage           *string                `json:"inputImage"`
    }

    type Response struct {
        Result struct {
            DocPreprocessingResults []DocPreprocessingResult `json:"docPreprocessingResults"`
            DataInfo                interface{}              `json:"dataInfo"`
        } `json:"result"`
    }

    var respData Response
    if err := json.Unmarshal(body, &respData); err != nil {
        fmt.Printf("Error unmarshaling response: %v\n", err)
        return
    }

    for i, res := range respData.Result.DocPreprocessingResults {
        fmt.Printf("Result %d - prunedResult: %+v\n", i, res.PrunedResult)

        imgBytes, err := base64.StdEncoding.DecodeString(res.OutputImage)
        if err != nil {
            fmt.Printf("Error decoding outputImage at index %d: %v\n", i, err)
            continue
        }

        filename := fmt.Sprintf("out_%d.png", i)
        if err := os.WriteFile(filename, imgBytes, 0644); err != nil {
            fmt.Printf("Error saving image %s: %v\n", filename, err)
            continue
        }
        fmt.Printf("Saved output image to %s\n", filename)
    }
}
</code></pre></details>

<details><summary>C#</summary>

<pre><code class="language-csharp">using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class Program
{
    static readonly string API_URL = "http://localhost:8080/document-preprocessing";
    static readonly string inputFilePath = "./demo.jpg";

    static async Task Main(string[] args)
    {
        var httpClient = new HttpClient();

        byte[] fileBytes = File.ReadAllBytes(inputFilePath);
        string fileData = Convert.ToBase64String(fileBytes);

        var payload = new JObject
        {
            { "file", fileData },
            { "fileType", 1 }
        };
        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");

        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
        response.EnsureSuccessStatusCode();

        string responseBody = await response.Content.ReadAsStringAsync();
        JObject jsonResponse = JObject.Parse(responseBody);

        JArray docPreResults = (JArray)jsonResponse["result"]["docPreprocessingResults"];
        for (int i = 0; i < docPreResults.Count; i++)
        {
            var res = docPreResults[i];
            Console.WriteLine($"[{i}] prunedResult:\n{res["prunedResult"]}");

            string base64Image = res["outputImage"]?.ToString();
            if (!string.IsNullOrEmpty(base64Image))
            {
                string outputPath = $"out_{i}.png";
                byte[] imageBytes = Convert.FromBase64String(base64Image);
                File.WriteAllBytes(outputPath, imageBytes);
                Console.WriteLine($"Output image saved at {outputPath}");
            }
            else
            {
                Console.WriteLine($"outputImage at index {i} is null.");
            }
        }
    }
}
</code></pre></details>

<details><summary>Node.js</summary>

<pre><code class="language-js">const axios = require('axios');
const fs = require('fs');
const path = require('path');

const API_URL = 'http://localhost:8080/document-preprocessing';
const imagePath = './demo.jpg';

function encodeImageToBase64(filePath) {
  const bitmap = fs.readFileSync(filePath);
  return Buffer.from(bitmap).toString('base64');
}

const payload = {
  file: encodeImageToBase64(imagePath),
  fileType: 1
};

axios.post(API_URL, payload, {
  headers: {
    'Content-Type': 'application/json'
  },
  maxBodyLength: Infinity
})
.then((response) => {
  const results = response.data.result.docPreprocessingResults;

  results.forEach((res, index) => {
    console.log(`\n[${index}] prunedResult:`);
    console.log(res.prunedResult);

    const base64Image = res.outputImage;
    if (base64Image) {
      const outputImagePath = `out_${index}.png`;
      const imageBuffer = Buffer.from(base64Image, 'base64');
      fs.writeFileSync(outputImagePath, imageBuffer);
      console.log(`Output image saved at ${outputImagePath}`);
    } else {
      console.log(`outputImage at index ${index} is null.`);
    }
  });
})
.catch((error) => {
  console.error('API error:', error.message);
});
</code></pre></details>

<details><summary>PHP</summary>

<pre><code class="language-php">&lt;?php

$API_URL = "http://localhost:8080/document-preprocessing";
$image_path = "./demo.jpg";
$output_image_path = "./out_0.png";

$image_data = base64_encode(file_get_contents($image_path));
$payload = array("file" => $image_data, "fileType" => 1);

$ch = curl_init($API_URL);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

$result = json_decode($response, true)["result"]["docPreprocessingResults"];

foreach ($result as $i => $item) {
    echo "[$i] prunedResult:\n";
    print_r($item["prunedResult"]);

    if (!empty($item["outputImage"])) {
        $output_image_path = "out_" . $i . ".png";
        file_put_contents($output_image_path, base64_decode($item["outputImage"]));
        echo "Output image saved at $output_image_path\n";
    } else {
        echo "No outputImage found for item $i\n";
    }
}
?&gt;
</code></pre></details>
</details>
<br/>

## 4. 二次开发

如果文档图像预处理产线提供的默认模型权重在您的场景中，精度或速度不满意，您可以尝试利用<b>您自己拥有的特定领域或应用场景的数据</b>对现有模型进行进一步的<b>微调</b>，以提升文档图像预处理产线的在您的场景中的识别效果。

### 4.1 模型微调

由于文档图像预处理产线包含若干模块，模型产线的效果如果不及预期，可能来自于其中任何一个模块。您可以对识别效果差的图片进行分析，进而确定是哪个模块存在问题，并参考以下表格中对应的微调教程链接进行模型微调。


<table>
<thead>
<tr>
<th>情形</th>
<th>微调模块</th>
<th>微调参考链接</th>
</tr>
</thead>
<tbody>
<tr>
<td>整图旋转矫正不准</td>
<td>文档图像方向分类模块</td>
<td><a href="https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.html#_5">链接</a></td>
</tr>
<tr>
<td>图像扭曲矫正不准</td>
<td>文本图像矫正模块</td>
<td>暂不支持微调</td>
</tr>
</tbody>
</table>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								---
 								comments: true
 								---
 								# 文档图像预处理产线使用教程
 								## 1. 文档图像预处理产线介绍
 								文档图像预处理产线集成了文档方向分类和形变矫正两大功能。文档方向分类可自动识别文档的四个方向（0°、90°、180°、270°），确保文档以正确的方向进行后续处理。文本图像矫正模型则用于修正文档拍摄或扫描过程中的几何扭曲，恢复文档的原始形状和比例。适用于数字化文档管理、OCR类任务前处理、以及任何需要提高文档图像质量的场景。通过自动化的方向校正与形变矫正，该模块显著提升了文档处理的准确性和效率，为用户提供更为可靠的图像分析基础。本产线同时提供了灵活的服务化部署方式，支持在多种硬件上使用多种编程语言调用。不仅如此，本产线也提供了二次开发的能力，您可以基于本产线在您自己的数据集上训练调优，训练后的模型也可以无缝集成。
 								<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/doc_preprocessor/02.jpg">
 								<b>通用文档图像预处理产线中包含以下2个模块。每个模块均可独立进行训练和推理，并包含多个模型。有关详细信息，请点击相应模块以查看文档。</b>
 								- [文档图像方向分类模块](../module_usage/doc_img_orientation_classification.md)（可选）
 								- [文本图像矫正模块](../module_usage/text_image_unwarping.md)（可选）
 								在本产线中，您可以根据下方的基准测试数据选择使用的模型。
 								<details>
 								<summary> <b>文档图像方向分类模块（可选）：</b></summary>
 								<table>
 								<thead>
 								<tr>
 								<th>模型</th><th>模型下载链接</th>
 								<th>Top-1 Acc（%）</th>
 								<th>GPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
 								<th>CPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<th>模型存储大小（MB）</th>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<th>介绍</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td>PP-LCNet_x1_0_doc_ori</td>
 								<td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-LCNet_x1_0_doc_ori_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-LCNet_x1_0_doc_ori_pretrained.pdparams">训练模型</a></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td>99.06</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td>2.62 / 0.59</td>
 								<td>3.24 / 1.19</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td>7</td>
 								<td>基于PP-LCNet_x1_0的文档图像分类模型，含有四个类别，即0度，90度，180度，270度</td>
 								</tr>
 								</tbody>
 								</table>
 								</details>
 								<details>
 								<summary> <b>文本图像矫正模块（可选）：</b></summary>
 								<table>
 								<thead>
 								<tr>
 								<th>模型</th><th>模型下载链接</th>
 								<th>CER </th>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<th>GPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
 								<th>CPU推理耗时（ms）<br/>[常规模式 / 高性能模式]</th>
 								<th>模型存储大小（MB）</th>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<th>介绍</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td>UVDoc</td>
 								<td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/UVDoc_infer.tar">推理模型</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/UVDoc_pretrained.pdparams">训练模型</a></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td>0.179</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td>19.05 / 19.05</td>
 								<td>- / 869.82</td>
 								<td>30.3</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td>高精度文本图像矫正模型</td>
 								</tr>
 								</tbody>
 								</table>
 								</details>
 								<details>
 								<summary> <b>测试环境说明：</b></summary>
 								  <ul>
 								      <li><b>性能测试环境</b>
 								          <ul>
 								                      <li><strong>测试数据集：
 								             </strong>
 								                <ul>
-												Fix doc (#15230)

* remove

* remove PaddleX
											
										
										
											2025-05-20 17:42:04 +08:00
+								                  <li>文档图像方向分类模型：自建的内部数据集，覆盖证件和文档等多个场景，包含 1000 张图片。</li>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								                  <li> 文本图像矫正模型：<a href="https://www3.cs.stonybrook.edu/~cvl/docunet.html">DocUNet。</a></li>
 								                </ul>
 								             </li>
 								              <li><strong>硬件配置：</strong>
 								                  <ul>
 								                      <li>GPU：NVIDIA Tesla T4</li>
 								                      <li>CPU：Intel Xeon Gold 6271C @ 2.60GHz</li>
-												[docs] Modify test environment (#15914)

* modify test environment

* modify test environment

* modify test environment
											
										
										
											2025-06-30 20:26:05 +08:00
+								                  </ul>
 								              </li>
 								              <li><strong>软件环境：</strong>
 								                  <ul>
 								                      <li>Ubuntu 20.04 / CUDA 11.8 / cuDNN 8.9 / TensorRT 8.6.1.6</li>
 								                      <li>paddlepaddle 3.0.0 / paddleocr 3.0.3</li>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								                  </ul>
 								              </li>
 								          </ul>
 								      </li>
 								      <li><b>推理模式说明</b></li>
 								  </ul>
 								<table border="1">
 								    <thead>
 								        <tr>
 								            <th>模式</th>
 								            <th>GPU配置</th>
 								            <th>CPU配置</th>
 								            <th>加速技术组合</th>
 								        </tr>
 								    </thead>
 								    <tbody>
 								        <tr>
 								            <td>常规模式</td>
 								            <td>FP32精度 / 无TRT加速</td>
 								            <td>FP32精度 / 8线程</td>
 								            <td>PaddleInference</td>
 								        </tr>
 								        <tr>
 								            <td>高性能模式</td>
 								            <td>选择先验精度类型和加速策略的最优组合</td>
 								            <td>FP32精度 / 8线程</td>
 								            <td>选择先验最优后端（Paddle/OpenVINO/TRT等）</td>
 								        </tr>
 								    </tbody>
 								</table>
 								</details>
 								## 2. 快速开始
-												fix all docs (#15195)

* fix all docs

* fix all docs

* fix all docs
											
										
										
											2025-05-20 03:29:47 +08:00
 								在本地使用通用文档图像预处理产线前，请确保您已经按照[安装教程](../installation.md)完成了wheel包安装。安装完成后，可以在本地使用命令行体验或 Python 集成。
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								**请注意，如果在执行过程中遇到程序失去响应、程序异常退出、内存资源耗尽、推理速度极慢等问题，请尝试参考文档调整配置，例如关闭不需要使用的功能或使用更轻量的模型。**
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
 								### 2.1 命令行方式体验
 								一行命令即可快速体验 doc_preprocessor 产线效果：
 								```bash
 								paddleocr doc_preprocessor -i https://paddle-model-ecology.bj.bcebos.com/paddlex/demo_image/doc_test_rotated.jpg
 								# 通过 --use_doc_orientation_classify 指定是否使用文档方向分类模型
 								paddleocr doc_preprocessor -i ./doc_test_rotated.jpg --use_doc_orientation_classify True
 								# 通过 --use_doc_unwarping 指定是否使用文本图像矫正模块
 								paddleocr doc_preprocessor -i ./doc_test_rotated.jpg --use_doc_unwarping True
 								# 通过 --device 指定模型推理时使用 GPU
 								paddleocr doc_preprocessor -i ./doc_test_rotated.jpg --device gpu
 								```
 								<details><summary><b>命令行支持更多参数设置，点击展开以查看命令行参数的详细说明</b></summary>
 								<table>
 								<thead>
 								<tr>
 								<th>参数</th>
 								<th>参数说明</th>
 								<th>参数类型</th>
 								<th>默认值</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								<td><code>input</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>待预测数据，必填。
-												[Docs] Fix documentation (#15477)

* modify docs0528

* modify docs0529

* modify docs0529-2

* modify docs0529-3

* change 960,max to 64,min

* Proofread Chinese, English documents. modify ocr.py and test_ocr.py

* modify pipeline to Pipeline

* check-code-style

* check-code-style-2

* modify doc_preprocessor.md

* fix check-code-style

* modify dict

* modify languages

* modify languages

* modify docs
											
										
										
											2025-06-05 17:43:57 +08:00
+								如图像文件或者PDF文件的本地路径：<code>/root/data/img.jpg</code>；<b>如URL链接</b>，如图像文件或PDF文件的网络URL：<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/demo_image/doc_test_rotated.jpg">示例</a>；<b>如本地目录</b>，该目录下需包含待预测图像，如本地路径：<code>/root/data/</code>(当前不支持目录中包含PDF文件的预测，PDF文件需要指定到具体文件路径)。
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								</td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td><code>str</code></td>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								<td></td>
 								</tr>
 								<tr>
 								<td><code>save_path</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>指定推理结果文件保存的路径。如果不设置，推理结果将不会保存到本地。</td>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								</tr>
 								<tr>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>doc_orientation_classify_model_name</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文档方向分类模型的名称。如果不设置，将会使用产线默认模型。</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>doc_orientation_classify_model_dir</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文档方向分类模型的目录路径。如果不设置，将会下载官方模型。</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>doc_unwarping_model_name</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文本图像矫正模型的名称。如果不设置，将会使用产线默认模型。</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>doc_unwarping_model_dir</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文本图像矫正模型的目录路径。如果不设置，将会下载官方模型。</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>use_doc_orientation_classify</code></td>
-												[Feat] Add PP-DocTranslate and update docs (#15890)

* Add PP-DocTranslate and update docs

* Fix docs

* Add English doc

* Fix ut
											
										
										
											2025-06-28 23:13:32 +08:00
+								<td>是否加载并使用文档方向分类模块。如果不设置，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</td>
 								<td><code>bool</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>use_doc_unwarping</code></td>
-												[Feat] Add PP-DocTranslate and update docs (#15890)

* Add PP-DocTranslate and update docs

* Fix docs

* Add English doc

* Fix ut
											
										
										
											2025-06-28 23:13:32 +08:00
+								<td>是否加载并使用文本图像矫正模块。如果不设置，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</td>
 								<td><code>bool</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>device</code></td>
-												[Docs] Fix documentation (#15477)

* modify docs0528

* modify docs0529

* modify docs0529-2

* modify docs0529-3

* change 960,max to 64,min

* Proofread Chinese, English documents. modify ocr.py and test_ocr.py

* modify pipeline to Pipeline

* check-code-style

* check-code-style-2

* modify doc_preprocessor.md

* fix check-code-style

* modify dict

* modify languages

* modify languages

* modify docs
											
										
										
											2025-06-05 17:43:57 +08:00
+								<td>用于推理的设备。支持指定具体卡号：
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<ul>
 								<li><b>CPU</b>：如 <code>cpu</code> 表示使用 CPU 进行推理；</li>
 								<li><b>GPU</b>：如 <code>gpu:0</code> 表示使用第 1 块 GPU 进行推理；</li>
 								<li><b>NPU</b>：如 <code>npu:0</code> 表示使用第 1 块 NPU 进行推理；</li>
 								<li><b>XPU</b>：如 <code>xpu:0</code> 表示使用第 1 块 XPU 进行推理；</li>
 								<li><b>MLU</b>：如 <code>mlu:0</code> 表示使用第 1 块 MLU 进行推理；</li>
 								<li><b>DCU</b>：如 <code>dcu:0</code> 表示使用第 1 块 DCU 进行推理；</li>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								</ul>如果不设置，将默认使用产线初始化的该参数值，初始化时，会优先使用本地的 GPU 0号设备，如果没有，则使用 CPU 设备。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</td>
 								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>enable_hpi</code></td>
 								<td>是否启用高性能推理。</td>
 								<td><code>bool</code></td>
 								<td><code>False</code></td>
 								</tr>
 								<tr>
 								<td><code>use_tensorrt</code></td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td>是否启用 Paddle Inference 的 TensorRT 子图引擎。如果模型不支持通过 TensorRT 加速，即使设置了此标志，也不会使用加速。<br/>
 								对于 CUDA 11.8 版本的飞桨，兼容的 TensorRT 版本为 8.x（x>=6），建议安装 TensorRT 8.6.1.6。<br/>
-												update descriptions for use_hpip and use_tensorrt settings (#15616)

* update descriptions for use_hpip and use_tensorrt settings

* update

* update
											
										
										
											2025-06-09 14:59:41 +08:00
+								对于 CUDA 12.6 版本的飞桨，兼容的 TensorRT 版本为 10.x（x>=5），建议安装 TensorRT 10.5.0.18。
 								</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>bool</code></td>
 								<td><code>False</code></td>
 								</tr>
 								<tr>
 								<td><code>precision</code></td>
 								<td>计算精度，如 fp32、fp16。</td>
 								<td><code>str</code></td>
 								<td><code>fp32</code></td>
 								</tr>
 								<tr>
 								<td><code>enable_mkldnn</code></td>
-												Polish  description (#15610)


											
										
										
											2025-06-09 10:53:44 +08:00
+								<td>是否启用 MKL-DNN 加速推理。如果 MKL-DNN 不可用或模型不支持通过 MKL-DNN 加速，即使设置了此标志，也不会使用加速。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</td>
 								<td><code>bool</code></td>
-												[WIP][Feat] Accommodate PaddleX MKL-DNN new behavior (#15471)

* Accommodate PaddleX MKL-DNN new behavior

* Bump paddlex version

* Use stderr
											
										
										
											2025-06-01 18:27:52 +08:00
+								<td><code>True</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
-												[Feat] Support setting mkldnn_cache_capacity (#15660)

* Support setting mkldnn_cache_capacity

* Update package description to sync with repo

* Remove min_subgraph_size
											
										
										
											2025-06-12 21:00:47 +08:00
+								<td><code>mkldnn_cache_capacity</code></td>
 								<td>
 								MKL-DNN 缓存容量。
 								</td>
 								<td><code>int</code></td>
 								<td><code>10</code></td>
 								</tr>
 								<tr>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>cpu_threads</code></td>
 								<td>在 CPU 上进行推理时使用的线程数。</td>
 								<td><code>int</code></td>
 								<td><code>8</code></td>
 								</tr>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								<tr>
 								<td><code>paddlex_config</code></td>
 								<td>PaddleX产线配置文件路径。</td>
 								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td></td>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								</tr>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tbody>
 								</table>
 								</details>
 								<br />
 								运行结果会被打印到终端上，默认配置的 doc_preprocessor 产线的运行结果如下：
 								```bash
 								{'res': {'input_path': '/root/.paddlex/predict_input/doc_test_rotated.jpg', 'page_index': None, 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 180}}
 								```
 								可视化结果保存在`save_path`下，可视化结果如下：
 								<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/pipelines/doc_preprocessor/02.jpg"/>
 								### 2.2 Python脚本方式集成
 								命令行方式是为了快速体验查看效果，一般来说，在项目中，往往需要通过代码集成，您可以通过几行代码即可完成产线的快速推理，推理代码如下：
 								```python
 								from paddleocr import DocPreprocessor
 								pipeline = DocPreprocessor()
-												fix_docs_add_links (#15214)

* fix_docs_add_links

* fix_docs_add_links
											
										
										
											2025-05-20 15:56:36 +08:00
+								# docpp = DocPreprocessor(use_doc_orientation_classify=True) # 通过 use_doc_orientation_classify 指定是否使用文档方向分类模型
 								# docpp = DocPreprocessor(use_doc_unwarping=True) # 通过 use_doc_unwarping 指定是否使用文本图像矫正模块
 								# docpp = DocPreprocessor(device="gpu") # 通过 device 指定模型推理时使用 GPU
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								output = pipeline.predict("./doc_test_rotated.jpg")
 								for res in output:
 								    res.print() ## 打印预测的结构化输出
 								    res.save_to_img("./output/")
 								    res.save_to_json("./output/")
 								```
 								在上述 Python 脚本中，执行了如下几个步骤：
 								（1）通过 `DocPreprocessor()` 实例化 doc_preprocessor 产线对象：具体参数说明如下：
 								<table>
 								<thead>
 								<tr>
 								<th>参数</th>
 								<th>参数说明</th>
 								<th>参数类型</th>
 								<th>默认值</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
 								<td><code>doc_orientation_classify_model_name</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文档方向分类模型的名称。如果设置为<code>None</code>，将会使用产线默认模型。</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>str|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>doc_orientation_classify_model_dir</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文档方向分类模型的目录路径。如果设置为<code>None</code>，将会下载官方模型。</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>str|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>doc_unwarping_model_name</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文本图像矫正模型的名称。如果设置为<code>None</code>，将会使用产线默认模型。</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>str|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>doc_unwarping_model_dir</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td>文本图像矫正模型的目录路径。如果设置为<code>None</code>，将会下载官方模型。</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>str|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>use_doc_orientation_classify</code></td>
-												[Feat] Add PP-DocTranslate and update docs (#15890)

* Add PP-DocTranslate and update docs

* Fix docs

* Add English doc

* Fix ut
											
										
										
											2025-06-28 23:13:32 +08:00
+								<td>是否加载并使用文档方向分类模块。如果设置为<code>None</code>，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>bool|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>use_doc_unwarping</code></td>
-												[Feat] Add PP-DocTranslate and update docs (#15890)

* Add PP-DocTranslate and update docs

* Fix docs

* Add English doc

* Fix ut
											
										
										
											2025-06-28 23:13:32 +08:00
+								<td>是否加载并使用文本图像矫正模块。如果设置为<code>None</code>，将使用产线初始化的该参数值，默认初始化为<code>True</code>。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>bool|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>device</code></td>
-												[Docs] Fix documentation (#15477)

* modify docs0528

* modify docs0529

* modify docs0529-2

* modify docs0529-3

* change 960,max to 64,min

* Proofread Chinese, English documents. modify ocr.py and test_ocr.py

* modify pipeline to Pipeline

* check-code-style

* check-code-style-2

* modify doc_preprocessor.md

* fix check-code-style

* modify dict

* modify languages

* modify languages

* modify docs
											
										
										
											2025-06-05 17:43:57 +08:00
+								<td>用于推理的设备。支持指定具体卡号：
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<ul>
 								<li><b>CPU</b>：如 <code>cpu</code> 表示使用 CPU 进行推理；</li>
 								<li><b>GPU</b>：如 <code>gpu:0</code> 表示使用第 1 块 GPU 进行推理；</li>
 								<li><b>NPU</b>：如 <code>npu:0</code> 表示使用第 1 块 NPU 进行推理；</li>
 								<li><b>XPU</b>：如 <code>xpu:0</code> 表示使用第 1 块 XPU 进行推理；</li>
 								<li><b>MLU</b>：如 <code>mlu:0</code> 表示使用第 1 块 MLU 进行推理；</li>
 								<li><b>DCU</b>：如 <code>dcu:0</code> 表示使用第 1 块 DCU 进行推理；</li>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<li><b>None</b>：如果设置为<code>None</code>，将默认使用产线初始化的该参数值，初始化时，会优先使用本地的 GPU 0号设备，如果没有，则使用 CPU 设备。</li>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</ul>
 								</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>str|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>enable_hpi</code></td>
 								<td>是否启用高性能推理。</td>
 								<td><code>bool</code></td>
 								<td><code>False</code></td>
 								</tr>
 								<tr>
 								<td><code>use_tensorrt</code></td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td>是否启用 Paddle Inference 的 TensorRT 子图引擎。如果模型不支持通过 TensorRT 加速，即使设置了此标志，也不会使用加速。<br/>
 								对于 CUDA 11.8 版本的飞桨，兼容的 TensorRT 版本为 8.x（x>=6），建议安装 TensorRT 8.6.1.6。<br/>
-												update descriptions for use_hpip and use_tensorrt settings (#15616)

* update descriptions for use_hpip and use_tensorrt settings

* update

* update
											
										
										
											2025-06-09 14:59:41 +08:00
+								对于 CUDA 12.6 版本的飞桨，兼容的 TensorRT 版本为 10.x（x>=5），建议安装 TensorRT 10.5.0.18。
 								</td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>bool</code></td>
 								<td><code>False</code></td>
 								</tr>
 								<tr>
 								<td><code>precision</code></td>
 								<td>计算精度，如 fp32、fp16。</td>
 								<td><code>str</code></td>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<td><code>"fp32"</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>enable_mkldnn</code></td>
-												Polish  description (#15610)


											
										
										
											2025-06-09 10:53:44 +08:00
+								<td>是否启用 MKL-DNN 加速推理。如果 MKL-DNN 不可用或模型不支持通过 MKL-DNN 加速，即使设置了此标志，也不会使用加速。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</td>
 								<td><code>bool</code></td>
-												[WIP][Feat] Accommodate PaddleX MKL-DNN new behavior (#15471)

* Accommodate PaddleX MKL-DNN new behavior

* Bump paddlex version

* Use stderr
											
										
										
											2025-06-01 18:27:52 +08:00
+								<td><code>True</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
-												[Feat] Support setting mkldnn_cache_capacity (#15660)

* Support setting mkldnn_cache_capacity

* Update package description to sync with repo

* Remove min_subgraph_size
											
										
										
											2025-06-12 21:00:47 +08:00
+								<td><code>mkldnn_cache_capacity</code></td>
 								<td>
 								MKL-DNN 缓存容量。
 								</td>
 								<td><code>int</code></td>
 								<td><code>10</code></td>
 								</tr>
 								<tr>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>cpu_threads</code></td>
 								<td>在 CPU 上进行推理时使用的线程数。</td>
 								<td><code>int</code></td>
 								<td><code>8</code></td>
 								</tr>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								<tr>
 								<td><code>paddlex_config</code></td>
 								<td>PaddleX产线配置文件路径。</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>str|None</code></td>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								<td><code>None</code></td>
 								</tr>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tbody>
 								</table>
 								（2）调用 doc_preprocessor 产线对象的 `predict()` 方法进行推理预测，该方法会返回一个结果列表。
 								另外，产线还提供了 `predict_iter()` 方法。两者在参数接受和结果返回方面是完全一致的，区别在于 `predict_iter()` 返回的是一个 `generator`，能够逐步处理和获取预测结果，适合处理大型数据集或希望节省内存的场景。可以根据实际需求选择使用这两种方法中的任意一种。
 								以下是 `predict()` 方法的参数及其说明：
 								<table>
 								<thead>
 								<tr>
 								<th>参数</th>
 								<th>参数说明</th>
 								<th>参数类型</th>
 								<th>默认值</th>
 								</tr>
 								</thead>
 								<tr>
 								<td><code>input</code></td>
 								<td>待预测数据，支持多种输入类型，必填。
 								<ul>
-												Fix docs (#15303)

* Revise the English document

* Remove the device from the predict()

* replace general_formula_recognition_001.png

* modify docs and paddleocr/pp_structurev3.py

* modify docs

* modify docs load and use

* Modify GPU device

* modify docs

* modify docs_2

* modify docs-4

* modify docs-5
											
										
										
											2025-05-28 10:24:37 +08:00
+								<li><b>Python Var</b>：如 <code>numpy.ndarray</code> 表示的图像数据；</li>
-												[Docs] Fix documentation (#15477)

* modify docs0528

* modify docs0529

* modify docs0529-2

* modify docs0529-3

* change 960,max to 64,min

* Proofread Chinese, English documents. modify ocr.py and test_ocr.py

* modify pipeline to Pipeline

* check-code-style

* check-code-style-2

* modify doc_preprocessor.md

* fix check-code-style

* modify dict

* modify languages

* modify languages

* modify docs
											
										
										
											2025-06-05 17:43:57 +08:00
+								<li><b>str</b>：如图像文件或者PDF文件的本地路径：<code>/root/data/img.jpg</code>；<b>如URL链接</b>，如图像文件或PDF文件的网络URL：<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/demo_image/doc_test_rotated.jpg">示例</a>；<b>如本地目录</b>，该目录下需包含待预测图像，如本地路径：<code>/root/data/</code>(当前不支持目录中包含PDF文件的预测，PDF文件需要指定到具体文件路径)；</li>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<li><b>list</b>：列表元素需为上述类型数据，如<code>[numpy.ndarray, numpy.ndarray]</code>，<code>["/root/data/img1.jpg", "/root/data/img2.jpg"]</code>，<code>["/root/data1", "/root/data2"]</code>。</li>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</ul>
 								</td>
 								<td><code>Python Var|str|list</code></td>
-												Modify the information of pipeline_usage (#15188)

* Modify the information of pipeline_usage

* rerun CI
											
										
										
											2025-05-20 15:17:30 +08:00
+								<td></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tr>
 								<tr>
 								<td><code>use_doc_orientation_classify</code></td>
 								<td>是否在推理时使用文档方向分类模块。</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>bool|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								<tr>
 								<td><code>use_doc_unwarping</code></td>
 								<td>是否在推理时使用文本图像矫正模块。</td>
-												[docs] Update Performance Statistics in Docs (#15819)

* Fix docs

* update performance

* update performance

* Fix docs

* update performance

* update performance

* Update docs

* Update tensorrt docs

* update performance

* update performance

* update performance

* update performance

* update performance

* modify chart recognition

* modify seal recognition

* Add resource notice

* Fix mcp docs

* Fix doc

* Fix name

* update performance

* Fix docs

* Fix docs

* Refactor MCP server docs

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
											
										
										
											2025-06-26 20:32:46 +08:00
+								<td><code>bool|None</code></td>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								<td><code>None</code></td>
 								</tr>
 								</table>
 								（3）对预测结果进行处理，每个样本的预测结果均为对应的Result对象，且支持打印、保存为图片、保存为`json`文件的操作:
 								<table>
 								<thead>
 								<tr>
 								<th>方法</th>
 								<th>方法说明</th>
 								<th>参数</th>
 								<th>参数类型</th>
 								<th>参数说明</th>
 								<th>默认值</th>
 								</tr>
 								</thead>
 								<tr>
 								<td rowspan="3"><code>print()</code></td>
 								<td rowspan="3">打印结果到终端</td>
 								<td><code>format_json</code></td>
 								<td><code>bool</code></td>
 								<td>是否对输出内容进行使用 <code>JSON</code> 缩进格式化</td>
 								<td><code>True</code></td>
 								</tr>
 								<tr>
 								<td><code>indent</code></td>
 								<td><code>int</code></td>
 								<td>指定缩进级别，以美化输出的 <code>JSON</code> 数据，使其更具可读性，仅当 <code>format_json</code> 为 <code>True</code> 时有效</td>
 								<td>4</td>
 								</tr>
 								<tr>
 								<td><code>ensure_ascii</code></td>
 								<td><code>bool</code></td>
 								<td>控制是否将非 <code>ASCII</code> 字符转义为 <code>Unicode</code>。设置为 <code>True</code> 时，所有非 <code>ASCII</code> 字符将被转义；<code>False</code> 则保留原始字符，仅当<code>format_json</code>为<code>True</code>时有效</td>
 								<td><code>False</code></td>
 								</tr>
 								<tr>
 								<td rowspan="3"><code>save_to_json()</code></td>
 								<td rowspan="3">将结果保存为json格式的文件</td>
 								<td><code>save_path</code></td>
 								<td><code>str</code></td>
 								<td>保存的文件路径，当为目录时，保存文件命名与输入文件类型命名一致</td>
 								<td>无</td>
 								</tr>
 								<tr>
 								<td><code>indent</code></td>
 								<td><code>int</code></td>
 								<td>指定缩进级别，以美化输出的 <code>JSON</code> 数据，使其更具可读性，仅当 <code>format_json</code> 为 <code>True</code> 时有效</td>
 								<td>4</td>
 								</tr>
 								<tr>
 								<td><code>ensure_ascii</code></td>
 								<td><code>bool</code></td>
 								<td>控制是否将非 <code>ASCII</code> 字符转义为 <code>Unicode</code>。设置为 <code>True</code> 时，所有非 <code>ASCII</code> 字符将被转义；<code>False</code> 则保留原始字符，仅当<code>format_json</code>为<code>True</code>时有效</td>
 								<td><code>False</code></td>
 								</tr>
 								<tr>
 								<td><code>save_to_img()</code></td>
 								<td>将结果保存为图像格式的文件</td>
 								<td><code>save_path</code></td>
 								<td><code>str</code></td>
 								<td>保存的文件路径，支持目录或文件路径</td>
 								<td>无</td>
 								</tr>
 								</table>
 								- 调用`print()` 方法会将结果打印到终端，打印到终端的内容解释如下：
 								    - `input_path`: `(str)` 待预测图像的输入路径
 								    - `page_index`: `(Union[int, None])` 如果输入是PDF文件，则表示当前是PDF的第几页，否则为 `None`
 								    - `model_settings`: `(Dict[str, bool])` 配置产线所需的模型参数
 								        - `use_doc_orientation_classify`: `(bool)` 控制是否启用文档方向分类模块
 								        - `use_doc_unwarping`: `(bool)` 控制是否启用文本图像矫正模块
 								    - `angle`: `(int)` 文档方向分类的预测结果。启用时取值为[0,90,180,270]；未启用时为-1
 								- 调用`save_to_json()` 方法会将上述内容保存到指定的`save_path`中，如果指定为目录，则保存的路径为`save_path/{your_img_basename}.json`，如果指定为文件，则直接保存到该文件中。由于json文件不支持保存numpy数组，因此会将其中的`numpy.array`类型转换为列表形式。
 								- 调用`save_to_img()` 方法会将可视化结果保存到指定的`save_path`中，如果指定为目录，则保存的路径为`save_path/{your_img_basename}_doc_preprocessor_res_img.{your_img_extension}`，如果指定为文件，则直接保存到该文件中。(产线通常包含较多结果图片，不建议直接指定为具体的文件路径，否则多张图会被覆盖，仅保留最后一张图)
 								* 此外，也支持通过属性获取带结果的可视化图像和预测结果，具体如下：
 								<table>
 								<thead>
 								<tr>
 								<th>属性</th>
 								<th>属性说明</th>
 								</tr>
 								</thead>
 								<tr>
 								<td rowspan="1"><code>json</code></td>
 								<td rowspan="1">获取预测的 <code>json</code> 格式的结果</td>
 								</tr>
 								<tr>
 								<td rowspan="2"><code>img</code></td>
 								<td rowspan="2">获取格式为 <code>dict</code> 的可视化图像</td>
 								</tr>
 								</table>
 								- `json` 属性获取的预测结果为dict类型的数据，相关内容与调用 `save_to_json()` 方法保存的内容一致。
-												[Docs] Fix documentation (#15477)

* modify docs0528

* modify docs0529

* modify docs0529-2

* modify docs0529-3

* change 960,max to 64,min

* Proofread Chinese, English documents. modify ocr.py and test_ocr.py

* modify pipeline to Pipeline

* check-code-style

* check-code-style-2

* modify doc_preprocessor.md

* fix check-code-style

* modify dict

* modify languages

* modify languages

* modify docs
											
										
										
											2025-06-05 17:43:57 +08:00
+								- `img` 属性返回的预测结果是一个dict类型的数据。其中，键为 `preprocessed_img`，对应的值是 `Image.Image` 对象：用于显示 doc_preprocessor 结果的可视化图像。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
 								## 3. 开发集成/部署
 								如果产线可以达到您对产线推理速度和精度的要求，您可以直接进行开发集成/部署。
 								若您需要将产线直接应用在您的Python项目中，可以参考 [2.2 Python脚本方式](#22-python脚本方式集成)中的示例代码。
 								此外，PaddleOCR 也提供了其他两种部署方式，详细说明如下：
-												[Docs] Add upgrade notes and fix docs (#15198)

* Unify refs

* Fix extra newline

* Update mkldnn_blocklists

* Add upgrade notes

* Fix docs

* Add English upgrade notes

* Bump paddlex to 3.0.0

* Update upgrade notes
											
										
										
											2025-05-20 15:13:40 +08:00
+								🚀 高性能推理：在实际生产环境中，许多应用对部署策略的性能指标（尤其是响应速度）有着较严苛的标准，以确保系统的高效运行与用户体验的流畅性。为此，PaddleOCR 提供高性能推理功能，旨在对模型推理及前后处理进行深度性能优化，实现端到端流程的显著提速，详细的高性能推理流程请参考[高性能推理](../deployment/high_performance_inference.md)。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
-												[Docs] Add upgrade notes and fix docs (#15198)

* Unify refs

* Fix extra newline

* Update mkldnn_blocklists

* Add upgrade notes

* Fix docs

* Add English upgrade notes

* Bump paddlex to 3.0.0

* Update upgrade notes
											
										
										
											2025-05-20 15:13:40 +08:00
+								☁️ 服务化部署：服务化部署是实际生产环境中常见的一种部署形式。通过将推理功能封装为服务，客户端可以通过网络请求来访问这些服务，以获取推理结果。详细的产线服务化部署流程请参考[服务化部署](../deployment/serving.md)。
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
 								以下是基础服务化部署的API参考与多语言服务调用示例：
 								<details><summary>API参考</summary>
 								<p>对于服务提供的主要操作：</p>
 								<ul>
 								<li>HTTP请求方法为POST。</li>
 								<li>请求体和响应体均为JSON数据（JSON对象）。</li>
 								<li>当请求处理成功时，响应状态码为<code>200</code>，响应体的属性如下：</li>
 								</ul>
 								<table>
 								<thead>
 								<tr>
 								<th>名称</th>
 								<th>类型</th>
 								<th>含义</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
 								<td><code>logId</code></td>
 								<td><code>string</code></td>
 								<td>请求的UUID。</td>
 								</tr>
 								<tr>
 								<td><code>errorCode</code></td>
 								<td><code>integer</code></td>
 								<td>错误码。固定为<code>0</code>。</td>
 								</tr>
 								<tr>
 								<td><code>errorMsg</code></td>
 								<td><code>string</code></td>
 								<td>错误说明。固定为<code>"Success"</code>。</td>
 								</tr>
 								<tr>
 								<td><code>result</code></td>
 								<td><code>object</code></td>
 								<td>操作结果。</td>
 								</tr>
 								</tbody>
 								</table>
 								<ul>
 								<li>当请求处理未成功时，响应体的属性如下：</li>
 								</ul>
 								<table>
 								<thead>
 								<tr>
 								<th>名称</th>
 								<th>类型</th>
 								<th>含义</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
 								<td><code>logId</code></td>
 								<td><code>string</code></td>
 								<td>请求的UUID。</td>
 								</tr>
 								<tr>
 								<td><code>errorCode</code></td>
 								<td><code>integer</code></td>
 								<td>错误码。与响应状态码相同。</td>
 								</tr>
 								<tr>
 								<td><code>errorMsg</code></td>
 								<td><code>string</code></td>
 								<td>错误说明。</td>
 								</tr>
 								</tbody>
 								</table>
 								<p>服务提供的主要操作如下：</p>
 								<ul>
 								<li><b><code>infer</code></b></li>
 								</ul>
 								<p>获取图像文档图像预处理结果。</p>
 								<p><code>POST /document-preprocessing</code></p>
 								<ul>
 								<li>请求体的属性如下：</li>
 								</ul>
 								<table>
 								<thead>
 								<tr>
 								<th>名称</th>
 								<th>类型</th>
 								<th>含义</th>
 								<th>是否必填</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
 								<td><code>file</code></td>
 								<td><code>string</code></td>
 								<td>服务器可访问的图像文件或PDF文件的URL，或上述类型文件内容的Base64编码结果。默认对于超过10页的PDF文件，只有前10页的内容会被处理。<br /> 要解除页数限制，请在产线配置文件中添加以下配置：
 								<pre><code>Serving:
 								  extra:
 								    max_num_input_imgs: null
 								</code></pre>
 								</td>
 								<td>是</td>
 								</tr>
 								<tr>
 								<td><code>fileType</code></td>
 								<td><code>integer</code> | <code>null</code></td>
 								<td>文件类型。<code>0</code>表示PDF文件，<code>1</code>表示图像文件。若请求体无此属性，则将根据URL推断文件类型。</td>
 								<td>否</td>
 								</tr>
 								<tr>
-												[Feat] Add PP-DocTranslate and update docs (#15890)

* Add PP-DocTranslate and update docs

* Fix docs

* Add English doc

* Fix ut
											
										
										
											2025-06-28 23:13:32 +08:00
+								<td><code>useDocOrientationClassify</code></td>
 								<td><code>boolean</code> | <code>null</code></td>
 								<td>请参阅产线对象中 <code>predict</code> 方法的 <code>use_doc_orientation_classify</code> 参数相关说明。</td>
 								<td>否</td>
 								</tr>
 								<tr>
 								<td><code>useDocUnwarping</code></td>
 								<td><code>boolean</code> | <code>null</code></td>
 								<td>请参阅产线对象中 <code>predict</code> 方法的 <code>use_doc_unwarping</code> 参数相关说明。</td>
 								<td>否</td>
 								</tr>
 								<tr>
-												add visualize parameter (#15849)


											
										
										
											2025-06-25 14:10:03 +08:00
+								<td><code>visualize</code></td>
 								<td><code>boolean</code> | <code>null</code></td>
 								<td>是否返回可视化结果图以及处理过程中的中间图像等。
 								<ul style="margin: 0 0 0 1em; padding-left: 0em;">
 								<li>传入 <code>true</code>：返回图像。</li>
 								<li>传入 <code>false</code>：不返回图像。</li>
 								<li>若请求体中未提供该参数或传入 <code>null</code>：遵循产线配置文件<code>Serving.visualize</code> 的设置。</li>
 								</ul>
 								<br/>例如，在产线配置文件中添加如下字段：<br/>
 								<pre><code>Serving:
 								  visualize: False
 								</code></pre>
 								将默认不返回图像，通过请求体中的<code>visualize</code>参数可以覆盖默认行为。如果请求体和配置文件中均未设置（或请求体传入<code>null</code>、配置文件中未设置），则默认返回图像。
 								</td>
 								<td>否</td>
 								</tr>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</tbody>
 								</table>
 								<ul>
 								<li>请求处理成功时，响应体的<code>result</code>具有如下属性：</li>
 								</ul>
 								<table>
 								<thead>
 								<tr>
 								<th>名称</th>
 								<th>类型</th>
 								<th>含义</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
 								<td><code>docPreprocessingResults</code></td>
 								<td><code>object</code></td>
 								<td>文档图像预处理结果。数组长度为1（对于图像输入）或实际处理的文档页数（对于PDF输入）。对于PDF输入，数组中的每个元素依次表示PDF文件中实际处理的每一页的结果。</td>
 								</tr>
 								<tr>
 								<td><code>dataInfo</code></td>
 								<td><code>object</code></td>
 								<td>输入数据信息。</td>
 								</tr>
 								</tbody>
 								</table>
 								<p><code>docPreprocessingResults</code>中的每个元素为一个<code>object</code>，具有如下属性：</p>
 								<table>
 								<thead>
 								<tr>
 								<th>名称</th>
 								<th>类型</th>
 								<th>含义</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
 								<td><code>outputImage</code></td>
 								<td><code>string</code></td>
 								<td>经过预处理的图像。图像为PNG格式，使用Base64编码。</td>
 								</tr>
 								<tr>
 								<td><code>prunedResult</code></td>
 								<td><code>object</code></td>
 								<td>产线对象的 <code>predict</code> 方法生成结果的 JSON 表示中 <code>res</code> 字段的简化版本，其中去除了 <code>input_path</code> 和 <code>page_index</code> 字段。</td>
 								</tr>
 								<tr>
 								<td><code>docPreprocessingImage</code></td>
 								<td><code>string</code> ｜ <code>null</code></td>
 								<td>可视化结果图。图像为JPEG格式，使用Base64编码。</td>
 								</tr>
 								<tr>
 								<td><code>inputImage</code></td>
 								<td><code>string</code> ｜ <code>null</code></td>
 								<td>输入图像。图像为JPEG格式，使用Base64编码。</td>
 								</tr>
 								</tbody>
 								</table>
 								</details>
 								<details><summary>多语言调用服务示例</summary>
 								<details>
 								<summary>Python</summary>
 								<pre><code class="language-python">import base64
 								import requests
 								API_URL = "http://localhost:8080/document-preprocessing"
 								file_path = "./demo.jpg"
 								with open(file_path, "rb") as file:
 								    file_bytes = file.read()
 								    file_data = base64.b64encode(file_bytes).decode("ascii")
 								payload = {"file": file_data, "fileType": 1}
 								response = requests.post(API_URL, json=payload)
 								assert response.status_code == 200
 								result = response.json()["result"]
 								for i, res in enumerate(result["docPreprocessingResults"]):
 								    print(res["prunedResult"])
 								    output_img_path = f"out_{i}.png"
 								    with open(output_img_path, "wb") as f:
 								        f.write(base64.b64decode(res["outputImage"]))
 								    print(f"Output image saved at {output_img_path}")
 								</code></pre></details>
-												update multi languages for serve (#15739)


											
										
										
											2025-06-16 14:56:00 +08:00
 								<details><summary>C++</summary>
 								<pre><code class="language-cpp">#include &lt;iostream&gt;
 								#include &lt;fstream&gt;
 								#include &lt;vector&gt;
 								#include &lt;string&gt;
 								#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
 								#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
 								#include "base64.hpp" // https://github.com/tobiaslocker/base64
 								int main() {
 								    httplib::Client client("localhost", 8080);
 								    const std::string filePath = "./demo.jpg";
 								    std::ifstream file(filePath, std::ios::binary | std::ios::ate);
 								    if (!file) {
 								        std::cerr << "Error opening file: " << filePath << std::endl;
 								        return 1;
 								    }
 								    std::streamsize size = file.tellg();
 								    file.seekg(0, std::ios::beg);
 								    std::vector<char> buffer(size);
 								    if (!file.read(buffer.data(), size)) {
 								        std::cerr << "Error reading file." << std::endl;
 								        return 1;
 								    }
 								    std::string bufferStr(buffer.data(), static_cast<size_t>(size));
 								    std::string encodedFile = base64::to_base64(bufferStr);
 								    nlohmann::json jsonObj;
 								    jsonObj["file"] = encodedFile;
 								    jsonObj["fileType"] = 1;
 								    auto response = client.Post("/document-preprocessing", jsonObj.dump(), "application/json");
 								    if (response && response->status == 200) {
 								        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
 								        auto result = jsonResponse["result"];
 								        if (!result.is_object() || !result["docPreprocessingResults"].is_array()) {
 								            std::cerr << "Unexpected response format." << std::endl;
 								            return 1;
 								        }
 								        for (size_t i = 0; i < result["docPreprocessingResults"].size(); ++i) {
 								            auto res = result["docPreprocessingResults"][i];
 								            if (res.contains("prunedResult")) {
 								                std::cout << "Preprocessed result: " << res["prunedResult"].dump() << std::endl;
 								            }
 								            if (res.contains("outputImage")) {
 								                std::string outputImgPath = "out_" + std::to_string(i) + ".png";
 								                std::string decodedImage = base64::from_base64(res["outputImage"].get<std::string>());
 								                std::ofstream outFile(outputImgPath, std::ios::binary);
 								                if (outFile.is_open()) {
 								                    outFile.write(decodedImage.c_str(), decodedImage.size());
 								                    outFile.close();
 								                    std::cout << "Saved image: " << outputImgPath << std::endl;
 								                } else {
 								                    std::cerr << "Failed to write image: " << outputImgPath << std::endl;
 								                }
 								            }
 								        }
 								    } else {
 								        std::cerr << "Request failed." << std::endl;
 								        if (response) {
 								            std::cerr << "HTTP status: " << response->status << std::endl;
 								            std::cerr << "Response body: " << response->body << std::endl;
 								        }
 								        return 1;
 								    }
 								    return 0;
 								}
 								</code></pre></details>
 								<details><summary>Java</summary>
 								<pre><code class="language-java">import okhttp3.*;
 								import com.fasterxml.jackson.databind.ObjectMapper;
 								import com.fasterxml.jackson.databind.JsonNode;
 								import com.fasterxml.jackson.databind.node.ObjectNode;
 								import java.io.File;
 								import java.io.FileOutputStream;
 								import java.io.IOException;
 								import java.util.Base64;
 								public class Main {
 								    public static void main(String[] args) throws IOException {
 								        String API_URL = "http://localhost:8080/document-preprocessing";
 								        String imagePath = "./demo.jpg";
 								        File file = new File(imagePath);
 								        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
 								        String base64Image = Base64.getEncoder().encodeToString(fileContent);
 								        ObjectMapper objectMapper = new ObjectMapper();
 								        ObjectNode payload = objectMapper.createObjectNode();
 								        payload.put("file", base64Image);
 								        payload.put("fileType", 1);
 								        OkHttpClient client = new OkHttpClient();
 								        MediaType JSON = MediaType.get("application/json; charset=utf-8");
 								        RequestBody body = RequestBody.create(JSON, payload.toString());
 								        Request request = new Request.Builder()
 								                .url(API_URL)
 								                .post(body)
 								                .build();
 								        try (Response response = client.newCall(request).execute()) {
 								            if (response.isSuccessful()) {
 								                String responseBody = response.body().string();
 								                JsonNode root = objectMapper.readTree(responseBody);
 								                JsonNode result = root.get("result");
 								                JsonNode docPreprocessingResults = result.get("docPreprocessingResults");
 								                for (int i = 0; i < docPreprocessingResults.size(); i++) {
 								                    JsonNode item = docPreprocessingResults.get(i);
 								                    int finalI = i;
 								                    JsonNode prunedResult = item.get("prunedResult");
 								                    System.out.println("Pruned Result [" + i + "]: " + prunedResult.toString());
 								                    String outputImgBase64 = item.get("outputImage").asText();
 								                    byte[] outputImgBytes = Base64.getDecoder().decode(outputImgBase64);
 								                    String outputImgPath = "out_" + finalI + ".png";
 								                    try (FileOutputStream fos = new FileOutputStream(outputImgPath)) {
 								                        fos.write(outputImgBytes);
 								                        System.out.println("Saved output image: " + outputImgPath);
 								                    }
 								                    JsonNode inputImageNode = item.get("inputImage");
 								                    if (inputImageNode != null && !inputImageNode.isNull()) {
 								                        String inputImageBase64 = inputImageNode.asText();
 								                        byte[] inputImageBytes = Base64.getDecoder().decode(inputImageBase64);
 								                        String inputImgPath = "inputImage_" + i + ".jpg";
 								                        try (FileOutputStream fos = new FileOutputStream(inputImgPath)) {
 								                            fos.write(inputImageBytes);
 								                            System.out.println("Saved input image to: " + inputImgPath);
 								                        }
 								                    }
 								                }
 								            } else {
 								                System.err.println("Request failed with HTTP code: " + response.code());
 								            }
 								        }
 								    }
 								}
 								</code></pre></details>
 								<details><summary>Go</summary>
 								<pre><code class="language-go">package main
 								import (
 								    "bytes"
 								    "encoding/base64"
 								    "encoding/json"
 								    "fmt"
 								    "io/ioutil"
 								    "net/http"
 								    "os"
 								)
 								func main() {
 								    API_URL := "http://localhost:8080/document-preprocessing"
 								    filePath := "./demo.jpg"
 								    fileBytes, err := ioutil.ReadFile(filePath)
 								    if err != nil {
 								        fmt.Printf("Error reading file: %v\n", err)
 								        return
 								    }
 								    fileData := base64.StdEncoding.EncodeToString(fileBytes)
 								    payload := map[string]interface{}{
 								        "file":     fileData,
 								        "fileType": 1,
 								    }
 								    payloadBytes, err := json.Marshal(payload)
 								    if err != nil {
 								        fmt.Printf("Error marshaling payload: %v\n", err)
 								        return
 								    }
 								    client := &http.Client{}
 								    req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
 								    if err != nil {
 								        fmt.Printf("Error creating request: %v\n", err)
 								        return
 								    }
 								    req.Header.Set("Content-Type", "application/json")
 								    res, err := client.Do(req)
 								    if err != nil {
 								        fmt.Printf("Error sending request: %v\n", err)
 								        return
 								    }
 								    defer res.Body.Close()
 								    if res.StatusCode != http.StatusOK {
 								        fmt.Printf("Unexpected status code: %d\n", res.StatusCode)
 								        return
 								    }
 								    body, err := ioutil.ReadAll(res.Body)
 								    if err != nil {
 								        fmt.Printf("Error reading response body: %v\n", err)
 								        return
 								    }
 								    type DocPreprocessingResult struct {
 								        PrunedResult         map[string]interface{} `json:"prunedResult"`
 								        OutputImage          string                 `json:"outputImage"`
 								        DocPreprocessingImage *string               `json:"docPreprocessingImage"`
 								        InputImage           *string                `json:"inputImage"`
 								    }
 								    type Response struct {
 								        Result struct {
 								            DocPreprocessingResults []DocPreprocessingResult `json:"docPreprocessingResults"`
 								            DataInfo                interface{}              `json:"dataInfo"`
 								        } `json:"result"`
 								    }
 								    var respData Response
 								    if err := json.Unmarshal(body, &respData); err != nil {
 								        fmt.Printf("Error unmarshaling response: %v\n", err)
 								        return
 								    }
 								    for i, res := range respData.Result.DocPreprocessingResults {
 								        fmt.Printf("Result %d - prunedResult: %+v\n", i, res.PrunedResult)
 								        imgBytes, err := base64.StdEncoding.DecodeString(res.OutputImage)
 								        if err != nil {
 								            fmt.Printf("Error decoding outputImage at index %d: %v\n", i, err)
 								            continue
 								        }
 								        filename := fmt.Sprintf("out_%d.png", i)
 								        if err := os.WriteFile(filename, imgBytes, 0644); err != nil {
 								            fmt.Printf("Error saving image %s: %v\n", filename, err)
 								            continue
 								        }
 								        fmt.Printf("Saved output image to %s\n", filename)
 								    }
 								}
 								</code></pre></details>
 								<details><summary>C#</summary>
 								<pre><code class="language-csharp">using System;
 								using System.IO;
 								using System.Net.Http;
 								using System.Text;
 								using System.Threading.Tasks;
 								using Newtonsoft.Json.Linq;
 								class Program
 								{
 								    static readonly string API_URL = "http://localhost:8080/document-preprocessing";
 								    static readonly string inputFilePath = "./demo.jpg";
 								    static async Task Main(string[] args)
 								    {
 								        var httpClient = new HttpClient();
 								        byte[] fileBytes = File.ReadAllBytes(inputFilePath);
 								        string fileData = Convert.ToBase64String(fileBytes);
 								        var payload = new JObject
 								        {
 								            { "file", fileData },
 								            { "fileType", 1 }
 								        };
 								        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");
 								        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
 								        response.EnsureSuccessStatusCode();
 								        string responseBody = await response.Content.ReadAsStringAsync();
 								        JObject jsonResponse = JObject.Parse(responseBody);
 								        JArray docPreResults = (JArray)jsonResponse["result"]["docPreprocessingResults"];
 								        for (int i = 0; i < docPreResults.Count; i++)
 								        {
 								            var res = docPreResults[i];
 								            Console.WriteLine($"[{i}] prunedResult:\n{res["prunedResult"]}");
 								            string base64Image = res["outputImage"]?.ToString();
 								            if (!string.IsNullOrEmpty(base64Image))
 								            {
 								                string outputPath = $"out_{i}.png";
 								                byte[] imageBytes = Convert.FromBase64String(base64Image);
 								                File.WriteAllBytes(outputPath, imageBytes);
 								                Console.WriteLine($"Output image saved at {outputPath}");
 								            }
 								            else
 								            {
 								                Console.WriteLine($"outputImage at index {i} is null.");
 								            }
 								        }
 								    }
 								}
 								</code></pre></details>
 								<details><summary>Node.js</summary>
 								<pre><code class="language-js">const axios = require('axios');
 								const fs = require('fs');
 								const path = require('path');
 								const API_URL = 'http://localhost:8080/document-preprocessing';
 								const imagePath = './demo.jpg';
 								function encodeImageToBase64(filePath) {
 								  const bitmap = fs.readFileSync(filePath);
 								  return Buffer.from(bitmap).toString('base64');
 								}
 								const payload = {
 								  file: encodeImageToBase64(imagePath),
 								  fileType: 1
 								};
 								axios.post(API_URL, payload, {
 								  headers: {
 								    'Content-Type': 'application/json'
 								  },
 								  maxBodyLength: Infinity
 								})
 								.then((response) => {
 								  const results = response.data.result.docPreprocessingResults;
 								  results.forEach((res, index) => {
 								    console.log(`\n[${index}] prunedResult:`);
 								    console.log(res.prunedResult);
 								    const base64Image = res.outputImage;
 								    if (base64Image) {
 								      const outputImagePath = `out_${index}.png`;
 								      const imageBuffer = Buffer.from(base64Image, 'base64');
 								      fs.writeFileSync(outputImagePath, imageBuffer);
 								      console.log(`Output image saved at ${outputImagePath}`);
 								    } else {
 								      console.log(`outputImage at index ${index} is null.`);
 								    }
 								  });
 								})
 								.catch((error) => {
 								  console.error('API error:', error.message);
 								});
 								</code></pre></details>
 								<details><summary>PHP</summary>
 								<pre><code class="language-php">&lt;?php
 								$API_URL = "http://localhost:8080/document-preprocessing";
 								$image_path = "./demo.jpg";
 								$output_image_path = "./out_0.png";
 								$image_data = base64_encode(file_get_contents($image_path));
 								$payload = array("file" => $image_data, "fileType" => 1);
 								$ch = curl_init($API_URL);
 								curl_setopt($ch, CURLOPT_POST, true);
 								curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
 								curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
 								curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
 								$response = curl_exec($ch);
 								curl_close($ch);
 								$result = json_decode($response, true)["result"]["docPreprocessingResults"];
 								foreach ($result as $i => $item) {
 								    echo "[$i] prunedResult:\n";
 								    print_r($item["prunedResult"]);
 								    if (!empty($item["outputImage"])) {
 								        $output_image_path = "out_" . $i . ".png";
 								        file_put_contents($output_image_path, base64_decode($item["outputImage"]));
 								        echo "Output image saved at $output_image_path\n";
 								    } else {
 								        echo "No outputImage found for item $i\n";
 								    }
 								}
 								?&gt;
 								</code></pre></details>
-												add new docs (pipelines, modules, legacy) (#15096)

* add ocr doc

* add docs

* fix

* add pipeline docs

* add module docs

* update the descriptions of parameters

* update

* update the description of  predict_iter

* update

* delete 2.2python脚本

* add char_recognition and region_detection

* modify in predict

* remove redundant 2.2 Python scripts

* modify use_wired_table_cells_trans_to_html

* add use_chart_recognition and use_region_detection

* add information

* add use_orc_model

* add legacy docs

* update

---------

Co-authored-by: guoshengjian <guoshengjian@baidu.com>
											
										
										
											2025-05-18 21:09:53 +08:00
+								</details>
 								<br/>
 								## 4. 二次开发
 								如果文档图像预处理产线提供的默认模型权重在您的场景中，精度或速度不满意，您可以尝试利用<b>您自己拥有的特定领域或应用场景的数据</b>对现有模型进行进一步的<b>微调</b>，以提升文档图像预处理产线的在您的场景中的识别效果。
-												all_docs_review_final (#15221)

* all_docs_review_final

* all_docs_review_final
											
										
										
											2025-05-20 17:11:58 +08:00
+								### 4.1 模型微调
-												Docs paddleocr fix baijiahao (#15182)

* review docs

* add doc_preprocessor.en.md doc_img_orientation_class.en.md and fix some docs

* fix review

* 根据意见修改，同时补充text_recognition.en.md

* 修改markdown格式

* fix error

* fix doc

* 根据意见修改，同时补充text_recognition.en.md

* 修改markdown格式

* fix error

* fix doc

* fix

---------

Co-authored-by: baijiahao01 <baijiahao01@MacBook-Pro-5.local>
Co-authored-by: Sunflower7788 <sunting13@baidu.com>
											
										
										
											2025-05-20 02:53:42 +08:00
 								由于文档图像预处理产线包含若干模块，模型产线的效果如果不及预期，可能来自于其中任何一个模块。您可以对识别效果差的图片进行分析，进而确定是哪个模块存在问题，并参考以下表格中对应的微调教程链接进行模型微调。
 								<table>
 								<thead>
 								<tr>
 								<th>情形</th>
 								<th>微调模块</th>
 								<th>微调参考链接</th>
 								</tr>
 								</thead>
 								<tbody>
 								<tr>
 								<td>整图旋转矫正不准</td>
 								<td>文档图像方向分类模块</td>
-												fix_docs_add_links (#15214)

* fix_docs_add_links

* fix_docs_add_links
											
										
										
											2025-05-20 15:56:36 +08:00
+								<td><a href="https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/doc_img_orientation_classification.html#_5">链接</a></td>
-												Docs paddleocr fix baijiahao (#15182)

* review docs

* add doc_preprocessor.en.md doc_img_orientation_class.en.md and fix some docs

* fix review

* 根据意见修改，同时补充text_recognition.en.md

* 修改markdown格式

* fix error

* fix doc

* 根据意见修改，同时补充text_recognition.en.md

* 修改markdown格式

* fix error

* fix doc

* fix

---------

Co-authored-by: baijiahao01 <baijiahao01@MacBook-Pro-5.local>
Co-authored-by: Sunflower7788 <sunting13@baidu.com>
											
										
										
											2025-05-20 02:53:42 +08:00
+								</tr>
 								<tr>
 								<td>图像扭曲矫正不准</td>
 								<td>文本图像矫正模块</td>
 								<td>暂不支持微调</td>
 								</tr>
 								</tbody>
 								</table>