[Docs] Add OCR speed data (#15412)

* Add speed data
* Temporarily remove paddleocr version
* Update
* Move to algorithm
* Temp
2025-12-24 21:48:23 +00:00 · 2025-05-30 14:36:45 +08:00 · 2025-05-30 14:36:45 +08:00 · 3ea2fb9086
commit 3ea2fb9086
parent 878a076e98
2 changed files with 259 additions and 2 deletions
--- a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.en.md
+++ b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.en.md
@@ -201,9 +201,137 @@ A single model can cover multiple languages and text types, with recognition acc

 <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/PP-OCRv5/algorithm_ppocrv5_demo.pdf">More Demos</a>

+## Reference Data for Inference Performance
+
+Test Environment:
+
+- NVIDIA Tesla V100
+- Intel Xeon Gold 6271C
+- PaddlePaddle 3.0.0
+
+Tested on 200 images (including both general and document images). During testing, images are read from disk, so the image reading time and other associated overhead are also included in the total time consumption. If the images are preloaded into memory, the average time per image can be further reduced by approximately 25 ms.
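The effect of disk reads on the reported time can be illustrated with a small timing harness. This is only a sketch: `fake_ocr` is a hypothetical stand-in for the real OCR call, and the dummy files stand in for the 200 test images, so the numbers it prints say nothing about PP-OCR itself.

```python
import tempfile
import time
from pathlib import Path


def fake_ocr(image_bytes):
    """Hypothetical stand-in for the OCR call; returns the payload size."""
    return len(image_bytes)


def time_per_image(paths, infer, preload=False):
    """Return the average wall-clock seconds per image.

    With preload=False the file read is inside the timed region, as in the
    benchmark described above. With preload=True all files are read first,
    so only the inference call is timed.
    """
    if preload:
        blobs = [p.read_bytes() for p in paths]  # read outside the timed region
        start = time.perf_counter()
        for blob in blobs:
            infer(blob)
    else:
        start = time.perf_counter()
        for p in paths:
            infer(p.read_bytes())  # read inside the timed region
    return (time.perf_counter() - start) / len(paths)


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(5):
        p = Path(tmp) / f"img_{i}.bin"
        p.write_bytes(b"\0" * 1024)  # dummy payload, not a real image
        paths.append(p)
    from_disk = time_per_image(paths, fake_ocr)
    preloaded = time_per_image(paths, fake_ocr, preload=True)

print(f"from disk: {from_disk * 1e3:.3f} ms/image")
print(f"preloaded: {preloaded * 1e3:.3f} ms/image")
```

Replacing `fake_ocr` with a real pipeline call and the dummy files with real images would give the two measurement modes that the paragraph contrasts.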
+
+Unless otherwise specified:
+
+- PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models are used.
+- Document orientation classification, image correction, and text line orientation classification are not used.
+- `text_det_limit_type` is set to `"min"` and `text_det_limit_side_len` to `736`.
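Each benchmark configuration corresponds to a set of pipeline constructor arguments. The sketch below assembles them for the `mobile_min_736` configuration of PP-OCRv5 that appears in the tables further down. The helper `build_ocr_kwargs` is hypothetical, and the keyword names are assumed to match the PaddleOCR 3.x `PaddleOCR` constructor; the helper only builds a dictionary, so it runs without PaddleOCR installed.

```python
def build_ocr_kwargs(version="v5", tier="mobile", limit_type="min",
                     side_len=736, device="gpu", enable_hpi=False):
    """Assemble keyword arguments for one benchmark configuration.

    Hypothetical helper; key names are assumed to follow PaddleOCR 3.x.
    """
    if limit_type not in ("min", "max"):
        raise ValueError("limit_type must be 'min' or 'max'")
    prefix = f"PP-OCR{version}_{tier}"
    return {
        "text_detection_model_name": f"{prefix}_det",
        "text_recognition_model_name": f"{prefix}_rec",
        # Auxiliary features are switched off in the baseline runs.
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": False,
        "use_textline_orientation": False,
        "text_det_limit_type": limit_type,
        "text_det_limit_side_len": side_len,
        "device": device,
        "enable_hpi": enable_hpi,
    }


kwargs = build_ocr_kwargs()
print(kwargs["text_detection_model_name"])
print(kwargs["text_det_limit_type"], kwargs["text_det_limit_side_len"])

# With PaddleOCR installed, the dictionary would be passed on like this:
# from paddleocr import PaddleOCR
# ocr = PaddleOCR(**kwargs)
# result = ocr.predict("path/to/image.png")
```

Calling `build_ocr_kwargs(tier="server", limit_type="max", side_len=960)` would describe the `server_max_960` configuration in the same way.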
+
+### 1. Comparison of Inference Performance Between PP-OCRv5 and PP-OCRv4
+
+| Config | Description                                                  |
+| --------------- | ------------------------------------------------------------ |
+| v5_mobile      | Uses PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models. |
+| v4_mobile      | Uses PP-OCRv4_mobile_det and PP-OCRv4_mobile_rec models. |
+| v5_server      | Uses PP-OCRv5_server_det and PP-OCRv5_server_rec models. |
+| v4_server      | Uses PP-OCRv4_server_det and PP-OCRv4_server_rec models. |
+
+**GPU, without high-performance inference:**
+
+| Config     | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Peak VRAM (MB) | Avg VRAM (MB) |
+| ---------- | ------------------ | ------------- | ----------------- | ------------- | ------------ | -------------- | ------------- |
+| v5_mobile | 0.56               | 1162          | 106.02            | 1576.43       | 1420.83      | 4342.00        | 3258.95       |
+| v4_mobile | 0.27               | 2246          | 111.20            | 1392.22       | 1318.76      | 1304.00        | 1166.46       |
+| v5_server | 0.70               | 929           | 105.31            | 1634.85       | 1428.55      | 5402.00        | 4685.13       |
+| v4_server | 0.44               | 1418          | 106.96            | 1455.34       | 1346.95      | 6760.00        | 5817.46       |
+
+**GPU, with high-performance inference:**
+
+| Config     | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Peak VRAM (MB) | Avg VRAM (MB) |
+| ---------- | ------------------ | ------------- | ----------------- | ------------- | ------------ | -------------- | ------------- |
+| v5_mobile | 0.50               | 1301          | 106.50            | 1338.12       | 1155.86      | 4112.00        | 3536.36       |
+| v4_mobile | 0.21               | 2887          | 114.09            | 1113.27       | 1054.46      | 2072.00        | 1840.59       |
+| v5_server | 0.60               | 1084          | 105.73            | 1980.73       | 1776.20      | 12150.00       | 11849.40      |
+| v4_server | 0.36               | 1687          | 104.15            | 1186.42       | 1065.67      | 13058.00       | 12679.00      |
+
+**CPU, without high-performance inference:**
+
+| Config     | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| ---------- | ------------------ | ------------- | ----------------- | ------------- | ------------ |
+| v5_mobile | 1.43               | 455           | 798.93            | 11695.40      | 6829.09      |
+| v4_mobile | 1.09               | 556           | 813.16            | 11996.30      | 6834.25      |
+| v5_server | 3.79               | 172           | 799.24            | 50216.00      | 27902.40     |
+| v4_server | 4.22               | 148           | 803.74            | 51428.70      | 28593.60     |
+
+**CPU, with high-performance inference:**
+
+| Config     | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| ---------- | ------------------ | ------------- | ----------------- | ------------- | ------------ |
+| v5_mobile | 1.14               | 571           | 339.68            | 3245.17       | 2560.55      |
+| v4_mobile | 0.68               | 892           | 443.00            | 3057.38       | 2329.44      |
+| v5_server | 3.56               | 183           | 797.03            | 45664.70      | 26905.90     |
+| v4_server | 4.22               | 148           | 803.74            | 51428.70      | 28593.60     |
+
+> Note: PP-OCRv5 uses a larger dictionary in the recognition model, which increases inference time and causes slower performance compared to PP-OCRv4.
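To put the note in numbers, the ratios below are computed from the "Avg Time/Image (s)" column of the four tables in this section. Only the arithmetic is new; the setting labels (`gpu`, `gpu_hpi`, and so on) are names chosen for this sketch.

```python
# Avg Time/Image (s), copied from the four tables in this section.
avg_time = {
    "gpu":     {"v5_mobile": 0.56, "v4_mobile": 0.27, "v5_server": 0.70, "v4_server": 0.44},
    "gpu_hpi": {"v5_mobile": 0.50, "v4_mobile": 0.21, "v5_server": 0.60, "v4_server": 0.36},
    "cpu":     {"v5_mobile": 1.43, "v4_mobile": 1.09, "v5_server": 3.79, "v4_server": 4.22},
    "cpu_hpi": {"v5_mobile": 1.14, "v4_mobile": 0.68, "v5_server": 3.56, "v4_server": 4.22},
}


def v5_over_v4(setting, tier):
    """Ratio of PP-OCRv5 to PP-OCRv4 time per image; above 1 means v5 is slower."""
    row = avg_time[setting]
    return row[f"v5_{tier}"] / row[f"v4_{tier}"]


for setting in avg_time:
    for tier in ("mobile", "server"):
        print(f"{setting:8s} {tier:6s} v5/v4 = {v5_over_v4(setting, tier):.2f}")
```

On this data the ratio is above 1 for every mobile configuration and for the server models on GPU, while on CPU the v5_server time is lower than the v4_server time.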
+
+### 2. Impact of Auxiliary Features on PP-OCRv5 Inference Performance
+
+| Config | Description                                                                                               |
+| --------------- | --------------------------------------------------------------------------------------------------------- |
+| base            | No document orientation classification, no image correction, no text line orientation classification.     |
+| with_textline  | Includes text line orientation classification only.                                                       |
+| with_all       | Includes document orientation classification, image correction, and text line orientation classification. |
+
+**GPU, without high-performance inference:**
+
+| Config         | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Peak VRAM (MB) | Avg VRAM (MB) |
+| -------------- | ------------------ | ------------- | ----------------- | ------------- | ------------ | -------------- | ------------- |
+| base           | 0.56               | 1162          | 106.02            | 1576.43       | 1420.83      | 4342.00        | 3258.95       |
+| with_textline | 0.60               | 1083          | 105.59            | 1715.65       | 1510.83      | 4342.00        | 3266.05       |
+| with_all      | 1.01               | 605           | 104.89            | 1949.11       | 1612.00      | 2624.00        | 2210.15       |
+
+**CPU, without high-performance inference:**
+
+| Config         | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| -------------- | ------------------ | ------------- | ----------------- | ------------- | ------------ |
+| base           | 1.43               | 455           | 798.93            | 11695.40      | 6829.09      |
+| with_textline | 1.43               | 454           | 801.90            | 11994.30      | 6947.94      |
+| with_all      | 1.90               | 320           | 642.48            | 11710.80      | 6944.01      |
+
+> Note: Auxiliary features such as image unwarping affect end-to-end inference accuracy, so enabling more auxiliary features does not necessarily increase resource usage.
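The three configurations can be written as flag combinations, and the time overhead relative to `base` follows from the "Avg Time/Image (s)" column above. This is a sketch: the flag names are assumed to match the PaddleOCR 3.x constructor arguments, and the timing values are copied from the two tables in this section.

```python
# Flag names are assumed to follow PaddleOCR 3.x constructor arguments.
AUX_CONFIGS = {
    "base": {
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": False,
        "use_textline_orientation": False,
    },
    "with_textline": {
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": False,
        "use_textline_orientation": True,
    },
    "with_all": {
        "use_doc_orientation_classify": True,
        "use_doc_unwarping": True,
        "use_textline_orientation": True,
    },
}

# Avg Time/Image (s), copied from the GPU and CPU tables above.
AVG_TIME = {
    "gpu": {"base": 0.56, "with_textline": 0.60, "with_all": 1.01},
    "cpu": {"base": 1.43, "with_textline": 1.43, "with_all": 1.90},
}


def overhead_pct(device, config):
    """Extra time per image relative to the base configuration, in percent."""
    base = AVG_TIME[device]["base"]
    return (AVG_TIME[device][config] / base - 1.0) * 100.0


for device, rows in AVG_TIME.items():
    for config in rows:
        enabled = [name for name, on in AUX_CONFIGS[config].items() if on]
        print(f"{device} {config:14s} +{overhead_pct(device, config):5.1f}%  enabled={enabled}")
```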
+
+### 3. Impact of Input Scaling Strategy in Text Detection Module on PP-OCRv5 Inference Performance
+
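+| Config            | Description                                                                                                  |
+| ----------------- | ------------------------------------------------------------------------------------------------------------ |
+| mobile_min_1280 | PP-OCRv5_mobile models, `text_det_limit_type="min"`, `text_det_limit_side_len=1280`.                         |
+| mobile_min_736  | PP-OCRv5_mobile models, `text_det_limit_type="min"`, `text_det_limit_side_len=736` (the default settings).  |
+| mobile_max_960  | PP-OCRv5_mobile models, `text_det_limit_type="max"`, `text_det_limit_side_len=960`.                          |
+| mobile_max_640  | PP-OCRv5_mobile models, `text_det_limit_type="max"`, `text_det_limit_side_len=640`.                          |
+| server_min_1280 | PP-OCRv5_server models, `text_det_limit_type="min"`, `text_det_limit_side_len=1280`.                         |
+| server_min_736  | PP-OCRv5_server models, `text_det_limit_type="min"`, `text_det_limit_side_len=736` (the default scaling).   |
+| server_max_960  | PP-OCRv5_server models, `text_det_limit_type="max"`, `text_det_limit_side_len=960`.                          |
+| server_max_640  | PP-OCRv5_server models, `text_det_limit_type="max"`, `text_det_limit_side_len=640`.                          |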
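The two limit types change how large the detector input becomes, which is what drives the timing differences in this section. The function below is a simplified sketch of that scaling rule under stated assumptions: `"min"` upscales until the shorter side reaches the limit, `"max"` downscales until the longer side fits the limit, and both sides are then rounded to a multiple of 32. The name `det_resize_shape` is hypothetical, and the real preprocessing in PaddleOCR may differ in rounding and in additional size caps.

```python
def det_resize_shape(h, w, limit_type, limit_side_len, multiple=32):
    """Return the (height, width) the detector would see, under the assumptions above."""
    if limit_type == "max":
        # Shrink only when the longer side exceeds the limit.
        ratio = limit_side_len / max(h, w) if max(h, w) > limit_side_len else 1.0
    elif limit_type == "min":
        # Enlarge only when the shorter side is below the limit.
        ratio = limit_side_len / min(h, w) if min(h, w) < limit_side_len else 1.0
    else:
        raise ValueError("limit_type must be 'min' or 'max'")
    new_h = max(int(round(h * ratio / multiple)) * multiple, multiple)
    new_w = max(int(round(w * ratio / multiple)) * multiple, multiple)
    return new_h, new_w


# A 600 x 1000 (height x width) image under two of the benchmarked settings.
for limit_type, side_len in (("min", 736), ("max", 960)):
    new_h, new_w = det_resize_shape(600, 1000, limit_type, side_len)
    print(f"{limit_type}_{side_len}: {new_h} x {new_w} = {new_h * new_w} pixels")
```

For this example image, the `min` setting yields a larger detector input than the `max` setting, which is consistent with the `min` rows being slower than the `max` rows in the tables.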
+
+**GPU, without high-performance inference:**
+
+| Config            | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Peak VRAM (MB) | Avg VRAM (MB) |
+| ----------------- | ------------------ | ------------- | ----------------- | ------------- | ------------ | -------------- | ------------- |
+| mobile_min_1280 | 0.61               | 1071          | 109.12            | 1663.71       | 1439.72      | 4202.00        | 3550.32       |
+| mobile_min_736  | 0.56               | 1162          | 106.02            | 1576.43       | 1420.83      | 4342.00        | 3258.95       |
+| mobile_max_960  | 0.48               | 1313          | 103.49            | 1587.25       | 1395.48      | 2642.00        | 2319.03       |
+| mobile_max_640  | 0.42               | 1436          | 103.07            | 1651.14       | 1422.62      | 2530.00        | 2149.11       |
+| server_min_1280 | 0.82               | 795           | 107.17            | 1678.16       | 1428.94      | 10368.00       | 8320.43       |
+| server_min_736  | 0.70               | 929           | 105.31            | 1634.85       | 1428.55      | 5402.00        | 4685.13       |
+| server_max_960  | 0.59               | 1073          | 103.03            | 1590.19       | 1383.62      | 2928.00        | 2079.47       |
+| server_max_640  | 0.54               | 1099          | 102.63            | 1602.09       | 1416.49      | 3152.00        | 2737.81       |
+
+**CPU, without high-performance inference:**
+
+| Config            | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| ----------------- | ------------------ | ------------- | ----------------- | ------------- | ------------ |
+| mobile_min_1280 | 1.64               | 398           | 799.45            | 12344.10      | 7100.60      |
+| mobile_min_736  | 1.43               | 455           | 798.93            | 11695.40      | 6829.09      |
+| mobile_max_960  | 1.21               | 521           | 800.13            | 11099.10      | 6369.49      |
+| mobile_max_640  | 1.01               | 597           | 802.52            | 9585.48       | 5573.52      |
+| server_min_1280 | 4.48               | 145           | 800.49            | 50683.10      | 28273.30     |
+| server_min_736  | 3.79               | 172           | 799.24            | 50216.00      | 27902.40     |
+| server_max_960  | 2.67               | 237           | 797.63            | 49362.50      | 26075.60     |
+| server_max_640  | 2.36               | 251           | 795.18            | 45656.10      | 24900.80     |
+
 # Deployment and Secondary Development
 * **Multiple System Support**: Compatible with mainstream operating systems including Windows, Linux, and Mac.
 * **Multiple Hardware Support**: Besides NVIDIA GPUs, it also supports inference and deployment on Intel CPU, Kunlun chips, Ascend, and other new hardware.
 * **High-Performance Inference Plugin**: Recommended to combine with high-performance inference plugins to further improve inference speed. See [High-Performance Inference Guide](../../deployment/high_performance_inference.md) for details.
 * **Service Deployment**: Supports highly stable service deployment solutions. See [Service Deployment Guide](../../deployment/serving.md) for details.
-* **Secondary Development Capability**: Supports custom dataset training, dictionary extension, and model fine-tuning. Example: To add Korean recognition, you can extend the dictionary and fine-tune the model, seamlessly integrating into existing production lines. See [Text Detection Module Usage Tutorial](../../module_usage/text_detection.en.md) and [Text Recognition Module Usage Tutorial](../../module_usage/text_recognition.en.md) for details.
+* **Secondary Development Capability**: Supports custom dataset training, dictionary extension, and model fine-tuning. Example: To add Korean recognition, you can extend the dictionary and fine-tune the model, seamlessly integrating into existing pipelines. See [Text Detection Module Usage Tutorial](../../module_usage/text_detection.en.md) and [Text Recognition Module Usage Tutorial](../../module_usage/text_recognition.en.md) for details.
--- a/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.md
+++ b/docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.md
@@ -203,7 +203,136 @@

 <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/doc_images/PP-OCRv5/algorithm_ppocrv5_demo.pdf">More Demos</a>

-# IV. Deployment and Secondary Development
+## IV. Reference Data for Inference Performance
+
+Test Environment:
+
+- NVIDIA Tesla V100
+- Intel Xeon Gold 6271C
+- PaddlePaddle 3.0.0
+
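+Tested on 200 images (including both general and document images). During testing, images are read from disk, so the image reading time and other associated overhead are also included in the total time consumption. If the images are preloaded into memory, the average time per image can be further reduced by approximately 25 ms.
+
+Unless otherwise specified:
+
+- PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models are used.
+- Document orientation classification, text image unwarping, and text line orientation classification are not used.
+- `text_det_limit_type` is set to `"min"` and `text_det_limit_side_len` to `736`.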
+
+### 1. Comparison of Inference Performance Between PP-OCRv5 and PP-OCRv4
+
+| Config | Description |
+| --- | --- |
+| v5_mobile | Uses PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models. |
+| v4_mobile | Uses PP-OCRv4_mobile_det and PP-OCRv4_mobile_rec models. |
+| v5_server | Uses PP-OCRv5_server_det and PP-OCRv5_server_rec models. |
+| v4_server | Uses PP-OCRv4_server_det and PP-OCRv4_server_rec models. |
+
+**GPU, without high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Avg GPU Usage (%) | Peak VRAM (MB) | Avg VRAM (MB) |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| v5_mobile | 0.56 | 1162 | 106.02 | 1576.43 | 1420.83 | 18.95 | 4342.00 | 3258.95 |
+| v4_mobile | 0.27 | 2246 | 111.20 | 1392.22 | 1318.76 | 28.90 | 1304.00 | 1166.46 |
+| v5_server | 0.70 | 929 | 105.31 | 1634.85 | 1428.55 | 36.21 | 5402.00 | 4685.13 |
+| v4_server | 0.44 | 1418 | 106.96 | 1455.34 | 1346.95 | 58.82 | 6760.00 | 5817.46 |
+
+**GPU, with high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Avg GPU Usage (%) | Peak VRAM (MB) | Avg VRAM (MB) |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| v5_mobile | 0.50 | 1301 | 106.50 | 1338.12 | 1155.86 | 11.97 | 4112.00 | 3536.36 |
+| v4_mobile | 0.21 | 2887 | 114.09 | 1113.27 | 1054.46 | 15.22 | 2072.00 | 1840.59 |
+| v5_server | 0.60 | 1084 | 105.73 | 1980.73 | 1776.20 | 22.10 | 12150.00 | 11849.40 |
+| v4_server | 0.36 | 1687 | 104.15 | 1186.42 | 1065.67 | 38.12 | 13058.00 | 12679.00 |
+
+**CPU, without high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| --- | --- | --- | --- | --- | --- |
+| v5_mobile | 1.43 | 455 | 798.93 | 11695.40 | 6829.09 |
+| v4_mobile | 1.09 | 556 | 813.16 | 11996.30 | 6834.25 |
+| v5_server | 3.79 | 172 | 799.24 | 50216.00 | 27902.40 |
+| v4_server | 4.22 | 148 | 803.74 | 51428.70 | 28593.60 |
+
+**CPU, with high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| --- | --- | --- | --- | --- | --- |
+| v5_mobile | 1.14 | 571 | 339.68 | 3245.17 | 2560.55 |
+| v4_mobile | 0.68 | 892 | 443.00 | 3057.38 | 2329.44 |
+| v5_server | 3.56 | 183 | 797.03 | 45664.70 | 26905.90 |
+| v4_server | 4.22 | 148 | 803.74 | 51428.70 | 28593.60 |
+
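+> Note: The PP-OCRv5 recognition model uses a larger dictionary, which requires more inference time and makes PP-OCRv5 slower than PP-OCRv4.
+
+### 2. Impact of Auxiliary Features on PP-OCRv5 Inference Performance
+
+| Config | Description |
+| --- | --- |
+| base | No document orientation classification, no text image unwarping, no text line orientation classification. |
+| with_textline | Uses text line orientation classification; no document orientation classification or text image unwarping. |
+| with_all | Uses document orientation classification, text image unwarping, and text line orientation classification. |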
+
+**GPU, without high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Avg GPU Usage (%) | Peak VRAM (MB) | Avg VRAM (MB) |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| base | 0.56 | 1162 | 106.02 | 1576.43 | 1420.83 | 18.95 | 4342.00 | 3258.95 |
+| with_textline | 0.60 | 1083 | 105.59 | 1715.65 | 1510.83 | 18.48 | 4342.00 | 3266.05 |
+| with_all | 1.01 | 605 | 104.89 | 1949.11 | 1612.00 | 10.85 | 2624.00 | 2210.15 |
+
+**CPU, without high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| --- | --- | --- | --- | --- | --- |
+| base | 1.43 | 455 | 798.93 | 11695.40 | 6829.09 |
+| with_textline | 1.43 | 454 | 801.90 | 11994.30 | 6947.94 |
+| with_all | 1.90 | 320 | 642.48 | 11710.80 | 6944.01 |
+
+> Note: Auxiliary features such as text image unwarping affect end-to-end inference accuracy, so using more auxiliary features does not necessarily mean higher resource usage.
+
+### 3. Impact of Input Scaling Strategy in Text Detection Module on PP-OCRv5 Inference Performance
+
+| Config | Description |
+| --- | --- |
+| mobile_min_1280 | Uses PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models, with `text_det_limit_type` set to `"min"` and `text_det_limit_side_len` set to `1280`. |
+| mobile_min_736 | Uses PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models, with `text_det_limit_type` set to `"min"` and `text_det_limit_side_len` set to `736`. |
+| mobile_max_960 | Uses PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models, with `text_det_limit_type` set to `"max"` and `text_det_limit_side_len` set to `960`. |
+| mobile_max_640 | Uses PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec models, with `text_det_limit_type` set to `"max"` and `text_det_limit_side_len` set to `640`. |
+| server_min_1280 | Uses PP-OCRv5_server_det and PP-OCRv5_server_rec models, with `text_det_limit_type` set to `"min"` and `text_det_limit_side_len` set to `1280`. |
+| server_min_736 | Uses PP-OCRv5_server_det and PP-OCRv5_server_rec models, with `text_det_limit_type` set to `"min"` and `text_det_limit_side_len` set to `736`. |
+| server_max_960 | Uses PP-OCRv5_server_det and PP-OCRv5_server_rec models, with `text_det_limit_type` set to `"max"` and `text_det_limit_side_len` set to `960`. |
+| server_max_640 | Uses PP-OCRv5_server_det and PP-OCRv5_server_rec models, with `text_det_limit_type` set to `"max"` and `text_det_limit_side_len` set to `640`. |
+
+**GPU, without high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) | Avg GPU Usage (%) | Peak VRAM (MB) | Avg VRAM (MB) |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| mobile_min_1280 | 0.61 | 1071 | 109.12 | 1663.71 | 1439.72 | 19.27 | 4202.00 | 3550.32 |
+| mobile_min_736 | 0.56 | 1162 | 106.02 | 1576.43 | 1420.83 | 18.95 | 4342.00 | 3258.95 |
+| mobile_max_960 | 0.48 | 1313 | 103.49 | 1587.25 | 1395.48 | 19.37 | 2642.00 | 2319.03 |
+| mobile_max_640 | 0.42 | 1436 | 103.07 | 1651.14 | 1422.62 | 18.95 | 2530.00 | 2149.11 |
+| server_min_1280 | 0.82 | 795 | 107.17 | 1678.16 | 1428.94 | 40.43 | 10368.00 | 8320.43 |
+| server_min_736 | 0.70 | 929 | 105.31 | 1634.85 | 1428.55 | 36.21 | 5402.00 | 4685.13 |
+| server_max_960 | 0.59 | 1073 | 103.03 | 1590.19 | 1383.62 | 33.42 | 2928.00 | 2079.47 |
+| server_max_640 | 0.54 | 1099 | 102.63 | 1602.09 | 1416.49 | 30.77 | 3152.00 | 2737.81 |
+
+**CPU, without high-performance inference:**
+
+| Config | Avg Time/Image (s) | Avg Chars/sec | Avg CPU Usage (%) | Peak RAM (MB) | Avg RAM (MB) |
+| --- | --- | --- | --- | --- | --- |
+| mobile_min_1280 | 1.64 | 398 | 799.45 | 12344.10 | 7100.60 |
+| mobile_min_736 | 1.43 | 455 | 798.93 | 11695.40 | 6829.09 |
+| mobile_max_960 | 1.21 | 521 | 800.13 | 11099.10 | 6369.49 |
+| mobile_max_640 | 1.01 | 597 | 802.52 | 9585.48 | 5573.52 |
+| server_min_1280 | 4.48 | 145 | 800.49 | 50683.10 | 28273.30 |
+| server_min_736 | 3.79 | 172 | 799.24 | 50216.00 | 27902.40 |
+| server_max_960 | 2.67 | 237 | 797.63 | 49362.50 | 26075.60 |
+| server_max_640 | 2.36 | 251 | 795.18 | 45656.10 | 24900.80 |
+
+
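+# V. Deployment and Secondary Development
 * **Multiple System Support**: Compatible with mainstream operating systems including Windows, Linux, and Mac.
 * **Multiple Hardware Support**: Besides NVIDIA GPUs, it also supports inference and deployment on Intel CPU, Kunlun chips, Ascend, and other new hardware.
 * **High-Performance Inference Plugin**: Recommended to combine with high-performance inference plugins to further improve inference speed. See [High-Performance Inference Guide](../../deployment/high_performance_inference.md) for details.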