update readme (#15857)

* update readme
* Update docs
* Add example
* Fix comma splice

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
2025-06-26 21:24:27 +00:00 · 2025-06-25 04:11:23 -04:00
commit a922845f7e
parent da3d8b257a
2 changed files with 56 additions and 5 deletions
--- a/docs/version3.x/deployment/mcp_server.en.md
+++ b/docs/version3.x/deployment/mcp_server.en.md
@@ -197,9 +197,34 @@ Below are complete Claude for Desktop configuration examples for different worki
 **Note**:

 - `PADDLEOCR_MCP_PIPELINE_CONFIG` is optional. If not set, the default pipeline configuration is used. To adjust settings, such as changing models, refer to the [PaddleOCR and PaddleX documentation](../paddleocr_and_paddlex.en.md), export a pipeline configuration file, and set `PADDLEOCR_MCP_PIPELINE_CONFIG` to its absolute path.
- **CPU Inference Performance Tip**:
-  - **OCR Pipeline**: If you are running in a CPU environment, it is recommended to switch to the `mobile` series models for better performance. You can change the detection and recognition models in your pipeline configuration file to `text_detection_model_name="PP-OCRv5_mobile_det"` and `text_recognition_model_name="PP-OCRv5_mobile_rec"` respectively.
-  - **PP-StructureV3 Pipeline**: Due to its model complexity, using this pipeline in an environment without a GPU is not recommended.
+- **CPU Inference Performance Tips:**
+  - **OCR Pipeline:** The default models used are relatively complex. If you want to improve inference speed and reduce memory usage, it is recommended to switch to the `mobile` series models. For example, you can modify the detection and recognition models in the pipeline configuration file to `PP-OCRv5_mobile_det` and `PP-OCRv5_mobile_rec`, respectively.
+  - **PP-StructureV3 Pipeline:** Using the default configuration requires more computational resources. If you want to improve inference speed and reduce memory consumption, please consider the following suggestions:
+    - Disable features you do not need. For example, set `use_formula_recognition` to `False` to disable formula recognition.
+    - Use lightweight models, such as replacing the OCR model with a `mobile` version, or using a lightweight formula recognition model like PP-FormulaNet-S.
+
+    The following example code can be used to obtain a pipeline configuration file, in which most optional features of the PP-StructureV3 pipeline are disabled, while some key models are replaced with lightweight versions.
+
+    ```python
+    from paddleocr import PPStructureV3
+
+    pipeline = PPStructureV3(
+        use_doc_orientation_classify=False, # Disable document image orientation classification
+        use_doc_unwarping=False,            # Disable text image unwarping
+        use_textline_orientation=False,     # Disable text line orientation classification
+        use_formula_recognition=False,      # Disable formula recognition
+        use_seal_recognition=False,         # Disable seal text recognition
+        use_table_recognition=False,        # Disable table recognition
+        use_chart_recognition=False,        # Disable chart parsing
+        # Use lightweight models
+        text_detection_model_name="PP-OCRv5_mobile_det",
+        text_recognition_model_name="PP-OCRv5_mobile_rec",
+        layout_detection_model_name="PP-DocLayout-S",
+    )
+
+    # The configuration file is saved to `PP-StructureV3.yaml`
+    pipeline.export_paddlex_config_to_yaml("PP-StructureV3.yaml")
+    ```
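
Once the configuration file has been exported, the note above says to set `PADDLEOCR_MCP_PIPELINE_CONFIG` to its absolute path. The following is a minimal sketch of that step using only the Python standard library. The helper name `build_mcp_env`, the `paddleocr` server key, and the `paddleocr_mcp` command are assumptions made for illustration; check them against the configuration examples earlier in this document before use.

```python
import json
import os


def build_mcp_env(config_file: str) -> dict:
    """Return an env mapping that points the MCP server at an exported pipeline config.

    `build_mcp_env` is a hypothetical helper, not part of PaddleOCR.
    """
    # The docs ask for an absolute path, so resolve relative paths against the CWD.
    return {"PADDLEOCR_MCP_PIPELINE_CONFIG": os.path.abspath(config_file)}


env = build_mcp_env("PP-StructureV3.yaml")

# Assumed shape of a Claude for Desktop server entry; the "paddleocr" key and
# the "paddleocr_mcp" command are placeholders for whatever your setup uses.
server_entry = {
    "mcpServers": {
        "paddleocr": {"command": "paddleocr_mcp", "args": [], "env": env}
    }
}

print(json.dumps(server_entry, indent=2))
```

The printed JSON can be merged by hand into an existing configuration. Any other environment variables your working mode requires still need to be added alongside this one.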

 ### 5.3 Self-hosted Service Configuration

--- a/docs/version3.x/deployment/mcp_server.md
+++ b/docs/version3.x/deployment/mcp_server.md
@@ -198,8 +198,34 @@ ImportError: failed to find libmagic.  Check your installation

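 - `PADDLEOCR_MCP_PIPELINE_CONFIG` is optional; if it is not set, the pipeline's default configuration is used. To adjust the configuration, for example to change models, refer to the [PaddleOCR documentation](../paddleocr_and_paddlex.md) to export a pipeline configuration file, then set `PADDLEOCR_MCP_PIPELINE_CONFIG` to the absolute path of that file.
 - **CPU Inference Performance Tips**:
-  - **OCR Pipeline**: If you are running in a CPU environment, switching to the `mobile` series models is recommended for better performance. You can change the detection and recognition models in the pipeline configuration file to `text_detection_model_name="PP-OCRv5_mobile_det"` and `text_recognition_model_name="PP-OCRv5_mobile_rec"`, respectively.
-  - **PP-StructureV3 Pipeline**: Because of its high model complexity, using this pipeline in an environment without a GPU is not recommended.
+  - **OCR Pipeline**: The default models are relatively complex. If you want to improve pipeline inference speed and reduce memory consumption, switching to the `mobile` series models is recommended. For example, you can change the detection and recognition models in the pipeline configuration file to `PP-OCRv5_mobile_det` and `PP-OCRv5_mobile_rec`, respectively.
+  - **PP-StructureV3 Pipeline**: The default configuration consumes considerable computational resources. If you want to improve pipeline inference speed and reduce memory consumption, adjust the configuration according to the following suggestions:
+    
+    - Disable features you do not need. For example, set `use_formula_recognition` to `False` to disable formula recognition.
+    - Use lightweight models. For example, replace the OCR models with their `mobile` versions, or switch to the lightweight formula recognition model PP-FormulaNet-S.
+
+    The following example code can be used to obtain a pipeline configuration file in which most optional features of the PP-StructureV3 pipeline are disabled and some key models are replaced with lightweight versions.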
+
+    ```python
+    from paddleocr import PPStructureV3
+
+    pipeline = PPStructureV3(
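+        use_doc_orientation_classify=False, # Disable document image orientation classification
+        use_doc_unwarping=False,            # Disable text image unwarping
+        use_textline_orientation=False,     # Disable text line orientation classification
+        use_formula_recognition=False,      # Disable formula recognition
+        use_seal_recognition=False,         # Disable seal text recognition
+        use_table_recognition=False,        # Disable table recognition
+        use_chart_recognition=False,        # Disable chart parsing
+        # Use lightweight models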
+        text_detection_model_name="PP-OCRv5_mobile_det",
+        text_recognition_model_name="PP-OCRv5_mobile_rec",
+        layout_detection_model_name="PP-DocLayout-S",
+    )
+
+    # The configuration file is saved to `PP-StructureV3.yaml`
+    pipeline.export_paddlex_config_to_yaml("PP-StructureV3.yaml")
+    ```

 ### 5.3 Self-hosted Service Configuration