mirror of
https://github.com/PaddlePaddle/PaddleOCR.git
synced 2025-06-26 21:24:27 +00:00
update readme (#15857)
* update readme
* Update docs
* Add example
* Fix comma splice

---------

Co-authored-by: Bobholamovic <mhlin425@whu.edu.cn>
parent da3d8b257a
commit a922845f7e
@@ -197,9 +197,34 @@ Below are complete Claude for Desktop configuration examples for different worki
**Note**:

- `PADDLEOCR_MCP_PIPELINE_CONFIG` is optional. If not set, the default pipeline configuration is used. To adjust settings, such as changing models, refer to the [PaddleOCR and PaddleX documentation](../paddleocr_and_paddlex.en.md), export a pipeline configuration file, and set `PADDLEOCR_MCP_PIPELINE_CONFIG` to its absolute path.
- **CPU Inference Performance Tip**:
  - **OCR Pipeline**: If you are running in a CPU environment, it is recommended to switch to the `mobile` series models for better performance. You can change the detection and recognition models in your pipeline configuration file to `text_detection_model_name="PP-OCRv5_mobile_det"` and `text_recognition_model_name="PP-OCRv5_mobile_rec"` respectively.
  - **PP-StructureV3 Pipeline**: Due to its model complexity, using this pipeline in an environment without a GPU is not recommended.
- **CPU Inference Performance Tips:**
  - **OCR Pipeline:** The default models used are relatively complex. If you want to improve inference speed and reduce memory usage, it is recommended to switch to the `mobile` series models. For example, you can modify the detection and recognition models in the pipeline configuration file to `PP-OCRv5_mobile_det` and `PP-OCRv5_mobile_rec`, respectively.
  - **PP-StructureV3 Pipeline:** Using the default configuration requires more computational resources. If you want to improve inference speed and reduce memory consumption, please consider the following suggestions:
    - Disable features you do not need. For example, set `use_formula_recognition` to `False` to disable formula recognition.
    - Use lightweight models, such as replacing the OCR model with a `mobile` version, or using a lightweight formula recognition model like PP-FormulaNet-S.

  The following example code can be used to obtain a pipeline configuration file, in which most optional features of the PP-StructureV3 pipeline are disabled, while some key models are replaced with lightweight versions.

```python
from paddleocr import PPStructureV3

pipeline = PPStructureV3(
    use_doc_orientation_classify=False,  # Disable document image orientation classification
    use_doc_unwarping=False,  # Disable text image unwarping
    use_textline_orientation=False,  # Disable text line orientation classification
    use_formula_recognition=False,  # Disable formula recognition
    use_seal_recognition=False,  # Disable seal text recognition
    use_table_recognition=False,  # Disable table recognition
    use_chart_recognition=False,  # Disable chart parsing
    # Use lightweight models
    text_detection_model_name="PP-OCRv5_mobile_det",
    text_recognition_model_name="PP-OCRv5_mobile_rec",
    layout_detection_model_name="PP-DocLayout-S",
)

# The configuration file is saved to `PP-StructureV3.yaml`
pipeline.export_paddlex_config_to_yaml("PP-StructureV3.yaml")
```
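
Because `PADDLEOCR_MCP_PIPELINE_CONFIG` must hold an absolute path, it can help to resolve the exported file's location before pasting it into the client configuration. The following is a minimal sketch that uses only the Python standard library. The helper `build_server_entry`, the launcher name `paddleocr_mcp`, and the server key `paddleocr` are illustrative assumptions, not values confirmed by this change; only the environment variable name comes from the note above.

```python
import json
from pathlib import Path


def build_server_entry(config_file: str) -> dict:
    """Build a hypothetical MCP server entry pointing at an exported pipeline config."""
    # The note above requires an absolute path, so resolve the file name first.
    absolute_path = Path(config_file).expanduser().resolve()
    return {
        "command": "paddleocr_mcp",  # assumed launcher name; adjust to your setup
        "env": {
            "PADDLEOCR_MCP_PIPELINE_CONFIG": str(absolute_path),
        },
    }


entry = build_server_entry("PP-StructureV3.yaml")
# "paddleocr" is an illustrative server name, not a required key.
print(json.dumps({"mcpServers": {"paddleocr": entry}}, indent=2))
```

The printed JSON is a fragment only; merge the `env` entry into whichever complete configuration example matches your working mode.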

### 5.3 Self-hosted Service Configuration

@@ -198,8 +198,34 @@ ImportError: failed to find libmagic. Check your installation
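
- `PADDLEOCR_MCP_PIPELINE_CONFIG` is optional. When it is not set, the pipeline's default configuration is used. To adjust the configuration, for example to change models, refer to the [PaddleOCR documentation](../paddleocr_and_paddlex.md) to export a pipeline configuration file, and set `PADDLEOCR_MCP_PIPELINE_CONFIG` to the absolute path of that file.
- **CPU Inference Performance Tips**:
  - **OCR Pipeline**: If you are running in a CPU environment, switching to the `mobile` series models is recommended for better performance. In the pipeline configuration file, you can change the detection and recognition models to `text_detection_model_name="PP-OCRv5_mobile_det"` and `text_recognition_model_name="PP-OCRv5_mobile_rec"`, respectively.
  - **PP-StructureV3 Pipeline**: Because of its high model complexity, using this pipeline in an environment without a GPU is not recommended.
  - **OCR Pipeline**: The default models are relatively complex. If you want to improve pipeline inference speed and reduce memory consumption, switching to the `mobile` series models is recommended. For example, in the pipeline configuration file you can change the detection and recognition models to `PP-OCRv5_mobile_det` and `PP-OCRv5_mobile_rec`, respectively.
  - **PP-StructureV3 Pipeline**: The default configuration consumes considerable computational resources. If you want to improve pipeline inference speed and reduce memory consumption, consider adjusting the configuration as follows:

    - Disable features you do not need. For example, set `use_formula_recognition` to `False` to disable formula recognition.
    - Use lightweight models. For example, replace the OCR models with their `mobile` versions, or switch to the lightweight formula recognition model PP-FormulaNet-S.

  The following example code can be used to obtain a pipeline configuration file in which most optional features of the PP-StructureV3 pipeline are disabled and some key models are replaced with lightweight versions.
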
```python
from paddleocr import PPStructureV3

pipeline = PPStructureV3(
    use_doc_orientation_classify=False,  # Disable document image orientation classification
    use_doc_unwarping=False,  # Disable text image unwarping
    use_textline_orientation=False,  # Disable text line orientation classification
    use_formula_recognition=False,  # Disable formula recognition
    use_seal_recognition=False,  # Disable seal text recognition
    use_table_recognition=False,  # Disable table recognition
    use_chart_recognition=False,  # Disable chart parsing
    # Use lightweight models
    text_detection_model_name="PP-OCRv5_mobile_det",
    text_recognition_model_name="PP-OCRv5_mobile_rec",
    layout_detection_model_name="PP-DocLayout-S",
)

# The configuration file is saved to `PP-StructureV3.yaml`
pipeline.export_paddlex_config_to_yaml("PP-StructureV3.yaml")
```

### 5.3 Self-hosted Service Configuration
