From c3ba8d0c779c717278a221d0edb47b62339cbe7b Mon Sep 17 00:00:00 2001
From: Sunflower7788
Date: Tue, 20 May 2025 17:11:58 +0800
Subject: [PATCH] all_docs_review_final (#15221)

* all_docs_review_final

* all_docs_review_final
---
 .../module_usage/formula_recognition.en.md | 12 +++++-----
 .../module_usage/formula_recognition.md | 23 ++++++++++---------
 .../module_usage/seal_text_detection.en.md | 6 +++--
 .../module_usage/seal_text_detection.md | 10 ++++----
 .../table_structure_recognition.en.md | 4 ++--
 .../table_structure_recognition.md | 10 ++++----
 .../module_usage/text_detection.en.md | 6 ++---
 .../version3.x/module_usage/text_detection.md | 12 +++++-----
 .../module_usage/text_recognition.en.md | 6 ++---
 .../module_usage/text_recognition.md | 18 +++++++--------
 .../pipeline_usage/doc_preprocessor.en.md | 2 +-
 .../pipeline_usage/doc_preprocessor.md | 2 +-
 12 files changed, 57 insertions(+), 54 deletions(-)

diff --git a/docs/version3.x/module_usage/formula_recognition.en.md b/docs/version3.x/module_usage/formula_recognition.en.md
index 4e8c29d63e..8cf5f613e8 100644
--- a/docs/version3.x/module_usage/formula_recognition.en.md
+++ b/docs/version3.x/module_usage/formula_recognition.en.md
@@ -336,7 +336,7 @@ You can choose either method based on your actual needs. The `predict()` method
 If the models above do not perform well in your scenario, you can try the following steps for custom development. Here we take training `PP-FormulaNet_plus-M` as an example. For other models, just replace the corresponding config file. First, you need to prepare a formula recognition dataset. You can follow the format of the [formula recognition demo data](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_dataset_example.tar). Once the data is ready, follow the steps below to train and export the model. After export, the model can be quickly integrated into the API described above. This example uses the demo dataset. Before training the model, please ensure you have installed all PaddleOCR dependencies as described in the [installation documentation](xxx).
-## 4.1 Environment Setup
+### 4.1 Environment Setup
 To train the formula recognition model, you need to install additional Python and Linux dependencies. Run the following commands:
@@ -346,16 +346,16 @@ sudo apt-get install libmagickwand-dev
 pip install tokenizers==0.19.1 imagesize ftfy Wand
 ```
-## 4.2 Dataset and Pretrained Model Preparation
+### 4.2 Dataset and Pretrained Model Preparation
-### 4.2.1 Prepare the Dataset
+#### 4.2.1 Prepare the Dataset
 ```shell
 # Download the demo dataset
 wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_dataset_example.tar
 tar -xf ocr_rec_latexocr_dataset_example.tar
 ```
-### 4.2.2 Download the Pretrained Model
+#### 4.2.2 Download the Pretrained Model
 ```shell
 # Download the PP-FormulaNet_plus-M pre-trained model
 wget https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_m_train.tar
@@ -394,14 +394,14 @@ You can evaluate trained weights, e.g., output/xxx/xxx.pdparams, or use the down
 # Demo test set evaluation
 python3 tools/eval.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
 Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams
- ```
+
 ### 4.5 Model Export
 ```bash
 python3 tools/export_model.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
 Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams \
 Global.save_inference_dir="./PP-FormulaNet_plus-M_infer/"
- ```
+```
 After exporting, the static graph model will be saved in `./PP-FormulaNet_plus-M_infer/`, and you will see the following files:
 ```
diff --git a/docs/version3.x/module_usage/formula_recognition.md b/docs/version3.x/module_usage/formula_recognition.md
index 2409bc57d2..b54950f426 100644
--- a/docs/version3.x/module_usage/formula_recognition.md
+++ b/docs/version3.x/module_usage/formula_recognition.md
@@ -344,7 +344,7 @@ sudo apt-get install texlive texlive-latex-base texlive-xetex latex-cjk-all texl
 ## 四、二次开发
 如果以上模型在您的场景下效果仍然不理想,您可以尝试以下步骤进行二次开发,此处以训练 `PP-FormulaNet-S` 举例,其他模型替换对应配置文件即可。首先,您需要准备公式识别的数据集,可以参考[公式识别 Demo 数据](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_dataset_example.tar)的格式准备,准备好后,即可按照以下步骤进行模型训练和导出,导出后,可以将模型快速集成到上述API中。此处以公式识别 Demo 数据示例。在训练模型之前,请确保已经按照[安装文档](../installation.md)安装了 PaddleOCR 所需要的依赖。
-## 4.1 环境配置
+### 4.1 环境配置
 训练公式识别模型需要安装额外的Python依赖和linux依赖,执行如下命令安装:
 ```shell
@@ -353,9 +353,9 @@ sudo apt-get install libmagickwand-dev
 pip install tokenizers==0.19.1 imagesize ftfy Wand
 ```
-## 4.2 数据集、预训练模型准备
+### 4.2 数据集、预训练模型准备
-### 4.2.1 准备数据集
+#### 4.2.1 准备数据集
 ```shell
 # 下载示例数据集
@@ -363,7 +363,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_da
 tar -xf ocr_rec_latexocr_dataset_example.tar
 ```
-### 4.2.2 下载预训练模型
+#### 4.2.2 下载预训练模型
 ```shell
 # 下载 PP-FormulaNet_plus-M 预训练模型
@@ -401,19 +401,20 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c config
 您可以评估已经训练好的权重,如,`output/xxx/xxx.pdparams`,也可以使用已经下载的[模型文件](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_s_train.tar),使用如下命令进行评估:
 ```bash
-# 注意将pretrained_model的路径设置为本地路径。若使用自行训练保存的模型,请注意修改路径和文件名为{path/to/weights}/{model_name}。
- # demo 测试集评估
+
+#注意将pretrained_model的路径设置为本地路径。若使用自行训练保存的模型,请注意修改路径和文件名为{path/to/weights}/{model_name}。
+#demo 测试集评估
 python3 tools/eval.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
 Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams
- ```
+```
 ### 4.5 模型导出
 ```bash
- python3 tools/export_model.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
- Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams \
- Global.save_inference_dir="./PP-FormulaNet_plus-M_infer/"
- ```
+python3 tools/export_model.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
+Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams \
+Global.save_inference_dir="./PP-FormulaNet_plus-M_infer/"
+```
 导出模型后,静态图模型会存放于当前目录的`./PP-FormulaNet_plus-M_infer/`中,在该目录下,您将看到如下文件:
 ```
diff --git a/docs/version3.x/module_usage/seal_text_detection.en.md b/docs/version3.x/module_usage/seal_text_detection.en.md
index 057afc5f3a..2aafd7a885 100644
--- a/docs/version3.x/module_usage/seal_text_detection.en.md
+++ b/docs/version3.x/module_usage/seal_text_detection.en.md
@@ -457,13 +457,15 @@ If the above model is still not performing well in your scenario, you can try th
 ### 4.1 Dataset and Pre-trained Model Preparation
-### 4.1.1 Preparing the Dataset
+#### 4.1.1 Preparing the Dataset
 ```shell
 wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_curve_det_dataset_examples.tar -P ./dataset
 tar -xf ./dataset/ocr_curve_det_dataset_examples.tar -C ./dataset/
 ```
-### 4.1.2 Preparing the pre-trained model
+
+#### 4.1.1 Preparing the pre-trained model
+
 ```shell
 wget https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv4_server_seal_det_pretrained.pdparams
diff --git a/docs/version3.x/module_usage/seal_text_detection.md b/docs/version3.x/module_usage/seal_text_detection.md
index 22caf89af2..af9c7fb629 100644
--- a/docs/version3.x/module_usage/seal_text_detection.md
+++ b/docs/version3.x/module_usage/seal_text_detection.md
@@ -451,9 +451,9 @@ for res in output:
 如果以上模型在您的场景上效果仍然不理想,您可以尝试以下步骤进行二次开发,此处以训练 `PP-OCRv4_server_seal_det` 举例,其他模型替换对应配置文件即可。首先,您需要准备文本检测的数据集,可以参考[印章文本检测 Demo 数据](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_curve_det_dataset_examples.tar)的格式准备,准备好后,即可按照以下步骤进行模型训练和导出,导出后,可以将模型快速集成到上述 API 中。此处以印章文本检测 Demo 数据示例。在训练模型之前,请确保已经按照[安装文档](../installation.md)安装了 PaddleOCR 所需要的依赖。
-## 4.1 数据集、预训练模型准备
+### 4.1 数据集、预训练模型准备
-### 4.1.1 准备数据集
+#### 4.1.1 准备数据集
 ```shell
 # 下载示例数据集
@@ -461,7 +461,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_curve_det_datas
 tar -xf ./dataset/ocr_curve_det_dataset_examples.tar -C ./dataset/
 ```
-### 4.1.2 下载预训练模型
+#### 4.1.2 下载预训练模型
 ```shell
 # 下载 PP-OCRv4_server_seal_det 预训练模型
@@ -495,7 +495,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
 # demo 测试集评估
 python3 tools/eval.py -c configs/det/PP-OCRv4/PP-OCRv4_server_seal_det.yml -o \
 Global.pretrained_model=output/xxx/xxx.pdparams
- ```
+```
 ### 4.4 模型导出
@@ -503,7 +503,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
 python3 tools/export_model.py -c configs/det/PP-OCRv4/PP-OCRv4_server_seal_det.yml -o \
 Global.pretrained_model=output/xxx/xxx.pdparams \
 save_inference_dir="./PP-OCRv4_server_seal_det_infer/"
- ```
+```
 导出模型后,静态图模型会存放于当前目录的`./PP-OCRv4_server_seal_det_infer/`中,在该目录下,您将看到如下文件:
 ```
diff --git a/docs/version3.x/module_usage/table_structure_recognition.en.md b/docs/version3.x/module_usage/table_structure_recognition.en.md
index b1fa16c1e6..040551ec9d 100644
--- a/docs/version3.x/module_usage/table_structure_recognition.en.md
+++ b/docs/version3.x/module_usage/table_structure_recognition.en.md
@@ -336,7 +336,7 @@ You can evaluate the trained weights, such as `output/xxx/xxx.pdparams`, using t
 # Demo test set evaluation
 python3 tools/eval.py -c configs/table/SLANet.yml -o \
 Global.pretrained_model=output/xxx/xxx.pdparams
- ```
+```
 ### 4.4 Model Export
@@ -344,7 +344,7 @@ You can evaluate the trained weights, such as `output/xxx/xxx.pdparams`, using t
 python3 tools/export_model.py -c configs/table/SLANet.yml -o \
 Global.pretrained_model=output/xxx/xxx.pdparams \
 save_inference_dir="./SLANet_infer/"
- ```
+```
 After exporting the model, the static graph model will be stored in `./SLANet_infer/` in the current directory. In this directory, you will see the following files:
 ```
diff --git a/docs/version3.x/module_usage/table_structure_recognition.md b/docs/version3.x/module_usage/table_structure_recognition.md
index e8f483e716..153eae189e 100644
--- a/docs/version3.x/module_usage/table_structure_recognition.md
+++ b/docs/version3.x/module_usage/table_structure_recognition.md
@@ -293,9 +293,9 @@ for res in output:
 如果以上模型在您的场景上效果仍然不理想,您可以尝试以下步骤进行二次开发,此处以训练 `SLANet` 举例,其他模型替换对应配置文件即可。首先,您需要准备表格结构识别的数据集,可以参考[表格结构识别 Demo 数据](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/table_rec_dataset_examples.tar)的格式准备,准备好后,即可按照以下步骤进行模型训练和导出,导出后,可以将模型快速集成到上述 API 中。此处以表格结构识别 Demo 数据示例。在训练模型之前,请确保已经按照[[安装文档](../installation.md)安装了 PaddleOCR 所需要的依赖。
-## 4.1 数据集、预训练模型准备
+### 4.1 数据集、预训练模型准备
-### 4.1.1 准备数据集
+#### 4.1.1 准备数据集
 ```shell
 # 下载示例数据集
@@ -303,7 +303,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/table_rec_dataset_e
 tar -xf table_rec_dataset_examples.tar
 ```
-### 4.1.2 下载预训练模型
+#### 4.1.2 下载预训练模型
 ```shell
 # 下载 SLANet 预训练模型
@@ -336,7 +336,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
 # demo 测试集评估
 python3 tools/eval.py -c configs/table/SLANet.yml -o \
 Global.pretrained_model=output/xxx/xxx.pdparams
- ```
+```
 ### 4.4 模型导出
@@ -344,7 +344,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
 python3 tools/export_model.py -c configs/table/SLANet.yml -o \
 Global.pretrained_model=output/xxx/xxx.pdparams \
 save_inference_dir="./SLANet_infer/"
- ```
+```
 导出模型后,静态图模型会存放于当前目录的`./SLANet_infer/`中,在该目录下,您将看到如下文件:
 ```
diff --git a/docs/version3.x/module_usage/text_detection.en.md b/docs/version3.x/module_usage/text_detection.en.md
index 753b5d956b..e5928e757c 100644
--- a/docs/version3.x/module_usage/text_detection.en.md
+++ b/docs/version3.x/module_usage/text_detection.en.md
@@ -376,9 +376,9 @@ Method and parameter descriptions:
 If the above models do not meet your requirements, follow these steps for custom development (using `PP-OCRv5_server_det` as an example). First, prepare a text detection dataset (refer to the [Demo Dataset](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_examples.tar) format). After preparation, proceed with model training and export. The exported model can be integrated into the API. Ensure PaddleOCR dependencies are installed as per the [Installation Guide](../installation.en.md).
-## 4.1 Dataset and Pretrained Model Preparation
+### 4.1 Dataset and Pretrained Model Preparation
-### 4.1.1 Prepare Dataset
+#### 4.1.1 Prepare Dataset
 ```shell
 # Download example dataset
@@ -386,7 +386,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_exa
 tar -xf ocr_det_dataset_examples.tar
 ```
-### 4.1.2 Download Pretrained Model
+#### 4.1.2 Download Pretrained Model
 ```shell
 # Download PP-OCRv5_server_det pretrained model
diff --git a/docs/version3.x/module_usage/text_detection.md b/docs/version3.x/module_usage/text_detection.md
index aa756b0e1a..276658c567 100644
--- a/docs/version3.x/module_usage/text_detection.md
+++ b/docs/version3.x/module_usage/text_detection.md
@@ -102,7 +102,7 @@ comments: true
 ## 三、快速开始
-> ❗ 在快速开始前,请先安装 PaddleOCR 的 wheel 包,详细请参考 [安装教程](..installation.md)。
+> ❗ 在快速开始前,请先安装 PaddleOCR 的 wheel 包,详细请参考 [安装教程](../installation.md)。
 使用一行命令即可快速体验:
@@ -433,9 +433,9 @@ for res in output:
 如果以上模型在您的场景上效果仍然不理想,您可以尝试以下步骤进行二次开发,此处以训练 `PP-OCRv5_server_det` 举例,其他模型替换对应配置文件即可。首先,您需要准备文本检测的数据集,可以参考[文本检测 Demo 数据](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_examples.tar)的格式准备,准备好后,即可按照以下步骤进行模型训练和导出,导出后,可以将模型快速集成到上述 API 中。此处以文本检测 Demo 数据示例。在训练模型之前,请确保已经按照[安装文档](../installation.md)安装了 PaddleOCR 所需要的依赖。
-## 4.1 数据集、预训练模型准备
+### 4.1 数据集、预训练模型准备
-### 4.1.1 准备数据集
+#### 4.1.1 准备数据集
 ```shell
 # 下载示例数据集
@@ -443,7 +443,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_exa
 tar -xf ocr_det_dataset_examples.tar
 ```
-### 4.1.2 下载预训练模型
+#### 4.1.2 下载预训练模型
 ```shell
 # 下载 PP-OCRv5_server_det 预训练模型
@@ -487,7 +487,7 @@ python3 tools/eval.py -c configs/det/PP-OCRv5/PP-OCRv5_server_det.yml \
 -o Global.pretrained_model=output/PP-OCRv5_server_det/best_accuracy.pdparams \
 Eval.dataset.data_dir=./ocr_det_dataset_examples \
 Eval.dataset.label_file_list=[./ocr_det_dataset_examples/val.txt]
- ```
+```
 ### 4.4 模型导出
@@ -495,7 +495,7 @@ python3 tools/eval.py -c configs/det/PP-OCRv5/PP-OCRv5_server_det.yml \
 python3 tools/export_model.py -c configs/det/PP-OCRv5/PP-OCRv5_server_det.yml -o \
 Global.pretrained_model=output/PP-OCRv5_server_det/best_accuracy.pdparams \
 Global.save_inference_dir="./PP-OCRv5_server_det_infer/"
- ```
+```
 导出模型后,静态图模型会存放于当前目录的`./PP-OCRv5_server_det_infer/`中,在该目录下,您将看到如下文件:
 ```
diff --git a/docs/version3.x/module_usage/text_recognition.en.md b/docs/version3.x/module_usage/text_recognition.en.md
index 2e96d5a21f..680ad04e11 100644
--- a/docs/version3.x/module_usage/text_recognition.en.md
+++ b/docs/version3.x/module_usage/text_recognition.en.md
@@ -611,9 +611,9 @@ The descriptions of relevant methods and parameters are as follows:
 If the performance of the above models does not meet your requirements in your specific scenario, you can follow the steps below for secondary development. Here, we use the training of `PP-OCRv5_server_rec` as an example; for other models, simply replace the corresponding configuration files. First, you need to prepare a dataset for text recognition. You can refer to the format of the [Text Recognition Demo Dataset](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_dataset_examples.tar) for preparation. Once prepared, you can proceed with model training and exporting as described below. After exporting, the model can be quickly integrated into the aforementioned API. This example uses the Text Recognition Demo Dataset. Before training the model, ensure that you have installed the dependencies required by PaddleOCR as per the [Installation Guide](../installation.md).
-## 4.1 Dataset and Pre-trained Model Preparation
+### 4.1 Dataset and Pre-trained Model Preparation
-### 4.1.1 Prepare the Dataset
+#### 4.1.1 Prepare the Dataset
 ```shell
 # Download the example dataset
@@ -621,7 +621,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_dataset_exa
 tar -xf ocr_rec_dataset_examples.tar
 ```
-### 4.1.2 Download the Pre-trained Model
+#### 4.1.2 Download the Pre-trained Model
 ```shell
 # Download the PP-OCRv5_server_rec pre-trained model
diff --git a/docs/version3.x/module_usage/text_recognition.md b/docs/version3.x/module_usage/text_recognition.md
index a425bc4d29..398f42b000 100644
--- a/docs/version3.x/module_usage/text_recognition.md
+++ b/docs/version3.x/module_usage/text_recognition.md
@@ -653,19 +653,19 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
 您可以评估已经训练好的权重,如,`output/xxx/xxx.pdparams`,使用如下命令进行评估:
 ```bash
-# 注意将pretrained_model的路径设置为本地路径。若使用自行训练保存的模型,请注意修改路径和文件名为{path/to/weights}/{model_name}。
- # demo 测试集评估
- python3 tools/eval.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
- Global.pretrained_model=output/xxx/xxx.pdparams
- ```
+#注意将pretrained_model的路径设置为本地路径。若使用自行训练保存的模型,请注意修改路径和文件名为{path/to/weights}/{model_name}。
+#demo 测试集评估
+python3 tools/eval.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
+Global.pretrained_model=output/xxx/xxx.pdparams
+```
 ### 4.4 模型导出
 ```bash
- python3 tools/export_model.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
- Global.pretrained_model=output/xxx/xxx.pdparams \
- Global.save_inference_dir="./PP-OCRv5_server_rec_infer/"
- ```
+python3 tools/export_model.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
+Global.pretrained_model=output/xxx/xxx.pdparams \
+Global.save_inference_dir="./PP-OCRv5_server_rec_infer/"
+```
 导出模型后,静态图模型会存放于当前目录的`./PP-OCRv5_server_rec_infer/`中,在该目录下,您将看到如下文件:
 ```
diff --git a/docs/version3.x/pipeline_usage/doc_preprocessor.en.md b/docs/version3.x/pipeline_usage/doc_preprocessor.en.md
index 8c808206ba..d2b4b9f5e1 100644
--- a/docs/version3.x/pipeline_usage/doc_preprocessor.en.md
+++ b/docs/version3.x/pipeline_usage/doc_preprocessor.en.md
@@ -776,7 +776,7 @@ for i, res in enumerate(result["docPreprocessingResults"]):
 If the default model weights provided by the document image preprocessing pipeline do not meet your accuracy or speed requirements in your specific scenario, you can attempt to further **fine-tune** the existing model using **your own domain-specific or application-specific data** to enhance the recognition performance of the document image preprocessing pipeline in your context.
-## 4.1 Model Fine-Tuning
+### 4.1 Model Fine-Tuning
 Since the document image preprocessing pipeline comprises multiple modules, any module could potentially contribute to suboptimal performance if the overall pipeline does not meet expectations. You can analyze images with poor recognition results to identify which module is causing the issue and then refer to the corresponding fine-tuning tutorial links in the table below to perform model fine-tuning.
diff --git a/docs/version3.x/pipeline_usage/doc_preprocessor.md b/docs/version3.x/pipeline_usage/doc_preprocessor.md
index 0c19fa09ef..be533c6778 100644
--- a/docs/version3.x/pipeline_usage/doc_preprocessor.md
+++ b/docs/version3.x/pipeline_usage/doc_preprocessor.md
@@ -784,7 +784,7 @@ for i, res in enumerate(result["docPreprocessingResults"]):
 如果文档图像预处理产线提供的默认模型权重在您的场景中,精度或速度不满意,您可以尝试利用您自己拥有的特定领域或应用场景的数据对现有模型进行进一步的微调,以提升文档图像预处理产线的在您的场景中的识别效果。
-## 4.1 模型微调
+### 4.1 模型微调
 由于文档图像预处理产线包含若干模块,模型产线的效果如果不及预期,可能来自于其中任何一个模块。您可以对识别效果差的图片进行分析,进而确定是哪个模块存在问题,并参考以下表格中对应的微调教程链接进行模型微调。