all_docs_review_final (#15221)

Sunflower7788 2025-05-20 17:11:58 +08:00 committed by GitHub
parent e3811e252e
commit c3ba8d0c77
12 changed files with 57 additions and 54 deletions

@ -336,7 +336,7 @@ You can choose either method based on your actual needs. The `predict()` method
If the models above do not perform well in your scenario, you can try the following steps for custom development.
Here we take training `PP-FormulaNet_plus-M` as an example. For other models, just replace the corresponding config file. First, you need to prepare a formula recognition dataset. You can follow the format of the [formula recognition demo data](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_dataset_example.tar). Once the data is ready, follow the steps below to train and export the model. After export, the model can be quickly integrated into the API described above. This example uses the demo dataset. Before training the model, please ensure you have installed all PaddleOCR dependencies as described in the [installation documentation](xxx).
## 4.1 Environment Setup
### 4.1 Environment Setup
To train the formula recognition model, you need to install additional Python and Linux dependencies. Run the following commands:
@ -346,16 +346,16 @@ sudo apt-get install libmagickwand-dev
pip install tokenizers==0.19.1 imagesize ftfy Wand
```
## 4.2 Dataset and Pretrained Model Preparation
### 4.2 Dataset and Pretrained Model Preparation
### 4.2.1 Prepare the Dataset
#### 4.2.1 Prepare the Dataset
```shell
# Download the demo dataset
wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_dataset_example.tar
tar -xf ocr_rec_latexocr_dataset_example.tar
```
### 4.2.2 Download the Pretrained Model
#### 4.2.2 Download the Pretrained Model
```shell
# Download the PP-FormulaNet_plus-M pre-trained model
wget https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_m_train.tar
@ -394,14 +394,14 @@ You can evaluate trained weights, e.g., output/xxx/xxx.pdparams, or use the down
# Demo test set evaluation
python3 tools/eval.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams
```
### 4.5 Model Export
```bash
python3 tools/export_model.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams \
Global.save_inference_dir="./PP-FormulaNet_plus-M_infer/"
```
```
After exporting, the static graph model will be saved in `./PP-FormulaNet_plus-M_infer/`, and you will see the following files:
```

@ -344,7 +344,7 @@ sudo apt-get install texlive texlive-latex-base texlive-xetex latex-cjk-all texl
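## 4. Custom Development
If the models above still do not perform well in your scenario, you can try the following steps for custom development. Here we take training `PP-FormulaNet-S` as an example; for other models, just replace the corresponding config file. First, you need to prepare a formula recognition dataset, following the format of the [formula recognition demo data](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_dataset_example.tar). Once the data is ready, follow the steps below to train and export the model. After export, the model can be quickly integrated into the API described above. This example uses the formula recognition demo data. Before training the model, please ensure you have installed the dependencies required by PaddleOCR as described in the [installation documentation](../installation.md).
## 4.1 Environment Setup
### 4.1 Environment Setup
To train the formula recognition model, you need to install additional Python and Linux dependencies. Run the following commands: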
```shell
@ -353,9 +353,9 @@ sudo apt-get install libmagickwand-dev
pip install tokenizers==0.19.1 imagesize ftfy Wand
```
## 4.2 Dataset and Pretrained Model Preparation
### 4.2 Dataset and Pretrained Model Preparation
### 4.2.1 Prepare the Dataset
#### 4.2.1 Prepare the Dataset
```shell
# Download the demo dataset
@ -363,7 +363,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_latexocr_da
tar -xf ocr_rec_latexocr_dataset_example.tar
```
### 4.2.2 Download the Pretrained Model
#### 4.2.2 Download the Pretrained Model
```shell
# Download the PP-FormulaNet_plus-M pretrained model
@ -401,19 +401,20 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c config
You can evaluate trained weights, e.g., `output/xxx/xxx.pdparams`, or use the downloaded [model file](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_s_train.tar), with the following command:
```bash
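# Note: set the pretrained_model path to a local path. If you use a model you trained and saved yourself, change the path and file name to {path/to/weights}/{model_name}.
# Demo test set evaluation
#Note: set the pretrained_model path to a local path. If you use a model you trained and saved yourself, change the path and file name to {path/to/weights}/{model_name}.
#Demo test set evaluation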
python3 tools/eval.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams
```
```
### 4.5 Model Export
```bash
python3 tools/export_model.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams \
Global.save_inference_dir="./PP-FormulaNet_plus-M_infer/"
```
python3 tools/export_model.py -c configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml -o \
Global.pretrained_model=./rec_ppformulanet_plus_m_train/best_accuracy.pdparams \
Global.save_inference_dir="./PP-FormulaNet_plus-M_infer/"
```
After exporting, the static graph model will be saved in `./PP-FormulaNet_plus-M_infer/` in the current directory, where you will see the following files:
```

@ -457,13 +457,15 @@ If the above model is still not performing well in your scenario, you can try th
### 4.1 Dataset and Pre-trained Model Preparation
### 4.1.1 Preparing the Dataset
#### 4.1.1 Preparing the Dataset
```shell
wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_curve_det_dataset_examples.tar -P ./dataset
tar -xf ./dataset/ocr_curve_det_dataset_examples.tar -C ./dataset/
```
### 4.1.2 Preparing the pre-trained model
#### 4.1.2 Preparing the pre-trained model
```shell
wget https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv4_server_seal_det_pretrained.pdparams

@ -451,9 +451,9 @@ for res in output:
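If the models above still do not perform well in your scenario, you can try the following steps for custom development. Here we take training `PP-OCRv4_server_seal_det` as an example; for other models, just replace the corresponding config file. First, you need to prepare a text detection dataset, following the format of the [seal text detection demo data](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_curve_det_dataset_examples.tar). Once the data is ready, follow the steps below to train and export the model. After export, the model can be quickly integrated into the API described above. This example uses the seal text detection demo data. Before training the model, please ensure you have installed the dependencies required by PaddleOCR as described in the [installation documentation](../installation.md).
## 4.1 Dataset and Pretrained Model Preparation
### 4.1 Dataset and Pretrained Model Preparation
### 4.1.1 Prepare the Dataset
#### 4.1.1 Prepare the Dataset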
```shell
# Download the example dataset
@ -461,7 +461,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_curve_det_datas
tar -xf ./dataset/ocr_curve_det_dataset_examples.tar -C ./dataset/
```
### 4.1.2 Download the Pretrained Model
#### 4.1.2 Download the Pretrained Model
```shell
# Download the PP-OCRv4_server_seal_det pretrained model
@ -495,7 +495,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
# Demo test set evaluation
python3 tools/eval.py -c configs/det/PP-OCRv4/PP-OCRv4_server_seal_det.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams
```
```
### 4.4 Model Export
@ -503,7 +503,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
python3 tools/export_model.py -c configs/det/PP-OCRv4/PP-OCRv4_server_seal_det.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams \
Global.save_inference_dir="./PP-OCRv4_server_seal_det_infer/"
```
```
After exporting, the static graph model will be saved in `./PP-OCRv4_server_seal_det_infer/` in the current directory, where you will see the following files:
```

@ -336,7 +336,7 @@ You can evaluate the trained weights, such as `output/xxx/xxx.pdparams`, using t
# Demo test set evaluation
python3 tools/eval.py -c configs/table/SLANet.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams
```
```
### 4.4 Model Export
@ -344,7 +344,7 @@ You can evaluate the trained weights, such as `output/xxx/xxx.pdparams`, using t
python3 tools/export_model.py -c configs/table/SLANet.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams \
Global.save_inference_dir="./SLANet_infer/"
```
```
After exporting the model, the static graph model will be stored in `./SLANet_infer/` in the current directory. In this directory, you will see the following files:
```

@ -293,9 +293,9 @@ for res in output:
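If the models above still do not perform well in your scenario, you can try the following steps for custom development. Here we take training `SLANet` as an example; for other models, just replace the corresponding config file. First, you need to prepare a table structure recognition dataset, following the format of the [table structure recognition demo data](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/table_rec_dataset_examples.tar). Once the data is ready, follow the steps below to train and export the model. After export, the model can be quickly integrated into the API described above. This example uses the table structure recognition demo data. Before training the model, please ensure you have installed the dependencies required by PaddleOCR as described in the [installation documentation](../installation.md).
## 4.1 Dataset and Pretrained Model Preparation
### 4.1 Dataset and Pretrained Model Preparation
### 4.1.1 Prepare the Dataset
#### 4.1.1 Prepare the Dataset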
```shell
# Download the example dataset
@ -303,7 +303,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/table_rec_dataset_e
tar -xf table_rec_dataset_examples.tar
```
### 4.1.2 Download the Pretrained Model
#### 4.1.2 Download the Pretrained Model
```shell
# Download the SLANet pretrained model
@ -336,7 +336,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
# Demo test set evaluation
python3 tools/eval.py -c configs/table/SLANet.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams
```
```
### 4.4 Model Export
@ -344,7 +344,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
python3 tools/export_model.py -c configs/table/SLANet.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams \
Global.save_inference_dir="./SLANet_infer/"
```
```
After exporting, the static graph model will be saved in `./SLANet_infer/` in the current directory, where you will see the following files:
```

@ -376,9 +376,9 @@ Method and parameter descriptions:
If the above models do not meet your requirements, follow these steps for custom development (using `PP-OCRv5_server_det` as an example). First, prepare a text detection dataset (refer to the [Demo Dataset](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_examples.tar) format). After preparation, proceed with model training and export. The exported model can be integrated into the API. Ensure PaddleOCR dependencies are installed as per the [Installation Guide](../installation.en.md).
## 4.1 Dataset and Pretrained Model Preparation
### 4.1 Dataset and Pretrained Model Preparation
### 4.1.1 Prepare Dataset
#### 4.1.1 Prepare Dataset
```shell
# Download example dataset
@ -386,7 +386,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_exa
tar -xf ocr_det_dataset_examples.tar
```
### 4.1.2 Download Pretrained Model
#### 4.1.2 Download Pretrained Model
```shell
# Download PP-OCRv5_server_det pretrained model

@ -102,7 +102,7 @@ comments: true
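## 3. Quick Start
> ❗ Before getting started, please install the PaddleOCR wheel package. For details, see the [installation guide](..installation.md).
> ❗ Before getting started, please install the PaddleOCR wheel package. For details, see the [installation guide](../installation.md).
Try it out quickly with a single command: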
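The link fix in the hunk above (`..installation.md` changed to `../installation.md`) is the kind of slip a small script can catch before review. The following is a minimal sketch and not part of this commit: the function name `find_malformed_parent_links` and the regular expression are illustrative assumptions, and the pattern only covers simple inline links without titles.

```python
import re

# Matches simple Markdown inline links: [text](target)
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def find_malformed_parent_links(markdown_text):
    """Return link targets that start with '..' but not with '../'.

    A target such as '..installation.md' usually means a slash is
    missing from a relative path. A bare '..' target is left alone.
    """
    bad = []
    for _text, target in LINK_RE.findall(markdown_text):
        if target == "..":
            continue
        if target.startswith("..") and not target.startswith("../"):
            bad.append(target)
    return bad


before = "see the [installation guide](..installation.md)"
after = "see the [installation guide](../installation.md)"
print(find_malformed_parent_links(before))  # ['..installation.md']
print(find_malformed_parent_links(after))   # []
```

Running such a check over each changed Markdown file would flag the old line and pass the new one; it does not verify that the link target actually exists.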
@ -433,9 +433,9 @@ for res in output:
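If the models above still do not perform well in your scenario, you can try the following steps for custom development. Here we take training `PP-OCRv5_server_det` as an example; for other models, just replace the corresponding config file. First, you need to prepare a text detection dataset, following the format of the [text detection demo data](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_examples.tar). Once the data is ready, follow the steps below to train and export the model. After export, the model can be quickly integrated into the API described above. This example uses the text detection demo data. Before training the model, please ensure you have installed the dependencies required by PaddleOCR as described in the [installation documentation](../installation.md).
## 4.1 Dataset and Pretrained Model Preparation
### 4.1 Dataset and Pretrained Model Preparation
### 4.1.1 Prepare the Dataset
#### 4.1.1 Prepare the Dataset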
```shell
# Download the example dataset
@ -443,7 +443,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_det_dataset_exa
tar -xf ocr_det_dataset_examples.tar
```
### 4.1.2 Download the Pretrained Model
#### 4.1.2 Download the Pretrained Model
```shell
# Download the PP-OCRv5_server_det pretrained model
@ -487,7 +487,7 @@ python3 tools/eval.py -c configs/det/PP-OCRv5/PP-OCRv5_server_det.yml \
-o Global.pretrained_model=output/PP-OCRv5_server_det/best_accuracy.pdparams \
Eval.dataset.data_dir=./ocr_det_dataset_examples \
Eval.dataset.label_file_list=[./ocr_det_dataset_examples/val.txt]
```
```
### 4.4 Model Export
@ -495,7 +495,7 @@ python3 tools/eval.py -c configs/det/PP-OCRv5/PP-OCRv5_server_det.yml \
python3 tools/export_model.py -c configs/det/PP-OCRv5/PP-OCRv5_server_det.yml -o \
Global.pretrained_model=output/PP-OCRv5_server_det/best_accuracy.pdparams \
Global.save_inference_dir="./PP-OCRv5_server_det_infer/"
```
```
After exporting, the static graph model will be saved in `./PP-OCRv5_server_det_infer/` in the current directory, where you will see the following files:
```

@ -611,9 +611,9 @@ The descriptions of relevant methods and parameters are as follows:
If the performance of the above models does not meet your requirements in your specific scenario, you can follow the steps below for secondary development. Here, we use the training of `PP-OCRv5_server_rec` as an example; for other models, simply replace the corresponding configuration files. First, you need to prepare a dataset for text recognition. You can refer to the format of the [Text Recognition Demo Dataset](https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_dataset_examples.tar) for preparation. Once prepared, you can proceed with model training and exporting as described below. After exporting, the model can be quickly integrated into the aforementioned API. This example uses the Text Recognition Demo Dataset. Before training the model, ensure that you have installed the dependencies required by PaddleOCR as per the [Installation Guide](../installation.md).
## 4.1 Dataset and Pre-trained Model Preparation
### 4.1 Dataset and Pre-trained Model Preparation
### 4.1.1 Prepare the Dataset
#### 4.1.1 Prepare the Dataset
```shell
# Download the example dataset
@ -621,7 +621,7 @@ wget https://paddle-model-ecology.bj.bcebos.com/paddlex/data/ocr_rec_dataset_exa
tar -xf ocr_rec_dataset_examples.tar
```
### 4.1.2 Download the Pre-trained Model
#### 4.1.2 Download the Pre-trained Model
```shell
# Download the PP-OCRv5_server_rec pre-trained model

@ -653,19 +653,19 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
You can evaluate trained weights, e.g., `output/xxx/xxx.pdparams`, with the following command:
```bash
# Note: set the pretrained_model path to a local path. If you use a model you trained and saved yourself, change the path and file name to {path/to/weights}/{model_name}.
# Demo test set evaluation
python3 tools/eval.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams
```
#Note: set the pretrained_model path to a local path. If you use a model you trained and saved yourself, change the path and file name to {path/to/weights}/{model_name}.
#Demo test set evaluation
python3 tools/eval.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams
```
### 4.4 Model Export
```bash
python3 tools/export_model.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams \
Global.save_inference_dir="./PP-OCRv5_server_rec_infer/"
```
python3 tools/export_model.py -c configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml -o \
Global.pretrained_model=output/xxx/xxx.pdparams \
Global.save_inference_dir="./PP-OCRv5_server_rec_infer/"
```
After exporting, the static graph model will be saved in `./PP-OCRv5_server_rec_infer/` in the current directory, where you will see the following files:
```

@ -776,7 +776,7 @@ for i, res in enumerate(result["docPreprocessingResults"]):
If the default model weights provided by the document image preprocessing pipeline do not meet your accuracy or speed requirements in your specific scenario, you can attempt to further **fine-tune** the existing model using **your own domain-specific or application-specific data** to enhance the recognition performance of the document image preprocessing pipeline in your context.
## 4.1 Model Fine-Tuning
### 4.1 Model Fine-Tuning
Since the document image preprocessing pipeline comprises multiple modules, any module could potentially contribute to suboptimal performance if the overall pipeline does not meet expectations. You can analyze images with poor recognition results to identify which module is causing the issue and then refer to the corresponding fine-tuning tutorial links in the table below to perform model fine-tuning.
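The heading change in this hunk (`## 4.1` demoted to `### 4.1`) follows the pattern seen across the commit: a heading numbered `4.1` sits at level 3 and one numbered `4.1.1` at level 4. The sketch below checks that pattern. It is an illustration only, not tooling from the repository: the function name `check_heading_levels` and the rule "level = number of numeric components + 1" are assumptions inferred from the changed lines, and the check does not skip fenced code blocks.

```python
import re

# An ATX heading whose title starts with a dotted number, e.g. "### 4.1 Title"
NUMBERED_HEADING_RE = re.compile(r"^(#{1,6})\s+(\d+(?:\.\d+)+)\s+\S")


def check_heading_levels(lines):
    """Return (line_number, heading, expected_level) for mismatched headings.

    Assumed convention: level = number of numeric components + 1,
    so "4.1" expects "###" and "4.1.1" expects "####".
    """
    mismatches = []
    for number, line in enumerate(lines, start=1):
        match = NUMBERED_HEADING_RE.match(line)
        if match is None:
            continue  # not a numbered heading
        hashes, dotted = match.groups()
        expected = dotted.count(".") + 2
        if len(hashes) != expected:
            mismatches.append((number, line.strip(), expected))
    return mismatches


sample = [
    "## 4.1 Model Fine-Tuning",     # old line: one level too shallow
    "### 4.1 Model Fine-Tuning",    # new line: matches the convention
    "### 4.1.1 Prepare Dataset",    # old line
    "#### 4.1.1 Prepare Dataset",   # new line
]
print(check_heading_levels(sample))
# [(1, '## 4.1 Model Fine-Tuning', 3), (3, '### 4.1.1 Prepare Dataset', 4)]
```

Only the pre-commit lines are reported, which matches the direction of the fixes in this diff.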

@ -784,7 +784,7 @@ for i, res in enumerate(result["docPreprocessingResults"]):
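If the default model weights provided by the document image preprocessing pipeline do not meet your accuracy or speed requirements in your scenario, you can try to further <b>fine-tune</b> the existing model using <b>your own domain-specific or application-specific data</b> to improve the recognition performance of the document image preprocessing pipeline in your scenario.
## 4.1 Model Fine-Tuning
### 4.1 Model Fine-Tuning
Since the document image preprocessing pipeline comprises several modules, unsatisfactory pipeline performance may originate from any one of them. You can analyze images with poor recognition results to determine which module is at fault, then refer to the corresponding fine-tuning tutorial links in the table below to fine-tune the model.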