# FCENet

- [1. Introduction](#1)
- [2. Environment](#2)
- [3. Model Training / Evaluation / Prediction](#3)
    - [3.1 Training](#3-1)
    - [3.2 Evaluation](#3-2)
    - [3.3 Prediction](#3-3)
- [4. Inference and Deployment](#4)
    - [4.1 Python Inference](#4-1)
    - [4.2 C++ Inference](#4-2)
    - [4.3 Serving](#4-3)
    - [4.4 More](#4-4)
- [5. FAQ](#5)

<a name="1"></a>
## 1. Introduction

Paper:
> [Fourier Contour Embedding for Arbitrary-Shaped Text Detection](https://arxiv.org/abs/2104.10442)
> Yiqin Zhu and Jianyong Chen and Lingyu Liang and Zhanghui Kuang and Lianwen Jin and Wayne Zhang
> CVPR, 2021

On the CTW1500 dataset, the text detection results are as follows:

|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
| --- | --- | --- | --- | --- | --- | --- |
| FCE | ResNet50_dcn | [configs/det/det_r50_vd_dcn_fce_ctw.yml](../../configs/det/det_r50_vd_dcn_fce_ctw.yml)| 88.39%|82.18%|85.27%|[trained model](https://paddleocr.bj.bcebos.com/contribution/det_r50_dcn_fce_ctw_v2.0_train.tar)|

<a name="2"></a>
## 2. Environment

Please prepare your environment by following [prepare the environment](./environment_en.md) and [clone the repo](./clone_en.md).

<a name="3"></a>
## 3. Model Training / Evaluation / Prediction

The FCE model above was trained on the public CTW1500 text detection dataset. To download the dataset, see [ocr_datasets](./dataset/ocr_datasets_en.md).

Once the data is downloaded, follow the [Text Detection Training Tutorial](./detection.md) for training. PaddleOCR's code is modular, so you only need to **replace the configuration file** to train a different detection model.

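As a rough sketch of what "replacing the configuration file" looks like in practice, the commands below pass the FCE config to the generic training, evaluation, and prediction scripts. The checkpoint path in `CKPT` is an assumption (check `Global.save_model_dir` in the config for the real location), and the `RUN` variable is a hypothetical dry-run switch added only for this example: by default the commands are printed rather than executed.

```shell
# Sketch only: run from the PaddleOCR repository root.
# RUN defaults to "echo", so each command is printed instead of executed.
# Invoke with RUN= (empty) to actually run the commands.
RUN=${RUN-echo}

# FCE configuration file named in the results table above.
CONFIG=configs/det/det_r50_vd_dcn_fce_ctw.yml

# Assumption: where training writes its best checkpoint.
# Check Global.save_model_dir in the config before relying on this path.
CKPT=./output/det_r50_dcn_fce_ctw/best_accuracy

# Training: the config file is the only model-specific argument.
$RUN python3 tools/train.py -c "$CONFIG"

# Evaluation of a saved checkpoint.
$RUN python3 tools/eval.py -c "$CONFIG" -o Global.checkpoints="$CKPT"

# Prediction on a single image with the training-format model.
$RUN python3 tools/infer_det.py -c "$CONFIG" -o Global.pretrained_model="$CKPT" Global.infer_img=./doc/imgs_en/img_10.jpg
```

Switching to another detection algorithm should then only require pointing `CONFIG` at a different file under `configs/det/`.
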
<a name="4"></a>
## 4. Inference and Deployment

<a name="4-1"></a>
### 4.1 Python Inference

First, convert the model saved during FCE text detection training into an inference model. Taking the model with the ResNet50_vd_dcn backbone trained on the CTW1500 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/contribution/det_r50_dcn_fce_ctw_v2.0_train.tar)), run the following command to convert it:

```shell
python3 tools/export_model.py -c configs/det/det_r50_vd_dcn_fce_ctw.yml -o Global.pretrained_model=./det_r50_dcn_fce_ctw_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_fce
```

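The export command above expects the trained weights under `./det_r50_dcn_fce_ctw_v2.0_train/`. A minimal sketch for fetching and unpacking them follows; it assumes `wget` and `tar` are available and that the archive unpacks into a directory named after the file, which is inferred from the path used in the export command rather than verified. `RUN` is a hypothetical dry-run switch for this example only.

```shell
# Sketch only: RUN defaults to "echo", so the commands are printed, not executed.
# Invoke with RUN= (empty) to actually download and unpack the archive.
RUN=${RUN-echo}

# Download link taken from the text above.
MODEL_URL=https://paddleocr.bj.bcebos.com/contribution/det_r50_dcn_fce_ctw_v2.0_train.tar

# Derive the archive file name and the expected directory name from the URL.
ARCHIVE=${MODEL_URL##*/}      # strip everything up to the last "/"
MODEL_DIR=./${ARCHIVE%.tar}   # assumption: the tar unpacks into this directory

$RUN wget "$MODEL_URL"
$RUN tar -xf "$ARCHIVE"

# Path to pass as Global.pretrained_model in the export command.
echo "pretrained model prefix: $MODEL_DIR/best_accuracy"
```

If the archive unpacks to a different directory, adjust `Global.pretrained_model` in the export command accordingly.
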
To detect non-curved text with the FCE inference model, set `--det_fce_box_type=quad` and run:

```shell
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_fce/" --det_algorithm="FCE" --det_fce_box_type=quad
```

The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with `det_res`.

To detect curved text, set `--det_fce_box_type=poly` instead:

```shell
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_fce/" --det_algorithm="FCE" --det_fce_box_type=poly
```

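Since the two commands differ only in the image path and the box type, they can be wrapped in a small helper. The function below is a hypothetical convenience wrapper, not part of PaddleOCR: `detect_fce`, its `poly` default, and the `RUN` dry-run switch are all assumptions made for this sketch. It rejects any box type other than the two values used above.

```shell
# Sketch only: RUN defaults to "echo", so commands are printed, not executed.
# Invoke with RUN= (empty) from the PaddleOCR repository root to run them.
RUN=${RUN-echo}

# Hypothetical helper: detect_fce <image-path> [quad|poly]
detect_fce() {
    image=$1
    box_type=${2:-poly}   # this helper's own default, chosen for the example

    # Accept only the two box types shown in this document.
    case "$box_type" in
        quad|poly) ;;
        *)
            echo "box type must be quad or poly, got: $box_type" >&2
            return 1
            ;;
    esac

    $RUN python3 tools/infer/predict_det.py \
        --image_dir="$image" \
        --det_model_dir="./inference/det_fce/" \
        --det_algorithm="FCE" \
        --det_fce_box_type="$box_type"
}

# Non-curved text (quadrilateral boxes), then curved text (polygons).
detect_fce ./doc/imgs_en/img_10.jpg quad
detect_fce ./doc/imgs_en/img623.jpg poly
```
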
As with the previous command, the visualized results are saved to the `./inference_results` folder by default, with file names prefixed with `det_res`.

**Note**: Because the CTW1500 dataset has only 1,000 training images, mainly of English scenes, the above model performs very poorly on Chinese or curved text images.

<a name="4-2"></a>
### 4.2 C++ Inference

Because the post-processing has not been implemented in C++, the FCE text detection model does not support C++ inference.

<a name="4-3"></a>
### 4.3 Serving

Not supported.

<a name="4-4"></a>
### 4.4 More

Not supported.

<a name="5"></a>
## 5. FAQ

## Citation

```bibtex
@InProceedings{zhu2021fourier,
  title={Fourier Contour Embedding for Arbitrary-Shaped Text Detection},
  author={Yiqin Zhu and Jianyong Chen and Lingyu Liang and Zhanghui Kuang and Lianwen Jin and Wayne Zhang},
  year={2021},
  booktitle={CVPR}
}
```