mirror of
https://github.com/PaddlePaddle/PaddleOCR.git
synced 2025-11-15 17:43:39 +00:00
commit
ad3835e231
@ -40,17 +40,19 @@ PaddleOCR文本检测算法的训练和使用请参考文档教程中[模型训
|
|||||||
PaddleOCR基于动态图开源的文本识别算法列表:
|
PaddleOCR基于动态图开源的文本识别算法列表:
|
||||||
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))[7](ppocr推荐)
|
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))[7](ppocr推荐)
|
||||||
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
|
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
|
||||||
- [ ] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11] coming soon
|
- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11]
|
||||||
- [ ] RARE([paper](https://arxiv.org/abs/1603.03915v1))[12] coming soon
|
- [ ] RARE([paper](https://arxiv.org/abs/1603.03915v1))[12] coming soon
|
||||||
- [ ] SRN([paper](https://arxiv.org/abs/2003.12294))[5] coming soon
|
- [ ] SRN([paper](https://arxiv.org/abs/2003.12294))[5] coming soon
|
||||||
|
|
||||||
参考[DTRB][3](https://arxiv.org/abs/1904.01906)文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下:
|
参考[DTRB][3](https://arxiv.org/abs/1904.01906)文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下:
|
||||||
|
|
||||||
|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
|
|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
|
||||||
|-|-|-|-|-|
|
|---|---|---|---|---|
|
||||||
|Rosetta|Resnet34_vd|80.9%|rec_r34_vd_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
|
|Rosetta|Resnet34_vd|80.9%|rec_r34_vd_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
|
||||||
|Rosetta|MobileNetV3|78.05%|rec_mv3_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
|
|Rosetta|MobileNetV3|78.05%|rec_mv3_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
|
||||||
|CRNN|Resnet34_vd|82.76%|rec_r34_vd_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
|
|CRNN|Resnet34_vd|82.76%|rec_r34_vd_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
|
||||||
|CRNN|MobileNetV3|79.97%|rec_mv3_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
|
|CRNN|MobileNetV3|79.97%|rec_mv3_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
|
||||||
|
|StarNet|Resnet34_vd|84.44%|rec_r34_vd_tps_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
|
||||||
|
|StarNet|MobileNetV3|81.42%|rec_mv3_tps_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
|
||||||
|
|
||||||
PaddleOCR文本识别算法的训练和使用请参考文档教程中[模型训练/评估中的文本识别部分](./recognition.md)。
|
PaddleOCR文本识别算法的训练和使用请参考文档教程中[模型训练/评估中的文本识别部分](./recognition.md)。
|
||||||
|
|||||||
@ -352,10 +352,10 @@ Predicts of ./doc/imgs_words/ch/word_4.jpg:['0', 0.9999982]
|
|||||||
|
|
||||||
```
|
```
|
||||||
# 使用方向分类器
|
# 使用方向分类器
|
||||||
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true
|
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true
|
||||||
|
|
||||||
# 不使用方向分类器
|
# 不使用方向分类器
|
||||||
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false
|
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
@ -364,7 +364,7 @@ python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model
|
|||||||
|
|
||||||
执行命令后,识别结果图像如下:
|
执行命令后,识别结果图像如下:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
<a name="其他模型推理"></a>
|
<a name="其他模型推理"></a>
|
||||||
### 2. 其他模型推理
|
### 2. 其他模型推理
|
||||||
@ -381,4 +381,4 @@ python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_10.jpg" --d
|
|||||||
|
|
||||||
执行命令后,识别结果图像如下:
|
执行命令后,识别结果图像如下:
|
||||||
|
|
||||||
(coming soon)
|

|
||||||
|
|||||||
@ -41,17 +41,19 @@ For the training guide and use of PaddleOCR text detection algorithms, please re
|
|||||||
PaddleOCR open-source text recognition algorithms list:
|
PaddleOCR open-source text recognition algorithms list:
|
||||||
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))[7]
|
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))[7]
|
||||||
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
|
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
|
||||||
- [ ] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11] coming soon
|
- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11]
|
||||||
- [ ] RARE([paper](https://arxiv.org/abs/1603.03915v1))[12] coming soon
|
- [ ] RARE([paper](https://arxiv.org/abs/1603.03915v1))[12] coming soon
|
||||||
- [ ] SRN([paper](https://arxiv.org/abs/2003.12294))[5] coming soon
|
- [ ] SRN([paper](https://arxiv.org/abs/2003.12294))[5] coming soon
|
||||||
|
|
||||||
Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow:
|
Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow:
|
||||||
|
|
||||||
|Model|Backbone|Avg Accuracy|Module combination|Download link|
|
|Model|Backbone|Avg Accuracy|Module combination|Download link|
|
||||||
|-|-|-|-|-|
|
|---|---|---|---|---|
|
||||||
|Rosetta|Resnet34_vd|80.9%|rec_r34_vd_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
|
|Rosetta|Resnet34_vd|80.9%|rec_r34_vd_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
|
||||||
|Rosetta|MobileNetV3|78.05%|rec_mv3_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
|
|Rosetta|MobileNetV3|78.05%|rec_mv3_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
|
||||||
|CRNN|Resnet34_vd|82.76%|rec_r34_vd_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
|
|CRNN|Resnet34_vd|82.76%|rec_r34_vd_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
|
||||||
|CRNN|MobileNetV3|79.97%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
|
|CRNN|MobileNetV3|79.97%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
|
||||||
|
|StarNet|Resnet34_vd|84.44%|rec_r34_vd_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
|
||||||
|
|StarNet|MobileNetV3|81.42%|rec_mv3_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
|
||||||
|
|
||||||
Please refer to the document for training guide and use of PaddleOCR text recognition algorithms [Text recognition model training/evaluation/prediction](./recognition_en.md)
|
Please refer to the document for training guide and use of PaddleOCR text recognition algorithms [Text recognition model training/evaluation/prediction](./recognition_en.md)
|
||||||
|
|||||||
@ -366,15 +366,15 @@ When performing prediction, you need to specify the path of a single image or a
|
|||||||
|
|
||||||
```
|
```
|
||||||
# use direction classifier
|
# use direction classifier
|
||||||
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true
|
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true
|
||||||
|
|
||||||
# not use use direction classifier
|
# not use use direction classifier
|
||||||
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/"
|
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/"
|
||||||
```
|
```
|
||||||
|
|
||||||
After executing the command, the recognition result image is as follows:
|
After executing the command, the recognition result image is as follows:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
<a name="OTHER_MODELS"></a>
|
<a name="OTHER_MODELS"></a>
|
||||||
### 2. OTHER MODELS
|
### 2. OTHER MODELS
|
||||||
@ -391,4 +391,4 @@ python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_10.jpg" --d
|
|||||||
|
|
||||||
After executing the command, the recognition result image is as follows:
|
After executing the command, the recognition result image is as follows:
|
||||||
|
|
||||||
(coming soon)
|

|
||||||
|
|||||||
BIN
doc/imgs_results/img_10_east_starnet.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 352 KiB |
BIN
doc/imgs_results/system_res_00018069.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 121 KiB |
@ -213,16 +213,14 @@ class GridGenerator(nn.Layer):
|
|||||||
|
|
||||||
def build_P_paddle(self, I_r_size):
|
def build_P_paddle(self, I_r_size):
|
||||||
I_r_height, I_r_width = I_r_size
|
I_r_height, I_r_width = I_r_size
|
||||||
I_r_grid_x = paddle.divide(
|
I_r_grid_x = (paddle.arange(
|
||||||
paddle.arange(
|
-I_r_width, I_r_width, 2, dtype='float64') + 1.0
|
||||||
-I_r_width, I_r_width, 2, dtype='float64') + 1.0,
|
) / paddle.to_tensor(np.array([I_r_width]))
|
||||||
paddle.to_tensor(
|
|
||||||
I_r_width, dtype='float64'))
|
I_r_grid_y = (paddle.arange(
|
||||||
I_r_grid_y = paddle.divide(
|
-I_r_height, I_r_height, 2, dtype='float64') + 1.0
|
||||||
paddle.arange(
|
) / paddle.to_tensor(np.array([I_r_height]))
|
||||||
-I_r_height, I_r_height, 2, dtype='float64') + 1.0,
|
|
||||||
paddle.to_tensor(
|
|
||||||
I_r_height, dtype='float64')) # self.I_r_height
|
|
||||||
# P: self.I_r_width x self.I_r_height x 2
|
# P: self.I_r_width x self.I_r_height x 2
|
||||||
P = paddle.stack(paddle.meshgrid(I_r_grid_x, I_r_grid_y), axis=2)
|
P = paddle.stack(paddle.meshgrid(I_r_grid_x, I_r_grid_y), axis=2)
|
||||||
P = paddle.transpose(P, perm=[1, 0, 2])
|
P = paddle.transpose(P, perm=[1, 0, 2])
|
||||||
|
|||||||
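For readers following the `build_P_paddle` hunk above, the arithmetic it performs can be sketched without Paddle. This is a minimal pure-Python illustration, not the repository's code: the helper names `normalized_grid` and `build_grid` are hypothetical, and nested lists stand in for tensors. It shows that `(arange(-n, n, 2) + 1.0) / n` yields `n` evenly spaced cell-centre coordinates inside (-1, 1), and that the stack plus transpose step produces a height x width x 2 layout of (x, y) pairs.

```python
def normalized_grid(size):
    """Return `size` evenly spaced cell-centre coordinates in (-1, 1).

    Mirrors the expression (arange(-size, size, 2) + 1.0) / size
    used for I_r_grid_x and I_r_grid_y in the hunk above.
    """
    return [(i + 1.0) / size for i in range(-size, size, 2)]


def build_grid(height, width):
    """Return a height x width x 2 nested list of (x, y) pairs.

    Hypothetical stand-in for stacking the meshgrid on axis 2 and
    transposing with perm=[1, 0, 2]: rows index y, columns index x.
    """
    xs = normalized_grid(width)
    ys = normalized_grid(height)
    return [[(x, y) for x in xs] for y in ys]


if __name__ == "__main__":
    print(normalized_grid(4))   # four x coordinates for a width of 4
    print(build_grid(2, 4)[0])  # first row of a 2 x 4 grid
```

Both the old `paddle.divide(...)` form and the new `(...) / paddle.to_tensor(...)` form in the hunk compute this same quotient; the change appears to affect only how the divisor tensor is constructed.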
@ -109,7 +109,7 @@ class CTCLabelDecode(BaseRecLabelDecode):
|
|||||||
|
|
||||||
preds_idx = preds.argmax(axis=2)
|
preds_idx = preds.argmax(axis=2)
|
||||||
preds_prob = preds.max(axis=2)
|
preds_prob = preds.max(axis=2)
|
||||||
text = self.decode(preds_idx, preds_prob)
|
text = self.decode(preds_idx, preds_prob, is_remove_duplicate=True)
|
||||||
if label is None:
|
if label is None:
|
||||||
return text
|
return text
|
||||||
label = self.decode(label)
|
label = self.decode(label)
|
||||||
|
|||||||
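The `CTCLabelDecode` hunk above passes `is_remove_duplicate=True` only when decoding predictions, not when decoding labels. The sketch below illustrates why that flag matters for greedy CTC decoding. It is an assumption-laden illustration rather than the repository's `decode` method: the function name `ctc_greedy_decode`, the `charset` list, and the choice of index 0 as the blank token are all hypothetical.

```python
def ctc_greedy_decode(indices, charset, blank=0, is_remove_duplicate=True):
    """Turn per-timestep argmax indices into a string.

    indices: one class index per timestep (e.g. a row of preds.argmax(axis=2)).
    charset: hypothetical lookup list where charset[i] is the character for i.
    blank:   index of the CTC blank token (assumed to be 0 here).
    """
    chars = []
    prev = None
    for idx in indices:
        # Merge runs of the same index into one symbol when requested.
        is_repeat = is_remove_duplicate and idx == prev
        prev = idx
        if is_repeat:
            continue
        # Blank tokens never produce output; they only separate repeats.
        if idx != blank:
            chars.append(charset[idx])
    return "".join(chars)


if __name__ == "__main__":
    charset = ["", "a", "b"]          # index 0 is the blank
    steps = [1, 1, 0, 1, 2, 2, 0]     # "a a _ a b b _"
    print(ctc_greedy_decode(steps, charset))
    print(ctc_greedy_decode(steps, charset, is_remove_duplicate=False))
```

With merging enabled, consecutive repeats collapse and the blank between the two `a` runs keeps them distinct. Ground-truth labels can contain genuinely repeated characters, which is a plausible reason the label path in the hunk leaves the flag at its default.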