mirror of
https://github.com/PaddlePaddle/PaddleOCR.git
synced 2025-08-05 07:08:14 +00:00
PP-Structure Model list
1. Layout Analysis
model name | description | download |
---|---|---|
picodet_lcnet_x1_0_fgd_layout | Layout analysis model trained on the PubLayNet dataset; it recognizes 5 types of areas: Text, Title, Table, Picture and List | inference model / trained model |
picodet_lcnet_x1_0_fgd_layout_cdla | Layout analysis model trained on the CDLA dataset; it recognizes 10 types of areas, such as Figure, Figure caption, Table, Table caption, Header, Footer, Reference and Equation | inference model / trained model |
picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on a table dataset; it detects tables only | inference model / trained model |
2. OCR and Table Recognition
2.1 OCR
model name | description | inference model size | download |
---|---|---|---|
en_ppocr_mobile_v2.0_table_det | Text detection model of English table scenes trained on PubTabNet dataset | 4.7M | inference model / trained model |
en_ppocr_mobile_v2.0_table_rec | Text recognition model of English table scenes trained on PubTabNet dataset | 6.9M | inference model / trained model |
If you need other OCR models, you can download one from the PP-OCR model_list or use a model you trained yourself, then point the det_model_dir and rec_model_dir fields at it.
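As a rough illustration of the configuration step above, the following sketch assembles the two fields for a custom detection and recognition model. Only the field names det_model_dir and rec_model_dir come from this document; the helper name, the example paths, and the `--name=value` flag form are assumptions made for illustration, so check them against the tool you actually invoke.

```python
def build_ocr_model_flags(det_dir: str, rec_dir: str) -> list[str]:
    """Return command-line style flags that point at custom OCR models.

    The `--name=value` form is an assumption; only the field names
    det_model_dir and rec_model_dir are taken from the model list.
    """
    fields = {
        "det_model_dir": det_dir,  # text detection model directory
        "rec_model_dir": rec_dir,  # text recognition model directory
    }
    return [f"--{name}={value}" for name, value in fields.items()]


# Hypothetical local paths to models you downloaded or trained yourself.
flags = build_ocr_model_flags("inference/my_det", "inference/my_rec")
print(" ".join(flags))
```

The same two values could equally be passed as keyword arguments if you drive the pipeline from Python rather than from a command line.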
2.2 Table Recognition
model name | description | inference model size | download |
---|---|---|---|
en_ppocr_mobile_v2.0_table_structure | English table recognition model trained on PubTabNet dataset based on TableRec-RARE | 6.8M | inference model / trained model |
en_ppstructure_mobile_v2.0_SLANet | English table recognition model trained on PubTabNet dataset based on SLANet | 9.2M | inference model / trained model |
ch_ppstructure_mobile_v2.0_SLANet | Chinese table recognition model trained on PubTabNet dataset based on SLANet | 9.3M | inference model / trained model |
3. KIE
On the XFUND_zh dataset, the accuracy and time cost of different models on a V100 GPU are as follows.
Model | Backbone | Task | Config | Hmean | Time cost (ms) | Download link |
---|---|---|---|---|---|---|
VI-LayoutXLM | VI-LayoutXLM-base | SER | ser_vi_layoutxlm_xfund_zh_udml.yml | 93.19% | 15.49 | trained model |
LayoutXLM | LayoutXLM-base | SER | ser_layoutxlm_xfund_zh.yml | 90.38% | 19.49 | trained model |
LayoutLM | LayoutLM-base | SER | ser_layoutlm_xfund_zh.yml | 77.31% | - | trained model |
LayoutLMv2 | LayoutLMv2-base | SER | ser_layoutlmv2_xfund_zh.yml | 85.44% | 31.46 | trained model |
VI-LayoutXLM | VI-LayoutXLM-base | RE | re_vi_layoutxlm_xfund_zh_udml.yml | 83.92% | 15.49 | trained model |
LayoutXLM | LayoutXLM-base | RE | re_layoutxlm_xfund_zh.yml | 74.83% | 19.49 | trained model |
LayoutLMv2 | LayoutLMv2-base | RE | re_layoutlmv2_xfund_zh.yml | 67.77% | 31.46 | trained model |
- Note: The time costs above cover inference only, excluding preprocessing and postprocessing. Test environment: V100 GPU + CUDA 10.2 + CUDNN 8.1.1 + TRT 7.2.3.4
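For readers unfamiliar with the Hmean column: it is the harmonic mean of precision and recall, i.e. the F1 score. The sketch below shows the arithmetic; the function name and the input values are illustrative only and are not taken from the table or from the evaluation code.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs, not results from the table above.
print(round(hmean(0.9, 0.8), 4))  # prints 0.8471
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high Hmean by trading recall away for precision or vice versa.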
On the wildreceipt dataset, the result is as follows:
Model | Backbone | Config | Hmean | Download link |
---|---|---|---|---|
SDMGR | VGG6 | configs/kie/sdmgr/kie_unet_sdmgr.yml | 86.7% | trained model |