Mirror of https://github.com/PaddlePaddle/PaddleOCR.git, synced 2025-06-26 21:24:27 +00:00
Update pubtab dataset script reference (#15799)
* Update pubtab dataset script reference

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

* Update docs/datasets/table_datasets.en.md

Co-authored-by: Wang Xin <xinwang614@gmail.com>

* Update table_datasets.en.md

* Update table_datasets.md

---------

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Wang Xin <xinwang614@gmail.com>
parent 428f1cefd9
commit d54dfa4c85
@@ -11,9 +11,9 @@ Here are the commonly used table recognition datasets, which are being updated c

-| dataset | Image download link | PPOCR format annotation download link |
-|---|---|---|
-| PubTabNet |<https://github.com/ibm-aur-nlp/PubTabNet>| jsonl format, which can be loaded directly with [pubtab_dataset.py](../../../ppocr/data/pubtab_dataset.py) |
-| TAL Table Recognition Competition Dataset |<https://ai.100tal.com/dataset>| jsonl format, which can be loaded directly with [pubtab_dataset.py](../../../ppocr/data/pubtab_dataset.py) |
-| WTW Chinese scene table dataset |<https://github.com/wangwen-whu/WTW-Dataset>| Conversion is required to load with [pubtab_dataset.py](../../../ppocr/data/pubtab_dataset.py)|
+| dataset | Image download link | PPOCR format annotation download link |
+|---|---|---|
+| PubTabNet |<https://github.com/ibm-aur-nlp/PubTabNet>| jsonl format, which can be loaded directly with [pubtab_dataset.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/data/pubtab_dataset.py) |
+| TAL Table Recognition Competition Dataset |<https://ai.100tal.com/dataset>| jsonl format, which can be loaded directly with [pubtab_dataset.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/data/pubtab_dataset.py) |
+| WTW Chinese scene table dataset |<https://github.com/wangwen-whu/WTW-Dataset>| Conversion is required to load with [pubtab_dataset.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/data/pubtab_dataset.py)|

 ## 1. PubTabNet

@@ -12,9 +12,9 @@ typora-copy-images-to: images

-| 数据集名称 |图片下载地址| PPOCR标注下载地址 |
-|---|---|---|
-| PubTabNet |<https://github.com/ibm-aur-nlp/PubTabNet>| jsonl格式,可直接用[pubtab_dataset.py](../../../ppocr/data/pubtab_dataset.py)加载 |
-| 好未来表格识别竞赛数据集 |<https://ai.100tal.com/dataset>| jsonl格式,可直接用[pubtab_dataset.py](../../../ppocr/data/pubtab_dataset.py)加载 |
-| WTW中文场景表格数据集 |<https://github.com/wangwen-whu/WTW-Dataset>| 需要进行转换后才能用[pubtab_dataset.py](../../../ppocr/data/pubtab_dataset.py)加载 |
+| 数据集名称 |图片下载地址| PPOCR标注下载地址 |
+|---|---|---|
+| PubTabNet |<https://github.com/ibm-aur-nlp/PubTabNet>| jsonl格式,可直接用[pubtab_dataset.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/data/pubtab_dataset.py)加载 |
+| 好未来表格识别竞赛数据集 |<https://ai.100tal.com/dataset>| jsonl格式,可直接用[pubtab_dataset.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/data/pubtab_dataset.py)加载 |
+| WTW中文场景表格数据集 |<https://github.com/wangwen-whu/WTW-Dataset>| 需要进行转换后才能用[pubtab_dataset.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/data/pubtab_dataset.py)加载 |

 ## 1. PubTabNet数据集

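For illustration only: the link change in this commit, which swaps each `../../../ppocr/data/pubtab_dataset.py` target for an absolute GitHub blob URL, can be sketched as a small rewrite function. This is not how the commit was produced (the rows may well have been edited by hand). The names `BLOB_BASE`, `RELATIVE_LINK`, and `absolutize_links` are hypothetical, and the sketch assumes that whatever follows the leading `../` segments is a path relative to the repository root.

```python
import re

# Assumed base URL, copied from the new link targets in this commit.
BLOB_BASE = "https://github.com/PaddlePaddle/PaddleOCR/blob/main/"

# Matches a Markdown link target that starts with one or more "../" segments
# and captures the remainder of the path.
RELATIVE_LINK = re.compile(r"\]\((?:\.\./)+([^)\s]+)\)")


def absolutize_links(markdown: str, base: str = BLOB_BASE) -> str:
    """Rewrite '](../../path)' link targets to '](<base>path)'.

    Assumption: the text after the leading '../' segments is already a
    repository-root-relative path, as it is for the rows in this commit.
    """
    return RELATIVE_LINK.sub(lambda m: f"]({base}{m.group(1)})", markdown)


old_row = (
    "| PubTabNet |<https://github.com/ibm-aur-nlp/PubTabNet>| jsonl format, "
    "which can be loaded directly with "
    "[pubtab_dataset.py](../../../ppocr/data/pubtab_dataset.py) |"
)
new_row = absolutize_links(old_row)
print(new_row)
```

Links that are already absolute, and the `<https://...>` autolinks in the second column, do not match the pattern and pass through unchanged, so running the function on an already-updated row leaves it as it is.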