<div align="center">
  <p>
    <img width="100%" src="./docs/images/Banner_cn.png" alt="PaddleOCR Banner">
  </p>

<!-- language -->
中文 | [English](./README_en.md) | [日本語](./README_ja.md)

<!-- icon -->
[](https://github.com/PaddlePaddle/PaddleOCR)
[](https://pypi.org/project/PaddleOCR/)
[](https://www.paddleocr.ai/)
[](https://aistudio.baidu.com/community/app/91660/webUI)
[](https://aistudio.baidu.com/community/app/518494/webUI)
[](https://aistudio.baidu.com/community/app/518493/webUI)

</div>

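## 🚀 Introduction

Since its release, PaddleOCR has been well received across academia, industry, and research for its state-of-the-art algorithms and proven production use. It has been adopted by many well-known open-source projects, such as Umi-OCR, OmniParser, MinerU, and RAGFlow, and has become a go-to open-source OCR toolkit for developers. On May 20, 2025, the PaddlePaddle team released **PaddleOCR 3.0**, which is fully compatible with the official release of the **PaddlePaddle 3.0 framework**, further **improves text recognition accuracy**, supports **recognition of multiple text types** and **handwriting recognition**, and meets the strong demand from large-model applications for **high-precision parsing of complex documents**. Combined with **ERNIE 4.5 Turbo**, it significantly improves key information extraction accuracy, and it adds **support for domestic hardware such as KUNLUNXIN and Ascend**.

PaddleOCR 3.0 **introduces** three major capabilities:

- All-scenario text recognition model [PP-OCRv5](docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.md): a single model supports five text types and complex handwriting recognition; overall recognition accuracy is **13 percentage points higher** than the previous generation.
- General document parsing solution [PP-StructureV3](docs/version3.x/algorithm/PP-StructureV3/PP-StructureV3.md): supports high-precision parsing of PDFs across multiple scenarios and layouts, and **outperforms many open-source and closed-source solutions** on public benchmarks.
- Intelligent document understanding solution [PP-ChatOCRv4](docs/version3.x/algorithm/PP-ChatOCRv4/PP-ChatOCRv4.md): natively supports ERNIE 4.5 Turbo, with accuracy **15 percentage points higher** than the previous generation.

In addition to its model library, PaddleOCR 3.0 provides easy-to-learn, easy-to-use tools that cover model training, inference, and service deployment, so that developers can bring AI applications into production quickly.
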
<div align="center">
  <p>
    <img width="100%" src="./docs/images/Arch_cn.png" alt="PaddleOCR Architecture">
  </p>
</div>

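## 📣 Recent Updates

🔥🔥 2025.05.20: **PaddleOCR 3.0** officially released, including:

- **PP-OCRv5**: high-accuracy text recognition for all scenarios
    1. 🌐 A single model supports **five** text types (**Simplified Chinese**, **Traditional Chinese**, **Chinese Pinyin**, **English**, and **Japanese**).
    2. ✍️ Supports complex **handwriting** recognition, with significantly better performance on cursive strokes and irregular handwriting.
    3. 🎯 Higher overall recognition accuracy: reaches SOTA accuracy in multiple application scenarios, with recognition accuracy **13 percentage points higher** than the previous version, PP-OCRv4.
- **PP-StructureV3**: general document parsing solution
    1. 🧮 Supports high-precision PDF parsing across multiple scenarios and **outperforms many open-source and closed-source solutions** on the OmniDocBench benchmark.
    2. 🧠 Specialized capabilities include **seal recognition**, **chart-to-table conversion**, **recognition of tables containing nested formulas or images**, **vertical text parsing**, and **complex table structure analysis**.
- **PP-ChatOCRv4**: intelligent document understanding solution
    1. 🔥 Key information extraction accuracy on document images (PDF/PNG/JPG) is **15 percentage points higher** than the previous generation.
    2. 💻 Natively supports **ERNIE 4.5 Turbo** and is also compatible with large models deployed through tools such as PaddleNLP, Ollama, and vLLM.
    3. 🤝 Integrates [PP-DocBee2](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/examples/ppdocbee2), which supports extracting and understanding information from common complex document elements such as printed text, handwritten text, seals, tables, and charts.
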
## ⚡ Quick Start

### 1. Online Demo

- [AI Studio demo (app 91660)](https://aistudio.baidu.com/community/app/91660/webUI)
- [AI Studio demo (app 518494)](https://aistudio.baidu.com/community/app/518494/webUI)
- [AI Studio demo (app 518493)](https://aistudio.baidu.com/community/app/518493/webUI)

### 2. Local Installation

Follow the [installation guide](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) to install **PaddlePaddle 3.0**, then install `paddleocr`.

```bash
# Install paddleocr
pip install paddleocr
```

### 3. Command-Line Inference

```bash
# Run PP-OCRv5 inference
paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png --use_doc_orientation_classify False --use_doc_unwarping False

# Run PP-StructureV3 inference
paddleocr pp_structurev3 -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png --use_doc_orientation_classify False --use_doc_unwarping False

# Obtain a Qianfan API key before running PP-ChatOCRv4 inference.
# The key passed with -k, 驾驶室准乘人数, is the field to extract ("permitted number of cab occupants").
paddleocr pp_chatocrv4_doc -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png -k 驾驶室准乘人数 --qianfan_api_key your_api_key --use_doc_orientation_classify False --use_doc_unwarping False

# Show the detailed options of "paddleocr ocr"
paddleocr ocr --help
```
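
When these commands are launched from a script, assembling the arguments as a list avoids shell-quoting problems with URLs and non-ASCII keys. The following is a minimal sketch, not part of PaddleOCR: `build_paddleocr_command` is a hypothetical helper name, and it uses only the subcommands and flags that appear in the commands above.

```python
import shlex
import subprocess


def build_paddleocr_command(subcommand, input_path, extra_args=None):
    """Assemble an argument list for the `paddleocr` CLI (hypothetical helper).

    Only flags that appear in the commands above are used; the two document
    preprocessing steps are switched off, as in those commands.
    """
    command = [
        "paddleocr", subcommand,
        "-i", input_path,
        "--use_doc_orientation_classify", "False",
        "--use_doc_unwarping", "False",
    ]
    # Append pipeline-specific options, e.g. ["-k", "<key>"] for pp_chatocrv4_doc.
    command.extend(extra_args or [])
    return command


command = build_paddleocr_command(
    "ocr",
    "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png",
)
# Print a copy-pasteable, shell-quoted version of the command.
print(shlex.join(command))

# To actually run it (requires paddleocr to be installed):
# subprocess.run(command, check=True)
```

Passing a list to `subprocess.run` bypasses the shell entirely, so a key such as `驾驶室准乘人数` needs no extra quoting.
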

### 4. Python API Inference

**4.1 PP-OCRv5 Example**

```python
from paddleocr import PaddleOCR

# Initialize a PaddleOCR instance
ocr = PaddleOCR()

# Run OCR inference on a sample image
result = ocr.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png")

# Print the results, then save the visualization and the JSON output
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
```
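
The saved results can then be post-processed with ordinary Python. The sketch below filters recognized lines by confidence. It assumes that each result exposes parallel lists under the keys `rec_texts` and `rec_scores`; this README does not state that layout, so check it against the output of `res.print()` first. The helper name `filter_confident_text` and the sample data are illustrative only.

```python
def filter_confident_text(ocr_result, min_score=0.8):
    """Return recognized lines whose confidence is at least `min_score`.

    Assumes (not confirmed by this README) that `ocr_result` is a dict with
    parallel lists `rec_texts` and `rec_scores`.
    """
    texts = ocr_result.get("rec_texts", [])
    scores = ocr_result.get("rec_scores", [])
    return [text for text, score in zip(texts, scores) if score >= min_score]


# Made-up sample data in the assumed layout, not real model output.
sample = {
    "rec_texts": ["Boarding pass", "GATE 12", "~~~"],
    "rec_scores": [0.98, 0.91, 0.35],
}
print(filter_confident_text(sample))
```
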

<details>
<summary><strong>4.2 PP-StructureV3 Example</strong></summary>

```python
from pathlib import Path
from paddleocr import PPStructureV3

pipeline = PPStructureV3()

# For an image
output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png")

# Print the results and save them as JSON and Markdown
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")

# For a PDF file
input_file = "./your_pdf_file.pdf"
output_path = Path("./output")

output = pipeline.predict(input_file)

# Collect the Markdown info and the embedded images of every page
markdown_list = []
markdown_images = []

for res in output:
    md_info = res.markdown
    markdown_list.append(md_info)
    markdown_images.append(md_info.get("markdown_images", {}))

# Merge all pages into a single Markdown document
markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)

mkd_file_path = output_path / f"{Path(input_file).stem}.md"
mkd_file_path.parent.mkdir(parents=True, exist_ok=True)

with open(mkd_file_path, "w", encoding="utf-8") as f:
    f.write(markdown_texts)

# Save the images referenced by the Markdown next to it
for item in markdown_images:
    if item:
        for path, image in item.items():
            file_path = output_path / path
            file_path.parent.mkdir(parents=True, exist_ok=True)
            image.save(file_path)
```

</details>

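
To process a whole folder of PDF files, the output-path logic of the PDF example above can be factored into a small helper. This is only a sketch: `plan_markdown_outputs` is a hypothetical name, the `./pdfs` folder is a placeholder, and the helper only plans file names without calling PaddleOCR.

```python
from pathlib import Path


def plan_markdown_outputs(input_dir, output_dir):
    """Map every PDF in `input_dir` to the Markdown file it should produce."""
    output_dir = Path(output_dir)
    return {
        pdf_path: output_dir / f"{pdf_path.stem}.md"
        for pdf_path in sorted(Path(input_dir).glob("*.pdf"))
    }


for pdf_path, md_path in plan_markdown_outputs("./pdfs", "./output").items():
    print(f"{pdf_path} -> {md_path}")
    # Here one would call pipeline.predict(str(pdf_path)) and write the
    # concatenated Markdown to md_path, as in the PDF example above.
```

Sorting the paths keeps the processing order stable between runs, which makes partial reruns easier to reason about.
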

<details>
<summary><strong>4.3 PP-ChatOCRv4 Example</strong></summary>

```python
from paddleocr import PPChatOCRv4Doc

# Large language model used to answer questions about the document
chat_bot_config = {
    "module_name": "chat_bot",
    "model_name": "ernie-3.5-8k",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "openai",
    "api_key": "api_key",  # your api_key
}

# Embedding model used to build the retrieval vectors
retriever_config = {
    "module_name": "retriever",
    "model_name": "embedding-v1",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "qianfan",
    "api_key": "api_key",  # your api_key
}

# Multimodal large language model served locally
mllm_chat_bot_config = {
    "module_name": "chat_bot",
    "model_name": "PP-DocBee",
    "base_url": "http://127.0.0.1:8080/",  # your local mllm service url
    "api_type": "openai",
    "api_key": "api_key",  # your api_key
}

pipeline = PPChatOCRv4Doc()

# Step 1: run the visual pipeline (layout, OCR, seals, tables) on the image
visual_predict_res = pipeline.visual_predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_common_ocr=True,
    use_seal_recognition=True,
    use_table_recognition=True,
)

visual_info_list = []
for res in visual_predict_res:
    visual_info_list.append(res["visual_info"])
    layout_parsing_result = res["layout_parsing_result"]

# Step 2: build the retrieval vectors from the visual information
vector_info = pipeline.build_vector(
    visual_info_list, flag_save_bytes_vector=True, retriever_config=retriever_config
)

# Step 3: query the multimodal model; the key means "permitted number of cab occupants"
mllm_predict_res = pipeline.mllm_pred(
    input="vehicle_certificate-1.png",  # local image path
    key_list=["驾驶室准乘人数"],
    mllm_chat_bot_config=mllm_chat_bot_config,
)
mllm_predict_info = mllm_predict_res["mllm_res"]

# Step 4: combine everything and extract the requested key
chat_result = pipeline.chat(
    key_list=["驾驶室准乘人数"],
    visual_info=visual_info_list,
    vector_info=vector_info,
    mllm_predict_info=mllm_predict_info,
    chat_bot_config=chat_bot_config,
    retriever_config=retriever_config,
)
print(chat_result)
```

</details>

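
The example above hard-codes the API key in each config dict. One way to keep the key out of source files is to read it from the environment, as sketched below. The function name `qianfan_configs` and the variable name `QIANFAN_API_KEY` are choices made for this sketch, not names defined by PaddleOCR; the dict contents are copied from the example above.

```python
import os


def qianfan_configs(api_key=None):
    """Build the chat-bot and retriever configs from a single API key.

    The key is read from the QIANFAN_API_KEY environment variable (a name
    chosen for this sketch) unless it is passed explicitly.
    """
    api_key = api_key or os.environ.get("QIANFAN_API_KEY")
    if not api_key:
        raise ValueError("Set QIANFAN_API_KEY or pass api_key explicitly.")

    base_url = "https://qianfan.baidubce.com/v2"
    chat_bot_config = {
        "module_name": "chat_bot",
        "model_name": "ernie-3.5-8k",
        "base_url": base_url,
        "api_type": "openai",
        "api_key": api_key,
    }
    retriever_config = {
        "module_name": "retriever",
        "model_name": "embedding-v1",
        "base_url": base_url,
        "api_type": "qianfan",
        "api_key": api_key,
    }
    return chat_bot_config, retriever_config


# Placeholder key for illustration; omit the argument to use the environment.
chat_bot_config, retriever_config = qianfan_configs(api_key="your_api_key")
print(chat_bot_config["model_name"], retriever_config["model_name"])
```

Failing early with a `ValueError` when no key is available gives a clearer message than an authentication error raised later by the remote service.
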

### 5. **Domestic Hardware Support**

- [KUNLUNXIN installation guide](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_XPU.html)
- [Ascend installation guide](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_NPU.html)

## ⛰️ Advanced Tutorials

- [PP-OCRv5 tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/OCR.html)
- [PP-StructureV3 tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-StructureV3.html)
- [PP-ChatOCRv4 tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-ChatOCRv4.html)

## 🔄 Demos

<div align="center">
  <p>
    <img width="100%" src="./docs/images/demo.gif" alt="PP-OCRv5 Demo">
  </p>
</div>

<div align="center">
  <p>
    <img width="100%" src="./docs/images/blue_v3.gif" alt="PP-StructureV3 Demo">
  </p>
</div>

## 👩‍👩‍👧‍👦 Community

| Scan to follow the PaddlePaddle official account | Scan to join the technical discussion group |
| :---: | :---: |
| <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qrcode_for_paddlepaddle_official_account.jpg" width="150"> | <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qr_code_for_the_questionnaire.jpg" width="150"> |

## 🏆 Awesome Projects Using PaddleOCR

PaddleOCR would not be where it is today without its community! 💗 Heartfelt thanks to all developers, partners, and contributors!

| Project | Description |
| ------------ | ----------- |
| [RAGFlow](https://github.com/infiniflow/ragflow) <a href="https://github.com/infiniflow/ragflow"><img src="https://img.shields.io/github/stars/infiniflow/ragflow"></a> | RAG-based AI workflow engine |
| [MinerU](https://github.com/opendatalab/MinerU) <a href="https://github.com/opendatalab/MinerU"><img src="https://img.shields.io/github/stars/opendatalab/MinerU"></a> | Tool for converting multiple document types to Markdown |
| [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR) <a href="https://github.com/hiroi-sora/Umi-OCR"><img src="https://img.shields.io/github/stars/hiroi-sora/Umi-OCR"></a> | Open-source, batch, offline OCR software |
| [OmniParser](https://github.com/microsoft/OmniParser) <a href="https://github.com/microsoft/OmniParser"><img src="https://img.shields.io/github/stars/microsoft/OmniParser"></a> | Screen parsing tool for pure-vision GUI agents |
| [QAnything](https://github.com/netease-youdao/QAnything) <a href="https://github.com/netease-youdao/QAnything"><img src="https://img.shields.io/github/stars/netease-youdao/QAnything"></a> | Question-answering system based on any content |
| [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) <a href="https://github.com/opendatalab/PDF-Extract-Kit"><img src="https://img.shields.io/github/stars/opendatalab/PDF-Extract-Kit"></a> | Efficient toolkit for extracting content from complex PDF documents |
| [Dango-Translator](https://github.com/PantsuDango/Dango-Translator) <a href="https://github.com/PantsuDango/Dango-Translator"><img src="https://img.shields.io/github/stars/PantsuDango/Dango-Translator"></a> | Real-time screen translation tool |
| [More projects](./awesome_projects.md) | |

## 👩‍👩‍👧‍👦 Contributors

<a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=PaddlePaddle/PaddleOCR&max=400&columns=20" width="800"/>
</a>

## 🌟 Star

[Star History](https://star-history.com/#PaddlePaddle/PaddleOCR&Date)

## 📄 License

This project is released under the [Apache 2.0 license](LICENSE).

## 🎓 Citation

```
@misc{paddleocr2020,
  title={PaddleOCR, Awesome multilingual OCR toolkits based on PaddlePaddle.},
  author={PaddlePaddle Authors},
  howpublished={\url{https://github.com/PaddlePaddle/PaddleOCR}},
  year={2020}
}
```