
<div align="center">
  <p>
      <img width="100%" src="./docs/images/Banner.png" alt="PaddleOCR Banner">
  </p>

<!-- language -->
[中文](./readme_c.md)| English 

<!-- icon -->

[![stars](https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf)](https://github.com/PaddlePaddle/PaddleOCR)
[![Downloads](https://img.shields.io/pypi/dm/paddleocr)](https://pypi.org/project/PaddleOCR/)
![python](https://img.shields.io/badge/python-3.8～3.12-aff.svg)
![os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg)
![hardware](https://img.shields.io/badge/hardware-cpu%2C%20gpu%2C%20xpu%2C%20npu-yellow.svg)

[![Website](https://img.shields.io/badge/Website-PaddleOCR-blue?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAABmmRkdj0AAAAASUVORK5CYII=)](https://www.paddleocr.ai/)
[![AI Studio](https://img.shields.io/badge/PP_OCRv5-AI_Studio-green)](https://aistudio.baidu.com/community/app/91660/webUI)
[![AI Studio](https://img.shields.io/badge/PP_StructureV3-AI_Studio-green)](https://aistudio.baidu.com/community/app/518494/webUI)
[![AI Studio](https://img.shields.io/badge/PP_ChatOCRv4-AI_Studio-green)](https://aistudio.baidu.com/community/app/518493/webUI)

</div>

## 🚀 Introduction
Since its initial release, PaddleOCR has gained widespread acclaim across academia, industry, and research communities, thanks to its cutting-edge algorithms and proven performance in real-world applications. It’s already powering popular open-source projects like Umi-OCR, OmniParser, MinerU, and RAGFlow, making it the go-to OCR toolkit for developers worldwide.

On May 20, 2025, the PaddlePaddle team unveiled PaddleOCR 3.0, fully compatible with the official release of the **PaddlePaddle 3.0** framework. This update further **boosts text-recognition accuracy**, adds support for **multiple text-type recognition** and **handwriting recognition**, and meets the growing demand from large-model applications for **high-precision parsing of complex documents**. When combined with **ERNIE 4.5 Turbo**, it significantly enhances key-information extraction accuracy. PaddleOCR 3.0 also introduces support for domestic hardware platforms such as **KUNLUNXIN** and **Ascend**.

Three Major New Features in PaddleOCR 3.0:
- Universal-Scene Text Recognition Model [PP-OCRv5](./docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.en.md): A single model that handles five different text types plus complex handwriting. Overall recognition accuracy has increased by 13 percentage points over the previous generation. [Online Demo](https://aistudio.baidu.com/community/app/91660/webUI)

- General Document-Parsing Solution [PP-StructureV3](./docs/version3.x/algorithm/PP-StructureV3/PP-StructureV3.en.md): Delivers high-precision parsing of multi-layout, multi-scene PDFs, outperforming many open- and closed-source solutions on public benchmarks. [Online Demo](https://aistudio.baidu.com/community/app/518494/webUI)

- Intelligent Document-Understanding Solution [PP-ChatOCRv4](./docs/version3.x/algorithm/PP-ChatOCRv4/PP-ChatOCRv4.en.md): Natively powered by the ERNIE (Wenxin) 4.5 Turbo large model, achieving 15 percentage points higher accuracy than its predecessor. [Online Demo](https://aistudio.baidu.com/community/app/518493/webUI)

In addition to providing an outstanding model library, PaddleOCR 3.0 also offers user-friendly tools covering model training, inference, and service deployment, so developers can rapidly bring AI applications to production.
<div align="center">
  <p>
      <img width="100%" src="./docs/images/Arch.png" alt="PaddleOCR Architecture">
  </p>
</div>


## 📣 Recent updates
🔥🔥2025.05.20: Official Release of **PaddleOCR v3.0**, including:
- **PP-OCRv5**: High-Accuracy Text Recognition Model for All Scenarios - Instant Text from Images/PDFs.
   1. 🌐 Single-model support for **five** text types - Seamlessly process **Simplified Chinese, Traditional Chinese, Simplified Chinese Pinyin, English** and **Japanese** within a single model.
   2. ✍️ Improved **handwriting recognition**: Significantly better at complex cursive scripts and non-standard handwriting.
   3. 🎯 **13-point accuracy gain** over PP-OCRv4, achieving state-of-the-art performance across a variety of real-world scenarios.

- **PP-StructureV3**: General-Purpose Document Parsing – Unleash SOTA Images/PDFs Parsing for Real-World Scenarios! 
   1. 🧮 **High-Accuracy multi-scene PDF parsing**, leading both open- and closed-source solutions on the OmniDocBench benchmark.
   2. 🧠 Specialized capabilities include **seal recognition**, **chart-to-table conversion**, **table recognition with nested formulas/images**, **vertical text document parsing**, and **complex table structure analysis**.

- **PP-ChatOCRv4**: Intelligent Document Understanding – Extract Key Information, not just text from Images/PDFs.
   1. 🔥 **15-point accuracy gain** in key-information extraction on PDF/PNG/JPG files over the previous generation.
   2. 💻 Native support for **ERNIE 4.5 Turbo**, with compatibility for large-model deployments via PaddleNLP, Ollama, vLLM, and more.
   3. 🤝 Integrated [PP-DocBee2](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/examples/ppdocbee2), enabling extraction and understanding of printed text, handwriting, seals, tables, charts, and other common elements in complex documents.

<details>
   <summary><strong>Update history</strong></summary>


- 🔥🔥2025.03.07: Release of **PaddleOCR v2.10**, including:

  - **12 new self-developed models:**
    - **[Layout Detection series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/layout_detection.html)** (3 models): PP-DocLayout-L, M, and S -- capable of detecting 23 common layout types across diverse document formats (papers, reports, exams, books, magazines, contracts, etc.) in English and Chinese. Achieves up to **90.4% mAP@0.5**, and the lightweight model can process over 100 pages per second.
    - **[Formula Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/formula_recognition.html)** (2 models): PP-FormulaNet-L and S -- support recognition of 50,000+ LaTeX expressions, handling both printed and handwritten formulas. PP-FormulaNet-L offers **6% higher accuracy** than comparable models; PP-FormulaNet-S is 16x faster while maintaining similar accuracy.
    - **[Table Structure Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_structure_recognition.html)** (2 models): SLANeXt_wired and SLANeXt_wireless -- newly developed models with **6% accuracy improvement** over SLANet_plus in complex table recognition.
    - **[Table Classification](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_classification.html)** (1 model): PP-LCNet_x1_0_table_cls -- an ultra-lightweight classifier for wired and wireless tables.

[Learn more](https://paddlepaddle.github.io/PaddleOCR/latest/en/update.html)

</details>

## ⚡ Quick Start
### 1. Run online demo 
[![AI Studio](https://img.shields.io/badge/PP_OCRv5-AI_Studio-green)](https://aistudio.baidu.com/community/app/91660/webUI)
[![AI Studio](https://img.shields.io/badge/PP_StructureV3-AI_Studio-green)](https://aistudio.baidu.com/community/app/518494/webUI)
[![AI Studio](https://img.shields.io/badge/PP_ChatOCRv4-AI_Studio-green)](https://aistudio.baidu.com/community/app/518493/webUI)

### 2. Installation

Install PaddlePaddle by following the [Installation Guide](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html), then install the PaddleOCR toolkit.

```bash
# Install paddleocr
pip install paddleocr==3.0.0
```
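
To confirm which version actually landed in the environment, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of PaddleOCR.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "paddleocr") -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


found = installed_version()
print(f"paddleocr version: {found or 'not installed'}")
```

If this prints `not installed`, the `pip install` step above most likely ran against a different Python interpreter than the one executing the script.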

### 3. Run inference by CLI
```bash
# Run PP-OCRv5 inference
paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png --use_doc_orientation_classify False --use_doc_unwarping False --use_textline_orientation False  

# Run PP-StructureV3 inference
paddleocr pp_structurev3 -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png --use_doc_orientation_classify False --use_doc_unwarping False

# First obtain a Qianfan API key, then run PP-ChatOCRv4 inference
paddleocr pp_chatocrv4_doc -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png -k 驾驶室准乘人数 --qianfan_api_key your_api_key --use_doc_orientation_classify False --use_doc_unwarping False 

# Get more information about "paddleocr ocr"
paddleocr ocr --help
```
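
When the CLI has to be driven from a script, building the argument list in Python avoids shell-quoting mistakes. The sketch below only reuses the flags shown in the commands above; the helper name `build_ocr_command` is hypothetical, and passing `"True"` for a flag is an assumption, since only `False` appears in this README.

```python
import shlex
from typing import List


def build_ocr_command(image: str, use_textline_orientation: bool = False) -> List[str]:
    """Build the argument list for `paddleocr ocr` using flags from this README."""
    return [
        "paddleocr", "ocr",
        "-i", image,
        "--use_doc_orientation_classify", "False",
        "--use_doc_unwarping", "False",
        # Assumption: the CLI accepts "True" in the same way it accepts "False".
        "--use_textline_orientation", str(use_textline_orientation),
    ]


cmd = build_ocr_command(
    "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png"
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once `paddleocr` is on the `PATH`.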

### 4. Run inference by API
**4.1 PP-OCRv5 Example**
```python
from paddleocr import PaddleOCR

# Initialize PaddleOCR instance
ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False)

# Run OCR inference on a sample image 
result = ocr.predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png")

# Visualize the results and save the JSON results
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
```
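
A common next step is to keep only the lines the model is confident about. The sketch below works on a plain dict, for example one loaded from a JSON file written by `save_to_json`. The key names `rec_texts` and `rec_scores` are an assumption about the output layout; compare them with what `res.print()` shows before relying on them. The helper `confident_lines` is hypothetical.

```python
from typing import Dict, List, Tuple


def confident_lines(result: Dict[str, list], min_score: float = 0.8) -> List[Tuple[str, float]]:
    """Pair each recognized text with its score and drop low-confidence lines."""
    # Assumed keys: parallel lists of recognized strings and their confidences.
    texts = result.get("rec_texts", [])
    scores = result.get("rec_scores", [])
    return [(text, score) for text, score in zip(texts, scores) if score >= min_score]


# Synthetic data standing in for a real OCR result.
sample = {"rec_texts": ["Invoice", "T0tal"], "rec_scores": [0.97, 0.41]}
print(confident_lines(sample))
```

The threshold of 0.8 is arbitrary; a suitable value depends on the documents being processed.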

<details>
    <summary><strong>4.2 PP-StructureV3 Example</strong></summary>

```python
from paddleocr import PPStructureV3

pipeline = PPStructureV3()

# For Image
output = pipeline.predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False
    )

# Visualize the results and save the JSON results
for res in output:
    res.print() 
    res.save_to_json(save_path="output") 
    res.save_to_markdown(save_path="output")           
```

</details>
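
After a PP-StructureV3 run, the files saved under `output` can be collected with `pathlib` for further processing. This is a sketch under the assumption that `save_to_markdown` writes files with a `.md` suffix somewhere below the given directory; the helper `list_outputs` is hypothetical.

```python
from pathlib import Path
from typing import List


def list_outputs(output_dir: str = "output", suffix: str = ".md") -> List[Path]:
    """Return all files with `suffix` below `output_dir`, sorted by path."""
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing has been saved yet, or the directory name differs.
        return []
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


for path in list_outputs("output"):
    print(path)
```

Passing `suffix=".json"` lists the files written by `save_to_json` in the same way.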

<details>
   <summary><strong>4.3 PP-ChatOCRv4 Example</strong></summary>

```python
from paddleocr import PPChatOCRv4Doc

chat_bot_config = {
    "module_name": "chat_bot",
    "model_name": "ernie-3.5-8k",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "openai",
    "api_key": "api_key",  # your api_key
}

retriever_config = {
    "module_name": "retriever",
    "model_name": "embedding-v1",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "qianfan",
    "api_key": "api_key",  # your api_key
}

pipeline = PPChatOCRv4Doc(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False
)

visual_predict_res = pipeline.visual_predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png",
    use_common_ocr=True,
    use_seal_recognition=True,
    use_table_recognition=True,
)

mllm_predict_info = None
use_mllm = False
# If a multimodal large model is used, start the local mllm service first.
# See https://github.com/PaddlePaddle/PaddleX/blob/release/3.0/docs/pipeline_usage/tutorials/vlm_pipelines/doc_understanding.md
# for deployment instructions, then update mllm_chat_bot_config accordingly.
if use_mllm:
    mllm_chat_bot_config = {
        "module_name": "chat_bot",
        "model_name": "PP-DocBee",
        "base_url": "http://127.0.0.1:8080/",  # your local mllm service url
        "api_type": "openai",
        "api_key": "api_key",  # your api_key
    }

    mllm_predict_res = pipeline.mllm_pred(
        input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png",
        key_list=["驾驶室准乘人数"],
        mllm_chat_bot_config=mllm_chat_bot_config,
    )
    mllm_predict_info = mllm_predict_res["mllm_res"]

visual_info_list = []
for res in visual_predict_res:
    visual_info_list.append(res["visual_info"])
    layout_parsing_result = res["layout_parsing_result"]

vector_info = pipeline.build_vector(
    visual_info_list, flag_save_bytes_vector=True, retriever_config=retriever_config
)
chat_result = pipeline.chat(
    key_list=["驾驶室准乘人数"],
    visual_info=visual_info_list,
    vector_info=vector_info,
    mllm_predict_info=mllm_predict_info,
    chat_bot_config=chat_bot_config,
    retriever_config=retriever_config,
)
print(chat_result)
```

</details>
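
The PP-ChatOCRv4 example hardcodes `"api_key"` as a placeholder. One way to keep real keys out of source files is to read them from the environment. The sketch below rebuilds the `chat_bot_config` dict from the example with the same field values; the environment variable name `QIANFAN_API_KEY` and the helper `chat_bot_config_from_env` are assumptions, not PaddleOCR conventions.

```python
import os
from typing import Dict


def chat_bot_config_from_env(env_var: str = "QIANFAN_API_KEY") -> Dict[str, str]:
    """Build the chat_bot config from the example, taking the key from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "module_name": "chat_bot",
        "model_name": "ernie-3.5-8k",
        "base_url": "https://qianfan.baidubce.com/v2",
        "api_type": "openai",
        "api_key": api_key,
    }
```

The returned dict can be passed as `chat_bot_config` to `pipeline.chat(...)` exactly as in the example above. The same pattern applies to `retriever_config`.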

### 5. Domestic AI Accelerators
- [Huawei Ascend](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_NPU.html)
- [KUNLUNXIN](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_XPU.html)

## ⛰️ Advanced Tutorials
- [PP-OCRv5 Tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/OCR.html)
- [PP-StructureV3 Tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-StructureV3.html)
- [PP-ChatOCRv4 Tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-ChatOCRv4.html)

## 🔄 Quick Overview of Execution Results

<div align="center">
  <p>
     <img width="100%" src="./docs/images/demo.gif" alt="PP-OCRv5 Demo">
  </p>
</div>

<div align="center">
  <p>
     <img width="100%" src="./docs/images/blue_v3.gif" alt="PP-StructureV3 Demo">
  </p>
</div>

## 👩‍👩‍👧‍👦 Community

| PaddlePaddle WeChat official account |  Join the tech discussion group |
| :---: | :---: |
| <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qrcode_for_paddlepaddle_official_account.jpg" width="150"> | <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qr_code_for_the_questionnaire.jpg" width="150"> |


## 😃 Awesome Projects Leveraging PaddleOCR
PaddleOCR wouldn’t be where it is today without its incredible community! 💗 A massive thank you to all our longtime partners, new collaborators, and everyone who’s poured their passion into PaddleOCR — whether we’ve named you or not. Your support fuels our fire!

| Project Name | Description |
| ------------ | ----------- |
| [RAGFlow](https://github.com/infiniflow/ragflow) <a href="https://github.com/infiniflow/ragflow"><img src="https://img.shields.io/github/stars/infiniflow/ragflow"></a>|RAG engine based on deep document understanding.|
| [MinerU](https://github.com/opendatalab/MinerU) <a href="https://github.com/opendatalab/MinerU"><img src="https://img.shields.io/github/stars/opendatalab/MinerU"></a>|Multi-type Document to Markdown Conversion Tool|
| [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR) <a href="https://github.com/hiroi-sora/Umi-OCR"><img src="https://img.shields.io/github/stars/hiroi-sora/Umi-OCR"></a>|Free, Open-source, Batch Offline OCR Software.|
| [OmniParser](https://github.com/microsoft/OmniParser)<a href="https://github.com/microsoft/OmniParser"><img src="https://img.shields.io/github/stars/microsoft/OmniParser"></a> |OmniParser: Screen Parsing tool for Pure Vision Based GUI Agent.|
| [QAnything](https://github.com/netease-youdao/QAnything)<a href="https://github.com/netease-youdao/QAnything"><img src="https://img.shields.io/github/stars/netease-youdao/QAnything"></a> |Question and Answer based on Anything.|
| [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) <a href="https://github.com/opendatalab/PDF-Extract-Kit"><img src="https://img.shields.io/github/stars/opendatalab/PDF-Extract-Kit"></a>|A powerful open-source toolkit designed to efficiently extract high-quality content from complex and diverse PDF documents.|
| [Dango-Translator](https://github.com/PantsuDango/Dango-Translator)<a href="https://github.com/PantsuDango/Dango-Translator"><img src="https://img.shields.io/github/stars/PantsuDango/Dango-Translator"></a> |Recognize text on the screen, translate it and show the translation results in real time.|
| [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md)|

## 👩‍👩‍👧‍👦 Contributors

<a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=PaddlePaddle/PaddleOCR&max=400&columns=20"  width="800"/>
</a>


## 🌟 Star

[![Star History Chart](https://api.star-history.com/svg?repos=PaddlePaddle/PaddleOCR&type=Date)](https://star-history.com/#PaddlePaddle/PaddleOCR&Date)


## 📄 License
This project is released under the [Apache 2.0 license](LICENSE).

## 🎓 Citation

```
@misc{paddleocr2020,
title={PaddleOCR, Awesome multilingual OCR toolkits based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleOCR}},
year={2020}
}
```
-												update readme_en,fix_documents (#10592)


											
										
										
											2023-08-09 22:36:43 +08:00
+								<div align="center">
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								  <p>
 								      <img width="100%" src="./docs/images/Banner.png" alt="PaddleOCR Banner"></a>
 								  </p>
 								<!-- language -->
-												Update README_en.md (#15247)


											
										
										
											2025-05-20 18:50:38 +08:00
+								[中文](./readme_c.md)| English
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
 								<!-- icon -->
 								[![stars](https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf)](https://github.com/PaddlePaddle/PaddleOCR)
 								[![Downloads](https://img.shields.io/pypi/dm/paddleocr)](https://pypi.org/project/PaddleOCR/)
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								![python](https://img.shields.io/badge/python-3.8～3.12-aff.svg)
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								![os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg)
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								![hardware](https://img.shields.io/badge/hardware-cpu%2C%20gpu%2C%20xpu%2C%20npu-yellow.svg)
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
 								[![Website](https://img.shields.io/badge/Website-PaddleOCR-blue?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAABmmRkdj0AAAAASUVORK5CYII=)](https://www.paddleocr.ai/)
 								[![AI Studio](https://img.shields.io/badge/PP_OCRv5-AI_Studio-green)](https://aistudio.baidu.com/community/app/91660/webUI)
 								[![AI Studio](https://img.shields.io/badge/PP_StructureV3-AI_Studio-green)](https://aistudio.baidu.com/community/app/518494/webUI)
 								[![AI Studio](https://img.shields.io/badge/PP_ChatOCRv4-AI_Studio-green)](https://aistudio.baidu.com/community/app/518493/webUI)
-												Update README_en.md (#15227)


											
										
										
											2025-05-20 17:41:15 +08:00
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								</div>
-												updata en docs (#14036)

* updata en docs

* updata en docs

* updata en docs

* updata en docs
											
										
										
											2024-10-18 17:10:40 +08:00
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								## 🚀 Introduction
 								Since its initial release, PaddleOCR has gained widespread acclaim across academia, industry, and research communities, thanks to its cutting-edge algorithms and proven performance in real-world applications. It’s already powering popular open-source projects like Umi-OCR, OmniParser, MinerU, and RAGFlow, making it the go-to OCR toolkit for developers worldwide.
-												update readme: added PaddleOCR algorithm model challenge champion solutions (#13263)


											
										
										
											2024-07-04 19:34:35 +08:00
-												Update README_en.md (#15227)


											
										
										
											2025-05-20 17:41:15 +08:00
+								On May 20, 2025, the PaddlePaddle team unveiled PaddleOCR 3.0, fully compatible with the official release of the **PaddlePaddle 3.0** framework. This update further **boosts text-recognition accuracy**, adds support for **multiple text-type recognition** and **handwriting recognition**, and meets the growing demand from large-model applications for **high-precision parsing of complex documents**. When combined with the **ERNIE 4.5T**, it significantly enhances key-information extraction accuracy. PaddleOCR 3.0 also introduces support for domestic hardware platforms such as **KUNLUNXIN** and **Ascend**.
-												docs: Update README_en (#13545)

* docs: Update README

* docs: Update English README

* docs: Update README_en
											
										
										
											2024-07-30 14:38:54 +08:00
-												Update README_en.md (#15267)


											
										
										
											2025-05-20 20:37:41 +08:00
+								Three Major New Features in PaddleOCR 3.0:
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								- Universal-Scene Text Recognition Model [PP-OCRv5](./docs/version3.x/algorithm/PP-OCRv5/PP-OCRv5.en.md): A single model that handles five different text types plus complex handwriting. Overall recognition accuracy has increased by 13 percentage points over the previous generation. [Online Demo](https://aistudio.baidu.com/community/app/91660/webUI)
-												docs: Update README_en (#13545)

* docs: Update README

* docs: Update English README

* docs: Update README_en
											
										
										
											2024-07-30 14:38:54 +08:00
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								- General Document-Parsing Solution [PP-StructureV3](./docs/version3.x/algorithm/PP-StructureV3/PP-StructureV3.en.md): Delivers high-precision parsing of multi-layout, multi-scene PDFs, outperforming many open- and closed-source solutions on public benchmarks. [Online Demo](https://aistudio.baidu.com/community/app/518494/webUI)
-												docs: Update README (#13543)

* docs: Update README

* docs: Update English README
											
										
										
											2024-07-30 13:09:43 +08:00
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								- Intelligent Document-Understanding Solution [PP-ChatOCRv4](./docs/version3.x/algorithm/PP-ChatOCRv4/PP-ChatOCRv4.en.md): Natively powered by the WenXin large model 4.5T, achieving 15 percentage points higher accuracy than its predecessor. [Online Demo](https://aistudio.baidu.com/community/app/518493/webUI)
-												update readme_en,fix_documents (#10592)


											
										
										
											2023-08-09 22:36:43 +08:00
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								In addition to providing an outstanding model library, PaddleOCR 3.0 also offers user-friendly tools covering model training, inference, and service deployment, so developers can rapidly bring AI applications to production.
-												update readme_en,fix_documents (#10592)


											
										
										
											2023-08-09 22:36:43 +08:00
+								<div align="center">
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								  <p>
 								      <img width="100%" src="./docs/images/Arch.png" alt="PaddleOCR Architecture"></a>
 								  </p>
-												update readme_en,fix_documents (#10592)


											
										
										
											2023-08-09 22:36:43 +08:00
+								</div>
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
 								## 📣 Recent updates
 								🔥🔥2025.05.20: Official Release of **PaddleOCR v3.0**, including:
 								- **PP-OCRv5**: High-Accuracy Text Recognition Model for All Scenarios - Instant Text from Images/PDFs.
 . 🌐 Single-model support for **five** text types - Seamlessly process **Simplified Chinese, Traditional Chinese, Simplified Chinese Pinyin, English** and **Japanse** within a single model.
 . ✍️ Improved **handwriting recognition**: Significantly better at complex cursive scripts and non-standard handwriting.
 . 🎯 **13-point accuracy gain** over PP-OCRv4, achieving state-of-the-art performance across a variety of real-world scenarios.
 								- **PP-StructureV3**: General-Purpose Document Parsing – Unleash SOTA Images/PDFs Parsing for Real-World Scenarios!
 . 🧮 **High-Accuracy multi-scene PDF parsing**, leading both open- and closed-source solutions on the OmniDocBench benchmark.
 . 🧠 Specialized capabilities include **seal recognition**, **chart-to-table conversion**, **table recognition with nested formulas/images**, **vertical text document parsing**, and **complex table structure analysis**.
 								- **PP-ChatOCRv4**: Intelligent Document Understanding – Extract Key Information, not just text from Images/PDFs.
-												Update README_en.md (#15227)


											
										
										
											2025-05-20 17:41:15 +08:00
+. 🔥 **15-point accuracy gain** in key-information extraction on PDF/PNG/JPG files over the previous generation.
 . 💻 Native support for **ERINE4.5 Turbo**, with compatibility for large-model deployments via PaddleNLP, Ollama, vLLM, and more.
-												Update README_en.md (#15247)


											
										
										
											2025-05-20 18:50:38 +08:00
+. 🤝 Integrated [PP-DocBee2](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/examples/ppdocbee2), enabling extraction and understanding of printed text, handwriting, seals, tables, charts, and other common elements in complex documents.
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
 								<details>
 								   <summary><strong>The history of updates </strong></summary>
 								- 🔥🔥2025.03.07: Release of **PaddleOCR v2.10**, including:
 								  - **12 new self-developed models:**
 								    - **[Layout Detection series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/layout_detection.html)**(3 models): PP-DocLayout-L, M, and S -- capable of detecting 23 common layout types across diverse document formats(papers, reports, exams, books, magazines, contracts, etc.) in English and Chinese. Achieves up to **90.4% mAP@0.5** , and lightweight features can process over 100 pages per second.
 								    - **[Formula Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/formula_recognition.html)**(2 models): PP-FormulaNet-L and S -- supports recognition of 50,000+ LaTeX expressions, handling both printed and handwritten formulas. PP-FormulaNet-L offers **6% higher accuracy** than comparable models; PP-FormulaNet-S is 16x faster while maintaining similar accuracy.
 								    - **[Table Structure Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_structure_recognition.html)**(2 models): SLANeXt_wired and SLANeXt_wireless -- newly developed models with **6% accuracy improvement** over SLANet_plus in complex table recognition.
 								    - **[Table Classification](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_classification.html)**(1 model):
 								PP-LCNet_x1_0_table_cls -- an ultra-lightweight classifier for wired and wireless tables.
 								[Learn more](https://paddlepaddle.github.io/PaddleOCR/latest/en/update.html)
 								</details>
 								## ⚡ Quick Start
-												Update README_en.md (#15247)


											
										
										
											2025-05-20 18:50:38 +08:00
+								### 1. Run online demo
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								[![AI Studio](https://img.shields.io/badge/PP_OCRv5-AI_Studio-green)](https://aistudio.baidu.com/community/app/91660/webUI)
 								[![AI Studio](https://img.shields.io/badge/PP_StructureV3-AI_Studio-green)](https://aistudio.baidu.com/community/app/518494/webUI)
 								[![AI Studio](https://img.shields.io/badge/PP_ChatOCRv4-AI_Studio-green)](https://aistudio.baidu.com/community/app/518493/webUI)
-												Update README_en.md (#15227)


											
										
										
											2025-05-20 17:41:15 +08:00
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								### 2. Installation
-												Update README_en.md (#15247)


											
										
										
											2025-05-20 18:50:38 +08:00
+								Install PaddlePaddle refer to [Installation Guide](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html), after then, install the PaddleOCR toolkit.
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
 								```bash
-												Update README_en.md (#15247)


											
										
										
											2025-05-20 18:50:38 +08:00
+								# Install paddleocr
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								pip install paddleocr==3.0.0
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
+								```
 								### 3. Run inference by CLI
 								```bash
 								# Run PP-OCRv5 inference
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png --use_doc_orientation_classify False --use_doc_unwarping False --use_textline_orientation False
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
 								# Run PP-StructureV3 inference
-												Feat/mkdocs (#15274)

* update README

* update README
											
										
										
											2025-05-20 20:33:20 +08:00
+								paddleocr pp_structurev3 -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png --use_doc_orientation_classify False --use_doc_unwarping False
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
-												Update README_en.md (#15227)


											
										
										
											2025-05-20 17:41:15 +08:00
+								# Get the Qianfan API Key at first, and then run PP-ChatOCRv4 inference
-												fix docs (#15312)

* fix docs

* update
											
										
										
											2025-05-22 12:02:21 +08:00
+								paddleocr pp_chatocrv4_doc -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png -k 驾驶室准乘人数 --qianfan_api_key your_api_key --use_doc_orientation_classify False --use_doc_unwarping False
-												update README (#15213)

* update README

* update README

* update
											
										
										
											2025-05-20 16:49:56 +08:00
 								# Get more information about "paddleocr ocr"
 								paddleocr ocr --help
 								```

### 4. Run inference by API

**4.1 PP-OCRv5 Example**

```python
from paddleocr import PaddleOCR

# Initialize PaddleOCR instance
ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False)

# Run OCR inference on a sample image
result = ocr.predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png")

# Visualize the results and save the JSON results
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
```
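
Once the JSON results are saved, they can be post-processed with the standard library alone. The sketch below is illustrative only: the helper name `collect_texts` is hypothetical, and it assumes each saved file is a JSON object with a top-level `rec_texts` list. Inspect one of your own output files and adjust the key if the layout differs.

```python
import json
from pathlib import Path


def collect_texts(output_dir):
    """Map each JSON file name to its list of recognized strings.

    Assumption: every file holds a JSON object with a "rec_texts" list.
    Files without that key map to an empty list.
    """
    texts = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        if isinstance(data, dict):
            texts[path.name] = data.get("rec_texts", [])
    return texts


# "output" is the directory passed to save_to_json in the example above.
for name, lines in collect_texts("output").items():
    print(name, lines)
```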

<details>
    <summary><strong>4.2 PP-StructureV3 Example</strong></summary>

```python
from pathlib import Path
from paddleocr import PPStructureV3

pipeline = PPStructureV3()

# For Image
output = pipeline.predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False
    )

# Visualize the results and save the JSON results
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")
```

</details>

<details>
   <summary><strong>4.3 PP-ChatOCRv4 Example</strong></summary>

```python
from paddleocr import PPChatOCRv4Doc

chat_bot_config = {
    "module_name": "chat_bot",
    "model_name": "ernie-3.5-8k",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "openai",
    "api_key": "api_key",  # your api_key
}

retriever_config = {
    "module_name": "retriever",
    "model_name": "embedding-v1",
    "base_url": "https://qianfan.baidubce.com/v2",
    "api_type": "qianfan",
    "api_key": "api_key",  # your api_key
}

pipeline = PPChatOCRv4Doc(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False
)

visual_predict_res = pipeline.visual_predict(
    input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png",
    use_common_ocr=True,
    use_seal_recognition=True,
    use_table_recognition=True,
)

mllm_predict_info = None
use_mllm = False
# To use a multimodal large model, start a local mllm service first. See
# https://github.com/PaddlePaddle/PaddleX/blob/release/3.0/docs/pipeline_usage/tutorials/vlm_pipelines/doc_understanding.md
# for deployment instructions, then update mllm_chat_bot_config accordingly.
if use_mllm:
    mllm_chat_bot_config = {
        "module_name": "chat_bot",
        "model_name": "PP-DocBee",
        "base_url": "http://127.0.0.1:8080/",  # your local mllm service url
        "api_type": "openai",
        "api_key": "api_key",  # your api_key
    }

    mllm_predict_res = pipeline.mllm_pred(
        input="https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_certificate-1.png",
        key_list=["驾驶室准乘人数"],  # "number of passengers permitted in the cab"
        mllm_chat_bot_config=mllm_chat_bot_config,
    )
    mllm_predict_info = mllm_predict_res["mllm_res"]

visual_info_list = []
for res in visual_predict_res:
    visual_info_list.append(res["visual_info"])
    layout_parsing_result = res["layout_parsing_result"]

vector_info = pipeline.build_vector(
    visual_info_list, flag_save_bytes_vector=True, retriever_config=retriever_config
)
chat_result = pipeline.chat(
    key_list=["驾驶室准乘人数"],  # "number of passengers permitted in the cab"
    visual_info=visual_info_list,
    vector_info=vector_info,
    mllm_predict_info=mllm_predict_info,
    chat_bot_config=chat_bot_config,
    retriever_config=retriever_config,
)
print(chat_result)
```

</details>
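
The PP-ChatOCRv4 example hardcodes the same API key in two dictionaries. One possible alternative is to read the key once from the environment. This is only a sketch: the helper name `make_qianfan_configs` and the environment variable `QIANFAN_API_KEY` are hypothetical choices, while the dictionary fields are copied from the example.

```python
import os


def make_qianfan_configs(api_key=None):
    """Return (chat_bot_config, retriever_config) sharing one API key.

    The key is taken from the argument or, failing that, from the
    QIANFAN_API_KEY environment variable (a name chosen for this sketch).
    """
    api_key = api_key or os.environ.get("QIANFAN_API_KEY")
    if not api_key:
        raise ValueError("Set QIANFAN_API_KEY or pass api_key explicitly")

    base_url = "https://qianfan.baidubce.com/v2"
    chat_bot_config = {
        "module_name": "chat_bot",
        "model_name": "ernie-3.5-8k",
        "base_url": base_url,
        "api_type": "openai",
        "api_key": api_key,
    }
    retriever_config = {
        "module_name": "retriever",
        "model_name": "embedding-v1",
        "base_url": base_url,
        "api_type": "qianfan",
        "api_key": api_key,
    }
    return chat_bot_config, retriever_config
```

Calling `chat_bot_config, retriever_config = make_qianfan_configs()` then yields the two dictionaries used by `build_vector` and `chat`, without the key appearing in source code.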

### 5. Domestic AI Accelerators

- [Huawei Ascend](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_NPU.html)
- [KUNLUNXIN](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_XPU.html)

## ⛰️ Advanced Tutorials

- [PP-OCRv5 Tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/OCR.html)
- [PP-StructureV3 Tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-StructureV3.html)
- [PP-ChatOCRv4 Tutorial](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/PP-ChatOCRv4.html)

## 🔄 Quick Overview of Execution Results

<div align="center">
  <p>
     <img width="100%" src="./docs/images/demo.gif" alt="PP-OCRv5 Demo">
  </p>
</div>

<div align="center">
  <p>
     <img width="100%" src="./docs/images/blue_v3.gif" alt="PP-StructureV3 Demo">
  </p>
</div>

## 👩‍👩‍👧‍👦 Community

| PaddlePaddle WeChat official account | Join the tech discussion group |
| :---: | :---: |
| <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qrcode_for_paddlepaddle_official_account.jpg" width="150"> | <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qr_code_for_the_questionnaire.jpg" width="150"> |

## 😃 Awesome Projects Leveraging PaddleOCR

PaddleOCR wouldn’t be where it is today without its incredible community! 💗 A massive thank you to all our longtime partners, new collaborators, and everyone who’s poured their passion into PaddleOCR, whether we’ve named you or not. Your support fuels our fire!

| Project Name | Description |
| ------------ | ----------- |
| [RAGFlow](https://github.com/infiniflow/ragflow) <a href="https://github.com/infiniflow/ragflow"><img src="https://img.shields.io/github/stars/infiniflow/ragflow"></a>|RAG engine based on deep document understanding.|
| [MinerU](https://github.com/opendatalab/MinerU) <a href="https://github.com/opendatalab/MinerU"><img src="https://img.shields.io/github/stars/opendatalab/MinerU"></a>|Multi-type document to Markdown conversion tool.|
| [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR) <a href="https://github.com/hiroi-sora/Umi-OCR"><img src="https://img.shields.io/github/stars/hiroi-sora/Umi-OCR"></a>|Free, open-source, batch offline OCR software.|
| [OmniParser](https://github.com/microsoft/OmniParser) <a href="https://github.com/microsoft/OmniParser"><img src="https://img.shields.io/github/stars/microsoft/OmniParser"></a>|Screen parsing tool for pure-vision-based GUI agents.|
| [QAnything](https://github.com/netease-youdao/QAnything) <a href="https://github.com/netease-youdao/QAnything"><img src="https://img.shields.io/github/stars/netease-youdao/QAnything"></a>|Question and answer based on anything.|
| [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) <a href="https://github.com/opendatalab/PDF-Extract-Kit"><img src="https://img.shields.io/github/stars/opendatalab/PDF-Extract-Kit"></a>|A powerful open-source toolkit designed to efficiently extract high-quality content from complex and diverse PDF documents.|
| [Dango-Translator](https://github.com/PantsuDango/Dango-Translator) <a href="https://github.com/PantsuDango/Dango-Translator"><img src="https://img.shields.io/github/stars/PantsuDango/Dango-Translator"></a>|Recognizes text on the screen, translates it, and shows the translation in real time.|
| [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md)|

## 👩‍👩‍👧‍👦 Contributors

<a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=PaddlePaddle/PaddleOCR&max=400&columns=20"  width="800"/>
</a>

## 🌟 Star

[![Star History Chart](https://api.star-history.com/svg?repos=PaddlePaddle/PaddleOCR&type=Date)](https://star-history.com/#PaddlePaddle/PaddleOCR&Date)

## 📄 License

This project is released under the [Apache 2.0 license](LICENSE).

## 🎓 Citation

```
@misc{paddleocr2020,
title={PaddleOCR, Awesome multilingual OCR toolkits based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleOCR}},
year={2020}
}
```