English | [简体中文](./README_cn.md) | [繁體中文](./README_tcn.md) | [日本語](./README_ja.md) | [한국어](./README_ko.md) | [Français](./README_fr.md) | [Русский](./README_ru.md) | [Español](./README_es.md) | [العربية](./README_ar.md)
# PaddleOCR

**PaddleOCR is an industry-leading, production-ready OCR and document AI engine, offering end-to-end solutions from text extraction to intelligent document understanding.**
> [!TIP]
> PaddleOCR now provides an MCP server that supports integration with Agent applications like Claude Desktop. For details, please refer to [PaddleOCR MCP Server](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/mcp_server.html).
>
> The PaddleOCR 3.0 Technical Report is now available. See details at: [PaddleOCR 3.0 Technical Report](https://arxiv.org/abs/2507.05595).

**PaddleOCR** converts documents and images into **structured, AI-friendly data** (such as JSON and Markdown) with **industry-leading accuracy**, powering AI applications for users ranging from indie developers and startups to large enterprises worldwide. With over **50,000 stars** and deep integration into leading projects such as **MinerU, RAGFlow, and OmniParser**, PaddleOCR has become the **premier solution** for developers building intelligent document applications in the **AI era**.
### PaddleOCR 3.0 Core Features
- **PP-OCRv5: Universal Scene Text Recognition**
  A **single model supports five text types** (Simplified Chinese, Traditional Chinese, English, Japanese, and Pinyin) with a **13% accuracy improvement**. It addresses the challenge of recognizing documents that mix multiple languages.
- **PP-StructureV3: Complex Document Parsing**
  Intelligently converts complex PDFs and document images into **Markdown and JSON files that preserve the original structure**. It **outperforms** numerous commercial solutions on public benchmarks and **maintains document layout and hierarchical structure**.
- **PP-ChatOCRv4: Intelligent Information Extraction**
  Natively integrates ERNIE 4.5 to **precisely extract key information** from massive documents, with a 15% accuracy improvement over the previous generation. It makes documents "**understand**" your questions and provide accurate answers.

In addition to providing an outstanding model library, PaddleOCR 3.0 also offers user-friendly tools covering model training, inference, and service deployment, so developers can rapidly bring AI applications to production.