This project provides a lightweight [Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) server designed to integrate the powerful capabilities of PaddleOCR into a compatible MCP Host.
### Key Features
- **Currently Supported Pipelines**
- **OCR**: Performs text detection and recognition on images and PDF files.
- **PP-StructureV3**: Recognizes and extracts text blocks, titles, paragraphs, images, tables, and other layout elements from an image or PDF file, converting the input into a Markdown document.
- **Supported Working Modes**:
- **Local**: Runs the PaddleOCR pipeline directly on your machine using the installed Python library.
- **AI Studio**: Calls cloud services provided by the Paddle AI Studio community.
- **Self-hosted**: Calls a PaddleOCR service that you deploy yourself (serving).
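For a first look, the working mode is selected with the `--ppocr_source` flag; all options are documented in the configuration section below, and the URL here is a placeholder:

```bash
# Run the OCR pipeline with the local Python library
paddleocr_mcp --pipeline OCR --ppocr_source local

# Run the same pipeline against a self-hosted PaddleOCR service
# (http://127.0.0.1:8080 is a placeholder for your own deployment)
paddleocr_mcp --pipeline OCR --ppocr_source self_hosted --server_url http://127.0.0.1:8080
```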
You can check whether the installation was successful by running the following command:
```bash
paddleocr_mcp --help
```
If running the above command prints the help information, the installation was successful. Note that this project depends on the `python-magic` library. If you see an error like the following when running the command:
```
...
ImportError: failed to find libmagic. Check your installation
```
it is likely due to missing underlying libraries required by `python-magic`. Please refer to the [official python-magic documentation](https://github.com/ahupp/python-magic?tab=readme-ov-file#installation) to install the necessary dependencies.
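For reference, the typical fixes listed in the python-magic installation notes look like the following; check the linked documentation for the commands matching your platform:

```bash
# Debian/Ubuntu
sudo apt-get install libmagic1

# macOS (Homebrew)
brew install libmagic

# Windows (installs a bundled libmagic)
pip install python-magic-bin
```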
In addition, some [working modes](#32-working-modes-explained) may require extra dependencies. For more details, please refer to the following sections.
This section guides you through a quick setup using **Claude for Desktop** as the MCP Host and the **Local Python Library** mode. Please refer to [3. Configuration](#3-configuration) for other working modes and more configuration options.
- Refer to the [PaddleOCR Installation Guide](../installation.en.md) to install the PaddlePaddle framework and PaddleOCR. **It is strongly recommended to install them in a separate virtual environment** to avoid dependency conflicts.
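As a rough sketch, a CPU-only setup in a fresh virtual environment might look like this; consult the linked installation guide for the command matching your hardware and CUDA version:

```bash
# Create and activate an isolated virtual environment
python -m venv .venv
source .venv/bin/activate

# Install the PaddlePaddle framework (CPU build shown) and PaddleOCR
python -m pip install paddlepaddle paddleocr
```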
Open the `claude_desktop_config.json` file and add the configuration by referring to [5.2 Local Python Library Configuration](#52-local-python-library-configuration).
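For orientation, a minimal sketch of such an entry is shown below; the server name `paddleocr-ocr` is an example, and the file location varies by OS (on macOS it is typically `~/Library/Application Support/Claude/claude_desktop_config.json`). See section 5.2 for the authoritative configuration.

```json
{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "local"
      }
    }
  }
}
```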
1. Visit the [Paddle AI Studio community](https://aistudio.baidu.com/pipeline/mine) and log in. **Please note that AI Studio currently requires users to bind a mainland China phone number.** If you do not meet this requirement, please consider using an alternative working mode.
2. In the "PaddleX Pipeline" section under "More" on the left, navigate to [Create Pipeline] - [OCR] - [General OCR] - [Deploy Directly] - [Text Recognition Module, select PP-OCRv5_server_rec] - [Start Deployment].
3. Once deployed, obtain your **Service Base URL** (e.g., `https://xxxxxx.aistudio-hub.baidu.com`).
4. Get your **Access Token** from [this page](https://aistudio.baidu.com/index/accessToken).
- In addition to using the platform's preset model solutions, you can also train and deploy custom models on the platform.
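With the URL and token in hand, a hedged sketch of running the server against AI Studio via environment variables (both values are placeholders; in practice the MCP Host usually launches the server with these variables set):

```bash
export PADDLEOCR_MCP_PPOCR_SOURCE=aistudio
export PADDLEOCR_MCP_SERVER_URL=https://xxxxxx.aistudio-hub.baidu.com  # your Service Base URL
export PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN="<your-access-token>"       # your Access Token
paddleocr_mcp
```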
#### Mode 2: Local Python Library (`local`)
This mode runs the pipeline directly on your local machine. It relies on the installed `paddleocr` inference package and places certain demands on your environment and hardware.
| Environment Variable | CLI Argument | Type | Description | Options | Default |
| --- | --- | --- | --- | --- | --- |
| `PADDLEOCR_MCP_PIPELINE` | `--pipeline` | `str` | The pipeline to run | `"OCR"`, `"PP-StructureV3"` | `"OCR"` |
| `PADDLEOCR_MCP_PPOCR_SOURCE` | `--ppocr_source` | `str` | The source of PaddleOCR capabilities | `"local"`, `"aistudio"`, `"self_hosted"` | `"local"` |
| `PADDLEOCR_MCP_SERVER_URL` | `--server_url` | `str` | Base URL of the underlying service (required for `aistudio` or `self_hosted` mode) | - | `None` |
| `PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN` | `--aistudio_access_token` | `str` | AI Studio authentication token (required for `aistudio` mode) | - | `None` |
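The notes below refer to a host-side configuration along the lines of the following sketch for the `aistudio` mode (the server name `paddleocr-ocr` is an example):

```json
{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "aistudio",
        "PADDLEOCR_MCP_SERVER_URL": "<your-server-url>",
        "PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN": "<your-access-token>"
      }
    }
  }
}
```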
- Replace `<your-server-url>` with your AI Studio **Service Base URL**, e.g., `https://xxxxx.aistudio-hub.baidu.com`. Do not include endpoint paths (like `/ocr`).
- Replace `<your-access-token>` with your **Access Token**.
- `PADDLEOCR_MCP_PIPELINE_CONFIG` is optional. If not set, the default pipeline configuration is used. To adjust settings, such as changing models, refer to the [PaddleOCR and PaddleX documentation](../paddleocr_and_paddlex.en.md), export a pipeline configuration file, and set `PADDLEOCR_MCP_PIPELINE_CONFIG` to its absolute path (see the sketch after this list).
- **OCR Pipeline**: If you are running in a CPU environment, it is recommended to switch to the `mobile` series models for better performance. You can change the detection and recognition models in your pipeline configuration file to `text_detection_model_name="PP-OCRv5_mobile_det"` and `text_recognition_model_name="PP-OCRv5_mobile_rec"` respectively.
- **PP-StructureV3 Pipeline**: Due to its model complexity, using this pipeline in an environment without a GPU is not recommended.
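As a sketch, pointing the server at an exported configuration file looks like this (`/path/to/OCR.yaml` is a hypothetical path; substitute the absolute path of your exported file):

```bash
# Use a custom pipeline configuration instead of the defaults
export PADDLEOCR_MCP_PIPELINE_CONFIG=/path/to/OCR.yaml
paddleocr_mcp --pipeline OCR --ppocr_source local
```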
1. In `local` mode, the currently provided tools cannot handle Base64-encoded PDF document input.
2. In `local` mode, the currently provided tools infer the file type themselves rather than relying on the model-specified `file_type`, and they may fail to process some complex URLs.
3. For the PP-StructureV3 pipeline, if the input file contains images, the returned result may significantly increase token usage. If image content is not needed, you can explicitly request its exclusion in your prompt to reduce resource consumption.