update demo into mcp readme (#15954)

Yiiii0 2025-07-03 09:38:19 -04:00 committed by GitHub
parent 38320f475c
commit 24b098cdc2
2 changed files with 77 additions and 2 deletions

@@ -5,7 +5,7 @@
This project provides a lightweight [Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) server designed to integrate PaddleOCR capabilities into various LLM applications.
## Key Features
### Key Features
- **Currently Supported Tools**
- **OCR**: Performs text detection and recognition on images and PDF files.
@@ -15,6 +15,43 @@ This project provides a lightweight [Model Context Protocol (MCP)](https://model
- **AI Studio Community Service**: Invokes services hosted on the [PaddlePaddle AI Studio Community](https://aistudio.baidu.com/pipeline/mine). This is suitable for quick testing, prototyping, or no-code scenarios.
- **Self-hosted Service**: Invokes the user's self-hosted PaddleOCR services. This mode offers the advantages of service-based deployment and high flexibility. It is suitable for scenarios that require customized service configurations, as well as those with strict data privacy requirements. **Currently, only the basic serving solution is supported.**
## Examples:
The following demos showcase creative use cases that combine the PaddleOCR MCP server with other tools:
#### Demo 1: In Claude for Desktop, extract handwritten content from images and save it to the note-taking app Notion. The PaddleOCR MCP server extracts text, formulas, and other information from images while preserving the document structure.
<div align="center">
<img width="65%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/note_to_notion.gif" alt="note_to_notion">
<img width="30%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/note.jpg" alt="note">
</div>
- Note: In addition to the PaddleOCR MCP server, this demo also uses the [Notion MCP server](https://developers.notion.com/docs/mcp).
------------
#### Demo 2: In VSCode, convert handwritten ideas or pseudocode with one click into runnable Python scripts that comply with the project's coding standards, and upload them to a GitHub repository. The PaddleOCR MCP server extracts handwritten code from images with high accuracy for subsequent processing.
<div align="center">
<img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/code_to_github.gif" alt="code_to_github">
</div>
- In addition to the PaddleOCR MCP server, this demo also uses the [filesystem MCP server](https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem).
------------
#### Demo 3: In Claude for Desktop, convert PDF documents or images containing complex tables, formulas, handwritten text, and other content into locally editable files.
- Convert complex PDF documents with tables and watermarks to editable doc/Word format:
<div align="center">
<img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/pdf_to_file.gif" alt="pdf_to_file">
</div>
- Convert images containing formulas and tables to editable csv/Excel format:
<div align="center">
<img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/00136903a4d0b5f11bd978cb0ef5d3c44f3aa5e9/images/paddleocr/mcp_demo/table_to_excel1.png" alt="table_to_excel1">
<img width="50%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/00136903a4d0b5f11bd978cb0ef5d3c44f3aa5e9/images/paddleocr/mcp_demo/table_to_excel2.png" alt="table_to_excel2">
<img width="45%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/00136903a4d0b5f11bd978cb0ef5d3c44f3aa5e9/images/paddleocr/mcp_demo/table_to_excel3.png" alt="table_to_excel3">
</div>
- In addition to the PaddleOCR MCP server, this demo also uses the [filesystem MCP server](https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem).
------------
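As a rough illustration of the table-to-CSV step in Demo 3, the sketch below converts a pipe-delimited Markdown table into CSV text. It assumes the extracted table arrives as a simple Markdown table. That is an assumption made for this example, not a statement about the format the PaddleOCR MCP server returns. The function name `markdown_table_to_csv` and the sample data are hypothetical.

```python
import csv
import io


def markdown_table_to_csv(markdown: str) -> str:
    """Convert a simple pipe-delimited Markdown table into CSV text.

    Hypothetical helper for illustration only. Escaped pipes (\\|) inside
    cells are not handled.
    """
    rows = []
    for line in markdown.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row, e.g. | --- | :---: |
        if cells and all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    buffer = io.StringIO()
    # The csv module quotes cells that contain commas or quotes.
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for a table extracted from an image.
sample = """
| Item | Qty | Price |
| --- | --- | --- |
| Pen | 2 | 1.50 |
| Notebook, A5 | 1 | 3.20 |
"""
csv_text = markdown_table_to_csv(sample)
print(csv_text)
```

In the demo itself, writing the resulting file to disk is handled by the filesystem MCP server rather than by a script like this one.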
### Table of Contents
- [Table of Contents](#table-of-contents)

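@@ -5,7 +5,7 @@
This project provides a lightweight [Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) server designed to integrate PaddleOCR capabilities into various large-model applications.
The main features are as follows:
### The main features are as follows:
- **Currently Supported Tools**
- **OCR**: Performs text detection and recognition on images and PDF files.
@@ -15,6 +15,44 @@
- **AI Studio Community Service**: Invokes services hosted on the [PaddlePaddle AI Studio Community](https://aistudio.baidu.com/pipeline/mine). This mode is suitable for quickly trying out features and validating solutions, as well as for no-code development scenarios.
- **Self-hosted Service**: Invokes the user's self-hosted PaddleOCR services. This mode offers the advantages of service-based deployment and high flexibility. It is suitable for scenarios that require customized service configurations, as well as those with strict data privacy requirements. **Currently, only the basic service-based deployment solution is supported.**
## Examples:
The following showcases creative use cases built by combining the PaddleOCR MCP server with other tools:
#### Demo 1: In Claude for Desktop, extract handwritten content from images and save it to the note-taking app Notion. The PaddleOCR MCP server extracts text, formulas, and other information from the images while preserving the document structure.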
<div align="center">
<img width="65%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/note_to_notion.gif" alt="note_to_notion">
<img width="30%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/note.jpg" alt="note">
</div>
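- Note: In addition to the PaddleOCR MCP server, this demo also uses the [Notion MCP server](https://developers.notion.com/docs/mcp).
------------
#### Demo 2: In VSCode, convert handwritten ideas or pseudocode with one click into runnable Python scripts that follow the project's code style conventions, and upload them to a GitHub repository. The PaddleOCR MCP server extracts handwritten code from images with high accuracy for use in subsequent steps.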
<div align="center">
<img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/code_to_github.gif" alt="code_to_github">
</div>
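- In addition to the PaddleOCR MCP server, this demo also uses the [filesystem MCP server](https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem).
------------
#### Demo 3: In Claude for Desktop, convert PDF documents or images containing complex tables, formulas, handwritten text, and other content into locally editable files.
- Convert a complex PDF document containing tables and watermarks to an editable doc/Word format: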
<div align="center">
<img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/mcp_demo/pdf_to_file.gif" alt="pdf_to_file">
</div>
- Convert an image containing formulas and tables to an editable csv/Excel format:
<div align="center">
<img width="70%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/00136903a4d0b5f11bd978cb0ef5d3c44f3aa5e9/images/paddleocr/mcp_demo/table_to_excel1.png" alt="table_to_excel1">
<img width="50%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/00136903a4d0b5f11bd978cb0ef5d3c44f3aa5e9/images/paddleocr/mcp_demo/table_to_excel2.png" alt="table_to_excel2">
<img width="45%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/00136903a4d0b5f11bd978cb0ef5d3c44f3aa5e9/images/paddleocr/mcp_demo/table_to_excel3.png" alt="table_to_excel3">
</div>
- In addition to the PaddleOCR MCP server, this demo also uses the [filesystem MCP server](https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem).
------------
### Table of Contents
- [Table of Contents](#table-of-contents)