mirror of
https://github.com/microsoft/markitdown.git
synced 2025-06-26 22:00:21 +00:00
Basic SSE MCP Server for MarkItDown (#1155)
* Added an initial minimal MCP server for MarkItDown
* Added STDIO default option.
* Added a Dockerfile, and updated the README accordingly. Also added instructions for Claude Desktop
* Pin mcp version.
parent c1f9a323ee
commit 3ca57986ef
26
packages/markitdown-mcp/Dockerfile
Normal file
@@ -0,0 +1,26 @@
FROM python:3.13-slim-bullseye

ENV DEBIAN_FRONTEND=noninteractive
ENV EXIFTOOL_PATH=/usr/bin/exiftool
ENV FFMPEG_PATH=/usr/bin/ffmpeg

# Runtime dependency
RUN apt-get update && apt-get install -y --no-install-recommends \
    ffmpeg \
    exiftool

# Cleanup
RUN rm -rf /var/lib/apt/lists/*

COPY . /app
RUN pip --no-cache-dir install /app

WORKDIR /workdir

# Default USERID and GROUPID
ARG USERID=nobody
ARG GROUPID=nogroup

USER $USERID:$GROUPID

ENTRYPOINT [ "markitdown-mcp" ]
134
packages/markitdown-mcp/README.md
Normal file
@@ -0,0 +1,134 @@
# MarkItDown-MCP

[PyPI](https://pypi.org/project/markitdown/)
[AutoGen](https://github.com/microsoft/autogen)

The `markitdown-mcp` package provides a lightweight STDIO and SSE MCP server for calling MarkItDown.

It exposes one tool: `convert_to_markdown(uri)`, where `uri` can be any `http:`, `https:`, `file:`, or `data:` URI.

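Because the tool takes a URI rather than a filesystem path, local files and in-memory content have to be expressed as `file:` or `data:` URIs before they are passed in. The following is a minimal sketch using only the Python standard library; the helper names `to_file_uri` and `to_data_uri` are hypothetical and are not part of `markitdown-mcp`.

```python
import base64
from pathlib import Path


def to_file_uri(path: str) -> str:
    """Hypothetical helper: turn a filesystem path into a file: URI."""
    # as_uri() requires an absolute path, so resolve it first.
    return Path(path).resolve().as_uri()


def to_data_uri(payload: bytes, mime_type: str = "text/plain") -> str:
    """Hypothetical helper: wrap raw bytes in a base64-encoded data: URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Either result could be passed as the `uri` argument of convert_to_markdown.
print(to_file_uri("/workdir/example.txt"))
print(to_data_uri(b"Hello, MarkItDown!"))
```

Whether a given MIME type is actually convertible depends on the MarkItDown converters installed, so treat the `mime_type` default as a placeholder.
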
## Installation

To install the package, use pip:

```bash
pip install markitdown-mcp
```

## Usage

To run the MCP server using STDIO (the default), use the following command:

```bash
markitdown-mcp
```

To run the MCP server using SSE, use the following command:

```bash
markitdown-mcp --sse --host 127.0.0.1 --port 3001
```

## Running in Docker

To run `markitdown-mcp` in Docker, build the Docker image using the provided Dockerfile:

```bash
docker build -t markitdown-mcp:latest .
```

And run it using:

```bash
docker run -it --rm markitdown-mcp:latest
```

This will be sufficient for remote URIs. To access local files, you need to mount the local directory into the container. For example, if you want to access files in `/home/user/data`, you can run:

```bash
docker run -it --rm -v /home/user/data:/workdir markitdown-mcp:latest
```

Once mounted, all files under `/home/user/data` will be accessible under `/workdir` in the container. For example, if you have a file `example.txt` in `/home/user/data`, it will be accessible in the container at `/workdir/example.txt`.

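The path mapping described above matters when building `file:` URIs: the server sees the container path, not the host path. As a rough sketch, assuming the `/home/user/data:/workdir` mount from the example, a host path can be translated as follows. The function name `container_uri` is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def container_uri(
    host_path: str,
    host_dir: str = "/home/user/data",
    mount_point: str = "/workdir",
) -> str:
    """Hypothetical helper: map a host path to the file: URI seen in the container."""
    # relative_to() raises ValueError if host_path is outside the mounted directory.
    relative = PurePosixPath(host_path).relative_to(host_dir)
    container_path = PurePosixPath(mount_point) / relative
    # Percent-encode characters such as spaces; "/" is left as-is.
    return "file://" + quote(str(container_path))


print(container_uri("/home/user/data/example.txt"))  # file:///workdir/example.txt
```

Paths outside the mounted directory raise `ValueError`, which mirrors the fact that the container cannot see them.
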
## Accessing from Claude Desktop

It is recommended to use the Docker image when running the MCP server for Claude Desktop.

Follow [these instructions](https://modelcontextprotocol.io/quickstart/user#for-claude-desktop-users) to access Claude's `claude_desktop_config.json` file.

Edit it to include the following JSON entry:

```json
{
  "mcpServers": {
    "markitdown": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "markitdown-mcp:latest"
      ]
    }
  }
}
```

If you want to mount a directory, adjust it accordingly:

```json
{
  "mcpServers": {
    "markitdown": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-v",
        "/home/user/data:/workdir",
        "markitdown-mcp:latest"
      ]
    }
  }
}
```

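The two JSON entries above differ only in the optional `-v` mount. A small sketch of generating either variant with the standard library is shown below; the function name `docker_server_entry` is hypothetical, and the result still has to be merged into `claude_desktop_config.json` by hand.

```python
import json
from typing import Optional


def docker_server_entry(
    image: str = "markitdown-mcp:latest",
    host_dir: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build the mcpServers entry shown in the README."""
    args = ["run", "--rm", "-i"]
    if host_dir is not None:
        # Mount the host directory at /workdir, the image's working directory.
        args += ["-v", f"{host_dir}:/workdir"]
    args.append(image)
    return {"mcpServers": {"markitdown": {"command": "docker", "args": args}}}


print(json.dumps(docker_server_entry(host_dir="/home/user/data"), indent=2))
```
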
## Debugging

To debug the MCP server, you can use the MCP Inspector tool.

```bash
npx @modelcontextprotocol/inspector
```

You can then connect to the inspector through the specified host and port (e.g., `http://localhost:5173/`).

If using STDIO:
* select `STDIO` as the transport type,
* input `markitdown-mcp` as the command, and
* click `Connect`

If using SSE:
* select `SSE` as the transport type,
* input `http://127.0.0.1:3001/sse` as the URL, and
* click `Connect`

Finally:
* click the `Tools` tab,
* click `List Tools`,
* click `convert_to_markdown`, and
* run the tool on any valid URI.

## Security Considerations

The server does not support authentication, and runs with the privileges of the user running it. For this reason, when running in SSE mode, it is recommended to run the server bound to `localhost` (default).

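One way to act on this recommendation is to check the bind address before starting the server in SSE mode. The sketch below uses the standard `ipaddress` module; `is_loopback_host` is a hypothetical helper, not something the package provides, and it deliberately treats any hostname other than `localhost` as non-loopback.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Hypothetical helper: True if `host` only accepts local connections."""
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and ::1; 0.0.0.0 and :: are NOT loopback.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; be conservative and treat it as exposed.
        return False


for candidate in ("127.0.0.1", "::1", "0.0.0.0"):
    print(candidate, is_loopback_host(candidate))
```
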
## Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.
69
packages/markitdown-mcp/pyproject.toml
Normal file
@@ -0,0 +1,69 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "markitdown-mcp"
dynamic = ["version"]
description = 'An MCP server for the "markitdown" library.'
readme = "README.md"
requires-python = ">=3.10"
license = "MIT"
keywords = []
authors = [
  { name = "Adam Fourney", email = "adamfo@microsoft.com" },
]
classifiers = [
  "Development Status :: 4 - Beta",
  "Programming Language :: Python",
  "Programming Language :: Python :: 3.10",
  "Programming Language :: Python :: 3.11",
  "Programming Language :: Python :: 3.12",
  "Programming Language :: Python :: 3.13",
  "Programming Language :: Python :: Implementation :: CPython",
  "Programming Language :: Python :: Implementation :: PyPy",
]
dependencies = [
  "mcp~=1.5.0",
  "markitdown[all]>=0.1.1,<0.2.0",
]

[project.urls]
Documentation = "https://github.com/microsoft/markitdown#readme"
Issues = "https://github.com/microsoft/markitdown/issues"
Source = "https://github.com/microsoft/markitdown"

[tool.hatch.version]
path = "src/markitdown_mcp/__about__.py"

[project.scripts]
markitdown-mcp = "markitdown_mcp.__main__:main"

[tool.hatch.envs.types]
extra-dependencies = [
  "mypy>=1.0.0",
]
[tool.hatch.envs.types.scripts]
check = "mypy --install-types --non-interactive {args:src/markitdown_mcp tests}"

[tool.coverage.run]
source_pkgs = ["markitdown-mcp", "tests"]
branch = true
parallel = true
omit = [
  "src/markitdown_mcp/__about__.py",
]

[tool.coverage.paths]
markitdown-mcp = ["src/markitdown_mcp", "*/markitdown-mcp/src/markitdown_mcp"]
tests = ["tests", "*/markitdown-mcp/tests"]

[tool.coverage.report]
exclude_lines = [
  "no cov",
  "if __name__ == .__main__.:",
  "if TYPE_CHECKING:",
]

[tool.hatch.build.targets.sdist]
only-include = ["src/markitdown_mcp"]
4
packages/markitdown-mcp/src/markitdown_mcp/__about__.py
Normal file
@@ -0,0 +1,4 @@
# SPDX-FileCopyrightText: 2024-present Adam Fourney <adamfo@microsoft.com>
#
# SPDX-License-Identifier: MIT
__version__ = "0.0.1a3"
9
packages/markitdown-mcp/src/markitdown_mcp/__init__.py
Normal file
@@ -0,0 +1,9 @@
# SPDX-FileCopyrightText: 2024-present Adam Fourney <adamfo@microsoft.com>
#
# SPDX-License-Identifier: MIT

from .__about__ import __version__

__all__ = [
    "__version__",
]
83
packages/markitdown-mcp/src/markitdown_mcp/__main__.py
Normal file
@@ -0,0 +1,83 @@
import sys
from typing import Any
from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from mcp.server.sse import SseServerTransport
from starlette.requests import Request
from starlette.routing import Mount, Route
from mcp.server import Server
from markitdown import MarkItDown
import uvicorn

# Initialize FastMCP server for MarkItDown (SSE)
mcp = FastMCP("markitdown")


@mcp.tool()
async def convert_to_markdown(uri: str) -> str:
    """Convert a resource described by an http:, https:, file: or data: URI to markdown"""
    return MarkItDown().convert_uri(uri).markdown


def create_starlette_app(mcp_server: Server, *, debug: bool = False) -> Starlette:
    sse = SseServerTransport("/messages/")

    async def handle_sse(request: Request) -> None:
        async with sse.connect_sse(
            request.scope,
            request.receive,
            request._send,
        ) as (read_stream, write_stream):
            await mcp_server.run(
                read_stream,
                write_stream,
                mcp_server.create_initialization_options(),
            )

    return Starlette(
        debug=debug,
        routes=[
            Route("/sse", endpoint=handle_sse),
            Mount("/messages/", app=sse.handle_post_message),
        ],
    )


# Main entry point
def main():
    import argparse

    mcp_server = mcp._mcp_server

    parser = argparse.ArgumentParser(description="Run MCP SSE-based MarkItDown server")

    parser.add_argument(
        "--sse",
        action="store_true",
        help="Run the server with SSE transport rather than STDIO (default: False)",
    )
    parser.add_argument(
        "--host", default=None, help="Host to bind to (default: 127.0.0.1)"
    )
    parser.add_argument(
        "--port", type=int, default=None, help="Port to listen on (default: 3001)"
    )
    args = parser.parse_args()

    if not args.sse and (args.host or args.port):
        parser.error("Host and port arguments are only valid when using SSE transport.")
        sys.exit(1)

    if args.sse:
        starlette_app = create_starlette_app(mcp_server, debug=True)
        uvicorn.run(
            starlette_app,
            host=args.host if args.host else "127.0.0.1",
            port=args.port if args.port else 3001,
        )
    else:
        mcp.run()


if __name__ == "__main__":
    main()
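The argument handling in `main()` can be exercised without starting a server. The standalone sketch below reproduces only the flag parsing and defaults; `build_parser` and `resolve_transport` are hypothetical names, and the check uses `is not None` instead of truthiness, which is a variation on the commit's code rather than a copy of it (with truthiness, `--port 0` would be treated as absent). Note that `argparse`'s `parser.error()` exits with status 2 on its own, so no explicit `sys.exit` follows it here.

```python
import argparse
from typing import List, Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: the same three flags that main() defines."""
    parser = argparse.ArgumentParser(description="Sketch of the markitdown-mcp flags")
    parser.add_argument("--sse", action="store_true")
    parser.add_argument("--host", default=None)
    parser.add_argument("--port", type=int, default=None)
    return parser


def resolve_transport(argv: List[str]) -> Tuple[str, Optional[str], Optional[int]]:
    """Return (transport, host, port) with the defaults used in main()."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.sse and (args.host is not None or args.port is not None):
        # parser.error() prints usage to stderr and raises SystemExit(2).
        parser.error("Host and port arguments are only valid when using SSE transport.")
    if not args.sse:
        return ("stdio", None, None)
    host = args.host if args.host is not None else "127.0.0.1"
    port = args.port if args.port is not None else 3001
    return ("sse", host, port)


print(resolve_transport([]))
print(resolve_transport(["--sse", "--port", "8080"]))
```
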
0
packages/markitdown-mcp/src/markitdown_mcp/py.typed
Normal file
3
packages/markitdown-mcp/tests/__init__.py
Normal file
@@ -0,0 +1,3 @@
# SPDX-FileCopyrightText: 2024-present Adam Fourney <adamfo@microsoft.com>
#
# SPDX-License-Identifier: MIT