* Have the MarkItDown MCP server read MARKITDOWN_ENABLE_PLUGINS from os.environ
* Update the Dockerfile to enable plugins. No plugins are installed by default.
* Update markdown
* Update and install Python version suggestions
* Update README with prerequisites.
---------
Co-authored-by: Lucas Liu <lucas@LucasdeMacBook-Pro.local>
Co-authored-by: afourney <adamfo@microsoft.com>
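
The environment-variable lookup described in the first item above might look roughly like the sketch below. The helper name `plugins_enabled` and the set of accepted truthy strings are assumptions for illustration, not the server's actual code.

```python
import os


def plugins_enabled(environ=None) -> bool:
    """Return True when MARKITDOWN_ENABLE_PLUGINS is set to a truthy value.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to test. The accepted spellings are an assumption.
    """
    env = os.environ if environ is None else environ
    value = env.get("MARKITDOWN_ENABLE_PLUGINS", "false")
    return value.strip().lower() in ("true", "1", "yes")
```

With this shape, a Dockerfile line such as `ENV MARKITDOWN_ENABLE_PLUGINS=True` would switch plugins on, and leaving the variable unset keeps them off.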
* refactor: remove unused imports
* fix: replace NotImplemented with NotImplementedError
* refactor: resolve E722 (do not use bare 'except')
* refactor: remove unused variable
* refactor: remove unused imports
* refactor: ignore unused imports that will be used in the future
* refactor: resolve W293 (blank line contains whitespace)
* refactor: resolve F541 (f-string is missing placeholders)
---------
Co-authored-by: afourney <adamfo@microsoft.com>
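
For context on two of the lint fixes above: `NotImplemented` is a sentinel value meant to be returned from binary-operator methods, whereas `NotImplementedError` is the exception to raise from unimplemented methods, and E722 asks for a named exception class instead of a bare `except:`. A minimal sketch, with hypothetical class and function names that are not taken from the repository:

```python
class BaseConverter:
    """Hypothetical base class used only for illustration."""

    def convert(self, data):
        # `raise NotImplemented` would fail with a TypeError because
        # NotImplemented is not an exception; this is the correct form.
        raise NotImplementedError("subclasses must implement convert()")


def safe_convert(converter, data):
    """Return the conversion result, or None if it is not implemented."""
    try:
        return converter.convert(data)
    except NotImplementedError:  # named exception instead of bare `except:`
        return None
```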
* feat: Add CSV to Markdown table converter
- Add new CsvConverter class to convert CSV files to Markdown tables
- Support text/csv and application/csv MIME types
- Preserve table structure with headers and data rows
- Handle edge cases like empty cells and mismatched columns
- Fix Azure Document Intelligence dependency handling
- Register CsvConverter in MarkItDown class
----
Thanks also to @benny123tw who submitted a very similar PR in #1171
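
The CSV-to-Markdown conversion described above can be sketched as follows. This is a simplified stand-in for illustration, not the CsvConverter class itself; the function name `csv_to_markdown` and the choice to pad short rows and truncate long ones are assumptions.

```python
import csv
import io


def csv_to_markdown(text: str) -> str:
    """Convert CSV text to a Markdown table, using the first row as the header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header = rows[0]
    width = len(header)
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join(["---"] * width) + " |",
    ]
    for row in rows[1:]:
        # Pad short rows with empty cells and drop cells beyond the header width.
        row = (row + [""] * width)[:width]
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)
```

Note that this sketch does not escape `|` characters inside cells, which a production converter would likely need to handle.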
* feat: math equation rendering in .docx files
* fix: import fix on .docx pre processing
* test: add test cases for docx equation rendering
* docs: add ThirdPartyNotices.md
* refactor: reformatted with black
* Make it easier to use AzureKeyCredentials with Azure Doc Intelligence
* Fixed mypy type error.
* Added more fine-grained options over types.
* Pass doc intel options further up the stack.
* Added an initial minimal MCP server for MarkItDown
* Added STDIO default option.
* Added a Dockerfile, and updated the README accordingly. Also added instructions for Claude Desktop
* Pin mcp version.
* Optionally preserve base64 strings in markdown (_CustomMarkdownify and pptx)
* Add parameter support to the other converters
* Fix linter errors
* Use **kwargs to pass the keep_data_uri parameter.
* Add module cli vector tests
* Fixed formatting, and adjusted tests.
Adjusts warning filters to be more contextual
Updates dependencies for magika and youtube-transcript-api
Updates the version to 0.1.0a5 in __about__.py
* Refactored tests.
* Fixed CI errors, and included misc tests.
* Omit mskanji from streaminfo test.
* Omit mskanji from no hints test.
* Log results of debugging in comments (linked to Magika issue)
* Added docs as to when to use misc tests.
* refactor(docker): remove unnecessary root user
The USER root directive isn't needed directly after FROM
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): use generic nobody nogroup default instead of uid gid
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): build app from source locally instead of installing package
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): use correct files in dockerignore
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* chore(docker): don't install recommended packages with git
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): run apt as non-interactive
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* Update Dockerfile to new package structure, and fix streaming bugs.
---------
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
Co-authored-by: afourney <adamfo@microsoft.com>
* Sort PPTX shapes to be read in top-to-bottom, left-to-right order
Referenced from 39bef65b31/pptx2md/parser.py (L249)
* Update README.md
* Fixed formatting.
* Added missing import
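
The top-to-bottom, left-to-right PPTX shape ordering mentioned above amounts to sorting by a `(top, left)` key. The sketch below uses a hypothetical `Shape` dataclass in place of real presentation shapes, and treating a missing position as 0 is an assumption rather than the converter's confirmed behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Shape:
    """Hypothetical stand-in for a slide shape with an optional position."""
    name: str
    top: Optional[int]
    left: Optional[int]


def sort_shapes(shapes: List[Shape]) -> List[Shape]:
    """Order shapes top-to-bottom, then left-to-right within the same row."""
    # Shapes without an explicit position sort as if placed at the origin.
    return sorted(shapes, key=lambda s: (s.top or 0, s.left or 0))
```

Because `sorted` is stable, shapes that share the same position keep their original relative order.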