Commit Graph

  • da7bcea527
    docs: rephrase sentence (#1278) main onefloid 2025-06-04 06:09:25 +02:00
  • 3bfb821c09
    Have the MarkItDown MCP server read MARKITDOWN_ENABLE_PLUGINS from ENV (#1273) afourney 2025-06-03 09:35:33 -07:00
  • 62b72284fe
    pin onnxruntime on Windows (#1274) Tomasz Kalinowski 2025-05-28 16:13:51 -04:00
  • 1dd3c83339
    Promoting 0.1.2a1 to 0.1.2 (#1272) v0.1.2 afourney 2025-05-28 10:04:42 -07:00
  • 9dc982a3b1
    Small changes to favor streamable HTTP over deprecated SSE (#1264) afourney 2025-05-23 11:39:41 -07:00
  • effde4767b
    Preparing a pre-release of 0.1.2 (#1260) v0.1.2a1 afourney 2025-05-21 15:24:56 -07:00
  • 04bf831209
    docs: fix typos (#1201) rtpacks 2025-05-22 06:12:22 +08:00
  • 9fd680c366
    support streamable http mcp (#1245) Betula-L 2025-05-22 05:34:50 +08:00
  • 38261fd31c
    Update Python version requirement and add .cursorrules to .gitignore (#1249) 一I 2025-05-22 01:47:29 +08:00
  • 131f0c7739
    feat: add Document Intelligence API version selection via kwargs (#1253) Yi-Cheng Wang 2025-05-22 01:22:08 +08:00
  • 56f7579ce2
    FIX YouTube transcript errors (#1241) JoshClark-git 2025-05-21 10:17:57 -07:00
  • cb421cf9ea
    Chore: Make linter happy (#1256) t3tra 2025-05-22 02:02:16 +09:00
  • 39e7252940
    fix: python.lang.security.use-defused-xml-parse.use-defused-xml-parse-packages-markitdown-src-markitdown-converter_utils-docx-math-omml.py (#1251) kira-offgrid 2025-05-21 22:27:21 +05:30
  • bbcf876b18
    Switched from the stdlib minidom parser to defusedxml. (#1259) afourney 2025-05-21 09:47:14 -07:00
  • 041be54471
    Update README.md (#1187) createcentury 2025-04-14 01:31:40 +09:00
  • ebe2684b3d
    chore: fix typo in README.md (#1175) lentil32 2025-04-14 01:29:16 +09:00
  • 8576f1d915
    Add CSV to Markdown table conversion - fixes #1144 (#1176) Turdıbek 2025-04-13 21:19:00 +05:00
  • 3fcd48cdfc
    feat: render math equations in .docx documents (#1160) Sathindu 2025-03-28 18:36:38 -04:00
  • 9e067c42b6
    Make it easier to use AzureKeyCredentials with Azure Doc Intelligence (#1151) afourney 2025-03-26 10:44:11 -07:00
  • 9a951055f0
    Update readme to point to the mcp package. (#1158) afourney 2025-03-25 15:00:04 -07:00
  • 73b9d57312
    Update badges (#1157) afourney 2025-03-25 14:52:24 -07:00
  • 3ca57986ef
    Basic SSE MCP Server for MarkItDown (#1155) afourney 2025-03-25 14:38:22 -07:00
  • c1f9a323ee
    Bump version. (#1154) v0.1.1 afourney 2025-03-24 23:26:30 -07:00
  • e928b43afb
    convert_url renamed to convert_uri, and now handles data and file URIs (#1153) afourney 2025-03-24 21:43:04 -07:00
  • 2ffe6ea591
    Bump version. (#1150) v0.1.0 afourney 2025-03-22 11:21:32 -07:00
  • efc55b260d
    Bump version and resolve a console encoding error. (#1149) v0.1.0a6 afourney 2025-03-21 09:27:25 -07:00
  • 52432bd228
    Add support for preserving base64 encoded images (#1140) Yuzhong Zhang 2025-03-21 09:50:23 +08:00
  • c0a511ecff
    Updated docx file to include an image. (#1146) afourney 2025-03-20 12:25:56 -07:00
  • cd6aa41361
    Adjust warning filters and update dependencies (#1143) v0.1.0a5 afourney 2025-03-19 22:09:14 -07:00
  • 716f74dcb9
    Consider anything with a charset as plain text-convertible. (#1142) afourney 2025-03-19 20:46:35 -07:00
  • a93e0567e6
    EPub Support. Adapted #123 to not use epublib. (#1131) v0.1.0a4 afourney 2025-03-17 07:48:15 -07:00
  • c5f70b904f
    Have magika read from the stream. (#1136) afourney 2025-03-17 07:39:19 -07:00
  • 53834fdd24
    Investigate and silence warnings. (#1133) afourney 2025-03-15 23:41:35 -07:00
  • 5c565b7d79
    Fix remaining mypy errors. (#1132) afourney 2025-03-15 23:12:48 -07:00
  • a78857bd43
    Added epub test file. (#1130) afourney 2025-03-15 18:34:51 -07:00
  • 09df7fe8df
    Small fixes for autogen integration. (#1124) afourney 2025-03-12 19:18:11 -07:00
  • a62d8edb13
    Small fixes for autogen integration. v0.1.0a3 Adam Fourney 2025-03-12 19:14:35 -07:00
  • 6a9f09b153
    Updated Magika dependency. Adam Fourney 2025-03-12 16:15:33 -07:00
  • 0b815fb916
    Bumping version to 0.1.0a2 (#1123) afourney 2025-03-12 11:44:19 -07:00
  • de2c56ffbc
    Bumping version to 0.1.0a2 v0.1.0a2 Adam Fourney 2025-03-12 11:42:00 -07:00
  • 12620f1545
    Handle not supported plot type in pptx (#1122) Emanuele Meazzo 2025-03-12 19:26:23 +01:00
  • 5f75e16d20
    Refactored tests. (#1120) afourney 2025-03-12 11:08:06 -07:00
  • 75140a90e2
    fix: correct f-string formatting in FileConversionException (#1121) yushihang 2025-03-13 01:15:09 +08:00
  • af1be36e0c
    Added CLI options for extension, mimetypes, and charset. (#1115) afourney 2025-03-11 13:16:33 -07:00
  • 2a2ccc86aa
    Added mimetypes to _rss_converter Adam Fourney 2025-03-10 16:17:41 -07:00
  • 2e51ba22e7
    Enhance type guessing. Adam Fourney 2025-03-10 16:05:41 -07:00
  • 8f8e58c9bb
    Minimize guesses when guesses are compatible. (#1114) afourney 2025-03-10 15:30:44 -07:00
  • 8e73a325c6
    Switch from puremagic to magika. (#1108) afourney 2025-03-10 12:49:52 -07:00
  • 2405f201af
    fix typo in well-known path list (#1109) Mohit Agarwal 2025-03-09 09:02:44 +05:30
  • f17bc21c9d
    If files use zip packaging, be smarter about inspecting their types. zip_formats Adam Fourney 2025-03-07 23:06:56 -08:00
  • 99d8e562db
    Fix exiftool in well-known paths. (#1106) afourney 2025-03-07 21:47:20 -08:00
  • 515fa854bf
    feat(docker): improve dockerfile build (#220) Sebastian Yaghoubi 2025-03-07 20:07:40 -08:00
  • e58bc486ee
    Added missing comma. v0.0.2 v0.0.X Adam Fourney 2025-03-07 16:18:47 -08:00
  • 81ef601c09
    Removed deprecation and other warnings. (#1105) afourney 2025-03-07 16:17:03 -08:00
  • 518b12c1fb
    Addresses #1068 (#1101) afourney 2025-03-07 15:46:30 -08:00
  • 0229ff6cb7
    feat: sort pptx shapes to be parsed in top-to-bottom, left-to-right order (#1104) Richard Ye 2025-03-07 18:45:14 -05:00
  • da73d64bfa
    Initial work to port #55 to MarkItDown 0.1.X onenote Adam Fourney 2025-03-06 13:17:58 -08:00
  • 82d84e3edd
    Fixed formatting. (#1098) afourney 2025-03-05 23:30:29 -08:00
  • 36c4bc9ec3
    Fixed deepcopy failure when passing llm_client (#1089) scalabreseGD 2025-03-06 08:25:37 +01:00
  • 80baa5db18
    fix(README): correct pip install command formatting (#1090) Andrea Pietrobon 2025-03-06 08:21:10 +01:00
  • 00a65e8f8b
    Fixed version in README. Adam Fourney 2025-03-05 23:10:21 -08:00
  • 6bedf6d950
    Fixed version. (#1097) v0.1.0a1 afourney 2025-03-05 22:52:52 -08:00
  • 9380112892
    Fixed loading of plugins. (#1096) afourney 2025-03-05 22:24:08 -08:00
  • 784c293579
    Bump plugin version. Adam Fourney 2025-03-05 21:55:20 -08:00
  • 8eaf5a1da9
    Clean up README.md v0.0.1 Adam Fourney 2025-03-05 21:35:08 -08:00
  • 38c924793c
    Bump version (#1095) afourney 2025-03-05 21:30:56 -08:00
  • 70e9f8c3c0
    Bump version. (#1094) afourney 2025-03-05 21:26:06 -08:00
  • e921497f79
    Update converter API, user streams rather than file paths (#1088) afourney 2025-03-05 21:16:55 -08:00
  • 1d2f231146
    Fixed property name (#1085) afourney 2025-03-03 09:45:36 -08:00
  • c5cd659f63
    Exploring ways to allow Optional dependencies (#1079) afourney 2025-03-03 09:06:19 -08:00
  • f01c6c5277
    Exceptions should subclass Exception not BaseException. (#1082) afourney 2025-02-28 16:28:35 -08:00
  • 43bd79adc9
    Print and log better exceptions when file conversions fail. (#1080) afourney 2025-02-28 16:07:47 -08:00
  • 9182923375
    Don't have ZipConverter accept OOXML files. This will never yield a good result. (#1078) afourney 2025-02-28 09:54:19 -08:00
  • 9a19fdd134
    Make sure extensions are unique in MarkItDown's convert methods. (#1076) afourney 2025-02-28 07:43:03 -08:00
  • b9526d5e47
    Bump version. (#1075) afourney 2025-02-28 07:30:46 -08:00
  • 326d17b802
    Bump version. v0.0.1a5 Adam Fourney 2025-02-28 07:29:12 -08:00
  • 519fe172aa
    Unable to convert HTML to Markdown (#1072) Hieu Lam 2025-02-28 15:57:41 +07:00
  • e82e0c1372
    Add Support For PPTX Shape Groups (Fix in code design to not miss out on slide content) (#331) Matthew Powers 2025-02-28 02:21:51 -05:00
  • a394cc7c27
    fix: Implement retry logic for YouTube transcript fetching and fix URL decoding issue (#1035) Nima Akbarzadeh 2025-02-28 08:17:54 +01:00
  • a87fbf01ee
    add necessary imports (#861) tanreinama 2025-02-28 16:16:09 +09:00
  • d0ed74fdf4
    Fix UnboundLocalError in MarkItDown._convert (#1038) André Menezes 2025-02-28 07:11:27 +00:00
  • e4b419ba40
    Pin Markdownify version. (#1069) afourney 2025-02-27 23:09:33 -08:00
  • 4e0a10ecf3
    ran unit tests locally kennyzhang/add-file-object-support Kenny Zhang 2025-02-27 16:44:50 -05:00
  • 950b135da6
    formatting Kenny Zhang 2025-02-27 15:08:10 -05:00
  • b671345bb9
    updated readme Kenny Zhang 2025-02-27 15:07:46 -05:00
  • d9a92f7f06
    added file obj unit tests for rss and json Kenny Zhang 2025-02-27 15:05:29 -05:00
  • db0c8acbaf
    added file obj support to rss and plain text converters Kenny Zhang 2025-02-27 14:55:49 -05:00
  • 08330c2ac3
    added core unit tests for file obj support Kenny Zhang 2025-02-27 11:27:05 -05:00
  • 4afc1fe886
    added non-binary example to README Kenny Zhang 2025-02-21 13:31:37 -05:00
  • b0044720da
    updated docs Kenny Zhang 2025-02-20 16:56:47 -05:00
  • 07a28d4f00
    black formatting Kenny Zhang 2025-02-20 16:49:37 -05:00
  • b8b3897952
    modify ext guesser Kenny Zhang 2025-02-20 16:47:37 -05:00
  • 395ce2d301
    close file object after using Kenny Zhang 2025-02-20 13:54:51 -05:00
  • 808401a331
    added conversion path for file object in central class Kenny Zhang 2025-02-19 17:02:51 -05:00
  • e75f3f6f5b
    local path inputs to MarkitDown class adhere to new converterinput structure Kenny Zhang 2025-02-19 15:16:45 -05:00
  • 8e950325d2
    refactored remaining converters Kenny Zhang 2025-02-19 14:01:43 -05:00
  • 096fef3d5f
    refactored more converters to support input class Kenny Zhang 2025-02-19 13:34:28 -05:00
  • 52cbff061a
    begin refactoring converter classes Kenny Zhang 2025-02-19 11:48:00 -05:00
  • 0027e6d425
    added wrapper class for converter file input Kenny Zhang 2025-02-18 12:44:18 -05:00
  • 63a7bafadd
    removed redundant priority setting Kenny Zhang 2025-02-18 12:18:49 -05:00