mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-09-16 03:48:33 +00:00

**Summary** The `content_type` argument received by `partition()` from the API is sometimes unreliable for MS-Office 2007+ MIME-types. In the cases we've observed, it correctly identifies the file as MS-Office but fails to distinguish PPTX from DOCX or XLSX. Confirming these types is simple, fast, and reliable, so confirm every MS-Office `content_type` value asserted by callers of `detect_filetype()` and correct any swapped values.
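
As a rough illustration of why confirmation is cheap, here is a minimal sketch rather than the code in this change. It relies on the fact that MS-Office 2007+ files are Zip archives, and assumes the conventional main-part names that Office writers produce (`word/document.xml`, `ppt/presentation.xml`, `xl/workbook.xml`). The function name `confirm_ooxml_content_type` and its signature are hypothetical.

```python
import io
import zipfile

DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
PPTX = "application/vnd.openxmlformats-officedocument.presentationml.presentation"
XLSX = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"

# Conventional main-part name for each format (an assumption of this sketch;
# a strict reader would resolve the main part via the package relationships).
_MAIN_PART_TO_MIME = {
    "word/document.xml": DOCX,
    "ppt/presentation.xml": PPTX,
    "xl/workbook.xml": XLSX,
}


def confirm_ooxml_content_type(data: bytes, asserted: str) -> str:
    """Return the MIME-type the archive contents support, else `asserted`.

    Hypothetical helper: only MS-Office 2007+ assertions are checked, and the
    asserted value is kept whenever the file cannot be confirmed either way.
    """
    if asserted not in _MAIN_PART_TO_MIME.values():
        return asserted  # not an MS-Office 2007+ assertion; nothing to confirm
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return asserted  # not a Zip archive; leave the decision to the caller
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
    for part, mime in _MAIN_PART_TO_MIME.items():
        if part in names:
            return mime  # corrects a swapped value, or confirms a right one
    return asserted
```

Only the archive's member list is read, not the document body, which is what keeps a check like this fast.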