mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-08-05 07:16:26 +00:00

**Summary**
`partition_msg()` previously used the `msg_parser` library to parse Outlook MSG email files (`.msg`). That library is unmaintained and has several major shortcomings, such as failing to parse MSG files with 8-bit encoded strings and not reliably extracting attachments. Use the new, permissively licensed `python-oxmsg` library instead.

**Additional Context**
For reviewability, this PR temporarily places the new `partition_msg()` implementation in `new_msg.py` and references that implementation from `msg.py`. `new_msg.py` will be renamed to `msg.py` in a closely following PR. This avoids a very messy interleaving of hunks in the diff between the old and rewritten `partition_msg()` implementations.

Fixes #2481
Fixes #3006
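As a rough illustration of how a caller might use `partition_msg()` once the `python-oxmsg` dependency is installed, here is a minimal sketch. The wrapper name `partition_msg_file` and the path `message.msg` are hypothetical and not part of this change. The sketch assumes that `partition_msg` is importable from `unstructured.partition.msg` and accepts a `filename` keyword argument.

```python
from pathlib import Path


def partition_msg_file(path):
    """Hypothetical helper: partition a .msg file if it exists and the library is available.

    Returns a list of document elements, or None when the file is missing
    or `unstructured` (with its MSG dependencies) cannot be imported.
    """
    if not Path(path).is_file():
        return None
    try:
        # Assumed import path; requires `unstructured` plus `python-oxmsg`.
        from unstructured.partition.msg import partition_msg
    except ImportError:
        return None
    return partition_msg(filename=str(path))


if __name__ == "__main__":
    elements = partition_msg_file("message.msg")  # placeholder path
    if elements is None:
        print("Nothing partitioned (file missing or library not installed).")
    else:
        for element in elements:
            # Each element carries its extracted text and a category label.
            print(element.category, "|", element.text)
```

The import is deferred into the function so that the helper degrades gracefully in environments where the optional MSG dependencies are not installed.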
5 lines
52 B
Plaintext
-c ./deps/constraints.txt
-c base.txt

python-oxmsg