mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-10-16 18:44:58 +00:00

Reviewers: I recommend reviewing commit-by-commit or just looking at the final version of `partition/docx.py` as View File. This refactor solves a few problems, but mostly it lays the groundwork for refining further aspects such as page-break detection and list-item detection, and for moving python-docx internals upstream to that library so our work doesn't depend on that domain knowledge.
8 lines
148 B
Python
# pyright: reportPrivateUsage=false

from typing import Union

from lxml import etree

def parse_xml(xml: Union[str, bytes]) -> etree._Element: ...
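
The stub only declares the signature of `parse_xml`: XML as `str` or `bytes` in, a root element out. As a rough illustration of that contract, here is a minimal sketch that uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml. The name `parse_xml_sketch` and the sample paragraph are hypothetical, and the real python-docx function may configure its parser differently.

```python
from typing import Union
import xml.etree.ElementTree as ET


def parse_xml_sketch(xml: Union[str, bytes]) -> ET.Element:
    """Parse an XML document given as str or bytes and return its root element.

    Hypothetical stand-in: uses the stdlib parser, not lxml.
    """
    return ET.fromstring(xml)


# WordprocessingML namespace used by .docx paragraph markup.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A tiny paragraph with one run, passed as str.
root = parse_xml_sketch(
    f'<w:p xmlns:w="{W_NS}"><w:r><w:t>hello</w:t></w:r></w:p>'
)

# Collect the text of every <w:t> element under the paragraph.
text = "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))
```

The same function accepts bytes, for example `parse_xml_sketch(b"<a><b/></a>")`, which is why the stub types its argument as `Union[str, bytes]`.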