Steve Canny 03e0ed3519
rfctr(docx): DOCX emits std minified .text_as_html (#3545)
**Summary**
Eliminate historical "idiosyncrasies" of the `table.metadata.text_as_html`
HTML introduced by `partition_docx()`. Produce minified `.text_as_html`
consistent with the HTML produced by chunking.

**Additional Context**
- nested tables appear as their extracted text in the parent cell (no
nested `<table>` elements in `.text_as_html`).
- DOCX `.text_as_html` is minified (no extra whitespace and no `<thead>`,
`<tbody>`, or `<tfoot>` elements).
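
As a rough illustration of the two points above, the sketch below builds a minified table string and collapses a nested table to its text. It is not the code from this commit: `cell_text`, `minified_table_html`, the `Cell` type, and the choice to join nested cell texts with single spaces are all assumptions made for the example.

```python
from html import escape
from typing import List, Union

# Hypothetical cell type: plain text, or a nested table given as rows of cells.
Cell = Union[str, List[List["Cell"]]]


def cell_text(cell: Cell) -> str:
    """Return a cell's text; a nested table collapses to its space-joined cell texts."""
    if isinstance(cell, str):
        return " ".join(cell.split())  # normalize internal whitespace
    texts = (cell_text(c) for row in cell for c in row)
    return " ".join(t for t in texts if t)


def minified_table_html(rows: List[List[Cell]]) -> str:
    """Emit only <table>, <tr>, and <td>, with no whitespace between tags."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell_text(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


# The second row's last cell holds a nested 2x2 table (assumed input shape).
rows = [["a", "b"], ["c", [["d", "e"], ["f", "g"]]]]
html = minified_table_html(rows)
print(html)
# → <table><tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d e f g</td></tr></table>
```

The nested table contributes only its text to the parent cell, so the output never contains a second `<table>` element, and the absence of inter-tag whitespace keeps the string directly comparable to other minified output.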
2024-08-21 18:54:21 +00:00