unstructured/test_unstructured
Christine Straub 4a3176885f
Fix/1057 etree parser error xlsx (#1094)
* feat: add functionality to check if a string contains any emoji characters

* feat: add functionality to switch `html` text parser based on whether the `html` text contains emoji

* chore: add `beautifulsoup4` and `emoji` packages to `requirements/base.in` for general use

* chore: update changelog & version

* chore: update changelog & version

* chore: update dependencies

* test: update `EXPECTED_XLS_TEXT_LEN` for `test_auto_partition_xls_from_filename`

* chore: update changelog & version

* feat: add functionality to switch html text parser based on whether the html text contains emoji

* chore: update changelog & version

* fix lint errors

* test: revert the `EXPECTED_XLS_TEXT_LEN` value back

* feat: always use `soupparser_fromstring` to parse `html text`

* fix lint error
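
The first item above describes a check for whether a string contains any emoji characters. A minimal sketch of such a check follows, assuming only the standard library. The name `contains_emoji` and the code point ranges are illustrative assumptions, not the code from this commit, which adds the `emoji` package for this purpose. Per the later items, the parser was in the end always `soupparser_fromstring`, so a check like this may no longer decide which parser is used.

```python
import re

# Hypothetical helper, not taken from the repository.
# The ranges are an illustrative subset of emoji code points, not the full set.
_EMOJI_PATTERN = re.compile(
    "["
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (flags)
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"  # miscellaneous symbols and dingbats
    "]"
)


def contains_emoji(text: str) -> bool:
    """Return True if text has at least one character in the ranges above."""
    return _EMOJI_PATTERN.search(text) is not None
```

A caller could use the result to choose between a strict and a more lenient HTML parser, which is the switching behaviour that the `feat` items describe.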
2023-08-13 12:20:33 -07:00