mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-12-12 15:42:19 +00:00
Closes #1870

Defining both `languages` and `ocr_languages` raises a ValueError, but the API defaults `ocr_languages` to an empty string, so users who define `languages` automatically hit the ValueError. This fix checks whether `ocr_languages` is an empty string and converts it to `None` to avoid this.

### Testing

On the main branch, the following raises the ValueError, but it partitions correctly on this branch:

```
from unstructured.partition.auto import partition

filename = "example-docs/category-level.docx"
elements = partition(filename, languages=['spa'], ocr_languages="")
elements[0].metadata.languages
```

---------

Co-authored-by: yuming <305248291@qq.com>
Co-authored-by: Yuming Long <63475068+yuming-long@users.noreply.github.com>
Co-authored-by: Austin Walker <awalk89@gmail.com>
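
The empty-string handling described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the helper names `normalize_ocr_languages` and `check_language_args` and the error message are hypothetical, chosen only to show the idea of treating `ocr_languages=""` as "not provided" before the mutual-exclusion check runs.

```python
from typing import List, Optional


def normalize_ocr_languages(ocr_languages: Optional[str]) -> Optional[str]:
    """Hypothetical helper: treat an empty string as 'not provided'."""
    return None if ocr_languages == "" else ocr_languages


def check_language_args(
    languages: Optional[List[str]],
    ocr_languages: Optional[str],
) -> Optional[str]:
    """Hypothetical check: reject the call only when both arguments are really set."""
    # Normalize first, so an API default of "" no longer counts as a value.
    ocr_languages = normalize_ocr_languages(ocr_languages)
    if languages is not None and ocr_languages is not None:
        raise ValueError("Only one of `languages` and `ocr_languages` may be set.")
    return ocr_languages
```

With this ordering, `check_language_args(["spa"], "")` passes, while `check_language_args(["spa"], "eng")` still raises a ValueError.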