unstructured/test_unstructured
John 125b63cd7c
refactor: extract language helper functions (#2370)
This PR is one in a series of PRs that refactor and fix the `languages`
parameter so that it handles incorrect user input (see #2293).

Refactor `_convert_language_code_to_pytesseract_lang_code` and extract
`_get_iso639_language_object` into its own function.


```
from unstructured.partition.lang import (
    _convert_language_code_to_pytesseract_lang_code as convert,
)

convert("English")  # raises an error on both main and this branch
convert("en")  # returns "eng" on both main and this branch
```
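
For readers unfamiliar with the pattern, here is a minimal sketch of what "extract the lookup into its own function" can look like. It is not the code from this PR: `LanguageRecord`, `get_language_record`, `to_ocr_lang_code`, and the three-entry table are hypothetical stand-ins, and the sketch assumes the OCR code is simply the three-letter ISO 639-3 code, which is a simplification.

```python
from typing import NamedTuple, Optional


class LanguageRecord(NamedTuple):
    """Hypothetical stand-in for an ISO 639 language object."""

    part1: str  # two-letter ISO 639-1 code
    part3: str  # three-letter ISO 639-3 code


# Tiny illustrative table (assumption); a real implementation would
# consult a complete ISO 639 database instead.
_RECORDS = (
    LanguageRecord("en", "eng"),
    LanguageRecord("de", "deu"),
    LanguageRecord("fr", "fra"),
)


def get_language_record(code: str) -> Optional[LanguageRecord]:
    """Extracted helper: return the matching record, or None if unknown."""
    for record in _RECORDS:
        if code in (record.part1, record.part3):
            return record
    return None


def to_ocr_lang_code(code: str) -> str:
    """Convert a language code to a three-letter code, raising on bad input."""
    record = get_language_record(code)
    if record is None:
        raise ValueError(f"{code!r} is not a recognised language code")
    return record.part3
```

Keeping the lookup separate lets the caller decide how to treat unknown input (raise, warn, or fall back) without duplicating the matching logic.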
2024-01-16 17:51:03 +00:00