mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-12-09 14:04:05 +00:00
This refactor removes `_convert_to_standard_langcode` and replaces it
with a call to `_get_iso639_language_object` on a string slice.
The `TESSERACT_LANGUAGES_AND_CODES` lookup, which had previously been
added to `_convert_to_standard_langcode`, moves to the call site where
that function used to be invoked.
If/else statements replace the list comprehension for readability, and
`langdetect_langs.append("zho")` replaces
`_convert_to_standard_langcode("zh")`, since that call always returned
`"zho"`.
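
The shape of the change can be sketched roughly as follows. This is an
illustrative approximation, not the code from the commit: the lookup
table, `get_iso639_part3`, and `normalize_detected_langs` are
hypothetical stand-ins (the real `_get_iso639_language_object` returns a
language object rather than a string), and the exact slice and branch
conditions are assumptions.

```python
from typing import List, Optional

# Hypothetical stand-in for an ISO 639 lookup (2-letter -> 3-letter codes).
_PART1_TO_PART3 = {"en": "eng", "fr": "fra", "de": "deu"}


def get_iso639_part3(code: str) -> Optional[str]:
    """Hypothetical simplification of `_get_iso639_language_object`.

    Returns the 3-letter code for a 2-letter code, or None if unknown.
    """
    return _PART1_TO_PART3.get(code.lower())


def normalize_detected_langs(detected: List[str]) -> List[str]:
    """Convert detector output (e.g. "en", "zh-cn") to 3-letter codes."""
    langdetect_langs: List[str] = []
    for lang in detected:
        if lang.startswith("zh"):
            # Every Chinese variant maps to "zho", so append it directly
            # instead of going through a conversion helper.
            langdetect_langs.append("zho")
        else:
            # Assumed slice: keep only the 2-letter prefix before lookup.
            code = get_iso639_part3(lang[:2])
            if code is not None:
                langdetect_langs.append(code)
    return langdetect_langs
```

Writing the branches out as an explicit loop keeps the special case for
Chinese visible, which is the readability argument made above for
dropping the list comprehension.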