mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-09-06 15:23:37 +00:00

This PR changes how we download NLTK data: we now use the native nltk downloader instead of our own hosted copy of the data.

We had moved to a self-hosted NLTK dataset because of this CVE: https://nvd.nist.gov/vuln/detail/CVE-2024-39705
Ref: https://github.com/Unstructured-IO/unstructured/pull/3361

Recent NLTK releases have fixed this issue, so the workaround is no longer needed: https://github.com/nltk/nltk/blob/develop/ChangeLog
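
For illustration, a minimal sketch of what downloading through the native nltk downloader can look like. This is not the code from this PR: the helper name `ensure_nltk_data` and the package list are assumptions made for the example, and only `nltk.data.find` and `nltk.download` are real NLTK calls. The `find` and `download` parameters exist only so the helper can be exercised without network access.

```python
from typing import Callable, Dict, List, Optional

# Assumed package ids and lookup paths; the PR text does not name them.
ASSUMED_PACKAGES: Dict[str, str] = {
    "punkt_tab": "tokenizers/punkt_tab",
    "averaged_perceptron_tagger_eng": "taggers/averaged_perceptron_tagger_eng",
}


def ensure_nltk_data(
    packages: Optional[Dict[str, str]] = None,
    find: Optional[Callable[[str], object]] = None,
    download: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Download any missing NLTK packages and return the ids that were fetched.

    Hypothetical helper: it defaults to nltk.data.find and nltk.download,
    but accepts replacements so it can be tested offline.
    """
    if packages is None:
        packages = ASSUMED_PACKAGES
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested without nltk

        if find is None:
            find = nltk.data.find
        if download is None:
            download = lambda name: nltk.download(name, quiet=True)

    fetched: List[str] = []
    for name, path in packages.items():
        try:
            find(path)  # nltk.data.find raises LookupError when data is absent
        except LookupError:
            if not download(name):  # nltk.download returns False on failure
                raise RuntimeError(f"could not download NLTK package {name!r}")
            fetched.append(name)
    return fetched
```

Calling `ensure_nltk_data()` with no arguments would use the real NLTK downloader and therefore needs nltk installed and network access on first use. Checking with `find` first keeps repeated calls from hitting the network once the data is present locally.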