mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-07-03 07:05:20 +00:00

Bumps unstructured-inference==0.5.23 to pull in @christinestraub's fix (https://github.com/Unstructured-IO/unstructured-inference/pull/198), so images embedded in PDFs are now included in partition results when using the "hi_res" strategy. For consumers who only want elements with clean text, this is not a big win, since the OCR text for many of these images is garbage. However, preserving image elements matters for other downstream use cases, so overall this is a step forward.
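
Since image elements now show up alongside text elements, callers that only want clean text may want to separate the two. The following is a minimal sketch, not part of this change: it assumes elements have already been converted to dicts (for example via each element's `to_dict()`), that image elements carry `"type": "Image"`, and the helper name `split_image_elements` and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def split_image_elements(
    elements: Iterable[Dict],
) -> Tuple[List[Dict], List[Dict]]:
    """Separate image elements from all other elements.

    Assumes each element is a dict with a "type" key, and that image
    elements are marked with type "Image" (an assumption for this sketch).
    """
    images: List[Dict] = []
    others: List[Dict] = []
    for element in elements:
        if element.get("type") == "Image":
            images.append(element)
        else:
            others.append(element)
    return images, others


# Hypothetical sample standing in for "hi_res" partition output.
sample = [
    {"type": "Title", "text": "Quarterly Report"},
    {"type": "Image", "text": "l1 |l ~~ 0O"},  # OCR noise from an embedded image
    {"type": "NarrativeText", "text": "Revenue grew in the third quarter."},
]

images, others = split_image_elements(sample)

# Text-only consumers can ignore the image elements; other downstream
# consumers can still reach them through `images`.
clean_text = [element["text"] for element in others]
```

Keeping the image elements in a separate list, rather than dropping them, matches the intent above: text-focused pipelines skip the OCR noise while other consumers still have access to the images.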