mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-06-27 02:30:08 +00:00

Closes issue #521. Implements the same logic as unstructured-inference/PR #136 for the ocr_only strategy.

* Add functionality to convert a PDF in small chunks of pages at a time
* Add functionality to write images to computer storage temporarily instead of keeping them in memory
* Set the file's current position to the beginning after reading the file in convert_to_bytes
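
The chunking and rewind behavior described in this commit message can be illustrated with a minimal sketch. This is not the code from the commit: the helper names `page_chunks` and `read_and_rewind` and the default chunk size of 10 are hypothetical, chosen only to show the idea.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def page_chunks(total_pages: int, chunk_size: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (first_page, last_page) ranges.

    Hypothetical helper: the chunk size of 10 is an assumption, not a
    value taken from the commit.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for first in range(1, total_pages + 1, chunk_size):
        yield first, min(first + chunk_size - 1, total_pages)


def read_and_rewind(file: BinaryIO) -> bytes:
    """Read the rest of the file, then move the position back to the start.

    Hypothetical stand-in for the rewind described for convert_to_bytes:
    without the seek, a later reader of the same handle would get no data.
    """
    data = file.read()
    file.seek(0)
    return data


if __name__ == "__main__":
    # A 25-page document is processed as three chunks instead of all at once.
    for first, last in page_chunks(25, chunk_size=10):
        print(f"convert pages {first}-{last}")

    buf = io.BytesIO(b"%PDF-1.7 example bytes")
    read_and_rewind(buf)
    print(buf.tell())  # position is back at the start
```

Assuming the pages are rendered with pdf2image, each range could be passed to `convert_from_path` through its `first_page` and `last_page` arguments, with `output_folder` pointing at a temporary directory and `paths_only=True` so that images are written to disk rather than held in memory.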
3.4 MiB