3 Commits

Author SHA1 Message Date
Matt Robinson
894a190001
enhancement: check for copy protection on PDFs and fallback to hi res when necessary (#514)
* function to check if pdf is extractable

* add fallback logic for unextractable pdfs

* tests for docs with copy protection

* add test for unprocessable pdf

* update docs

* changelog and version

* update logic for images; reset file before proceeding

* 3 files for api tests

* docs update
2023-04-21 21:35:43 +00:00
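The fallback described in the commit above could look roughly like the following. This is a minimal sketch, not the repository's code: `choose_pdf_strategy`, the strategy names `"fast"` and `"hi_res"`, and the `check_extractable` callback are assumptions made for illustration. In practice the callback would wrap whatever PDF library reports the document's copy-protection permissions.

```python
import io
from typing import BinaryIO, Callable


def choose_pdf_strategy(
    file: BinaryIO,
    requested: str,
    check_extractable: Callable[[BinaryIO], bool],
) -> str:
    """Pick a partitioning strategy for a PDF (hypothetical helper).

    If the check reports that text cannot be extracted, fall back to the
    image-based "hi_res" strategy; otherwise keep the requested one.
    """
    extractable = check_extractable(file)
    # The check may have consumed the stream, so rewind it before any
    # later step reads the file again.
    file.seek(0)
    return requested if extractable else "hi_res"


def fake_check(f: BinaryIO) -> bool:
    """Stand-in for a real permission check: reads the stream, says 'no'."""
    f.read()
    return False


pdf = io.BytesIO(b"%PDF-1.7 placeholder bytes")
print(choose_pdf_strategy(pdf, "fast", fake_check))  # hi_res
print(pdf.tell())  # 0
```

Passing the check in as a callback keeps the sketch independent of any particular PDF library and makes the fallback easy to test in isolation.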
Sebastian Laverde Alfonso
ba59ad6b3a
chore: add copy-protected pdf to sample-docs (#512) 2023-04-21 18:02:38 +00:00
Matt Robinson
30b5a4da65
fix: parsing for files with message/rfc822 MIME type; dir for unsupported files (#358)
Adds the ability to process files with a message/rfc822 MIME type, which previously caused failures for example-docs/fake-email-header.eml.
2023-03-10 15:10:39 -08:00
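For context on the fix above: a file with the `message/rfc822` MIME type is an email message, which Python's standard `email` package can parse. The sketch below is an illustration only, not the repository's code. `EMAIL_MIME_TYPES`, `parse_if_email`, and the sample message are hypothetical.

```python
from email import message_from_string
from email.message import Message
from typing import Optional

# Assumption for this sketch: MIME types that should be routed to the
# email parser.
EMAIL_MIME_TYPES = {"message/rfc822"}


def parse_if_email(mime_type: str, raw_text: str) -> Optional[Message]:
    """Parse raw_text as an email if mime_type is an email type, else None."""
    if mime_type not in EMAIL_MIME_TYPES:
        return None
    return message_from_string(raw_text)


sample = "From: a@example.com\nSubject: Test\n\nHello\n"
msg = parse_if_email("message/rfc822", sample)
print(msg["Subject"])             # Test
print(msg.get_payload().strip())  # Hello
```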