mirror of
https://github.com/deepset-ai/haystack.git
synced 2025-07-23 00:42:28 +00:00

* Add log lines for PDF conversion and make skipping more explicit in DocumentSplitter
* Add logging statement for PDFMinerToDocument as well
* Add tests
* Remove unused line
* Remove unused line
* add reno
* Add in PDF file
* Update checks in PDF converters and add tests for document splitter
* Revert
* Remove line
* Fix comment
* Make mypy happy
* Make mypy happy
8 lines
426 B
YAML
---
features:
  - |
    Added warning logs to PDFMinerToDocument and PyPDFToDocument that indicate when a processed PDF file has no content.
    This can happen if the PDF file is a scanned image.
    Also added an explicit check and warning message to DocumentSplitter that tells the user that empty Documents are skipped.
    This behavior was already occurring, but the logs now make it clear when it happens.
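The behavior described in this note can be illustrated with a minimal, standard-library-only sketch. It does not use Haystack's API: `Doc`, `is_empty`, `convert`, and `split_words` are hypothetical names invented for illustration, the log messages are not the ones Haystack emits, and treating whitespace-only text as "no content" is an assumption.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

logger = logging.getLogger("empty_document_sketch")


@dataclass
class Doc:
    """Hypothetical stand-in for a document; not Haystack's Document class."""
    id: str
    content: Optional[str]


def is_empty(doc: Doc) -> bool:
    # Assumption: None and whitespace-only text both count as "no content".
    return doc.content is None or doc.content.strip() == ""


def convert(doc_id: str, extracted_text: Optional[str]) -> Doc:
    """Wrap extracted text in a Doc, warning when extraction produced nothing."""
    doc = Doc(id=doc_id, content=extracted_text)
    if is_empty(doc):
        logger.warning("PDF %s has no text content; it may be a scanned image.", doc_id)
    return doc


def split_words(docs: List[Doc], split_length: int = 3) -> List[str]:
    """Split each document into chunks of `split_length` words, skipping empty ones."""
    chunks: List[str] = []
    for doc in docs:
        if is_empty(doc):
            # Make the skip explicit instead of silently dropping the document.
            logger.warning("Document %s is empty and will be skipped.", doc.id)
            continue
        words = doc.content.split()
        for i in range(0, len(words), split_length):
            chunks.append(" ".join(words[i:i + split_length]))
    return chunks


docs = [convert("a.pdf", "one two three four"), convert("scan.pdf", "")]
chunks = split_words(docs)
```

In this sketch the empty document triggers one warning at conversion time and another when the splitter skips it, so a missing chunk can be traced back to its source file from the logs alone.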