1 Commit

Author SHA1 Message Date
Tanay Soni
974b37eded
Add PreProcessor to simplify splitting and cleaning of docs (#473)
* Add PreProcessing

* Adjust PDF conversion tests

* Add tests for Preprocessing

* Add requirement

* Fix tests

* Ignore decoding errors for TextConverter

* Rename split_size to split_length

* Adjust tests

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-10-15 10:42:08 +02:00