Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Updated 2025-06-26 22:27:05 +00:00
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.
Updated 2025-06-23 12:15:26 +00:00
Python tool for converting files and office documents to Markdown.
Updated 2025-06-04 04:09:25 +00:00