Tesseract 3.05.01 backported the textonly_pdf=1 option, which allows this superior PDF renderer to be used prior to 4.00 alpha. The tess4 name is therefore no longer accurate, so call the renderer “sandwich” because of its merge-preserve characteristic. Preserve the tess4 name as well. Fix the documentation and tests to reflect this.

Make it the default, because it’s better. It does not have the issues that the “tesseract” renderer has prior to Tess 3.05.00 with rendering PDFs that Ghostscript corrupts, and it produces better output without re-rastering.

Deprecate some old stuff to avoid the test suite growing obscenely large.
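
For reference, a rough sketch of how textonly_pdf=1 might be passed to the Tesseract command line to get a text-only PDF. The helper name text_only_pdf_command and the file names are hypothetical and used only for illustration; this is not the code OCRmyPDF itself runs, and it only builds the argument list without executing anything.

```python
import shlex


def text_only_pdf_command(image_path, output_base, language="eng"):
    """Build a Tesseract argument list that requests a text-only PDF.

    Hypothetical helper for illustration; it does not run Tesseract.
    """
    return [
        "tesseract",
        image_path,
        output_base,             # Tesseract appends ".pdf" to this base name
        "-l", language,
        "-c", "textonly_pdf=1",  # write only the invisible text layer, no image
        "pdf",                   # the stock "pdf" config file selects PDF output
    ]


if __name__ == "__main__":
    # Print the command line instead of executing it.
    print(shlex.join(text_only_pdf_command("page.png", "page_text")))
```

The resulting text-only page would then be merged over the original page, which is where the sandwich name comes from.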