OCRmyPDF

mirror of https://github.com/ocrmypdf/OCRmyPDF.git synced 2025-08-03 06:12:30 +00:00

Author	SHA1	Message	Date
James R. Barlow	f82cb002bc	Try automatic versioning with setuptools_scm	2016-01-19 13:27:18 -08:00
James R. Barlow	6af0815681	Bump version	2016-01-09 18:45:06 -08:00
James R. Barlow	424b4b33b1	Just go right ahead and demand Python 3.4	2016-01-04 12:56:51 -08:00
James R. Barlow	e510f89792	Python 2 warning message	2015-12-21 09:38:38 -08:00
James R. Barlow	79b3472b26	All tests passed, bump version	2015-12-04 04:31:01 -08:00
James R. Barlow	281eafada0	bump to v3.0 and move repos	2015-09-05 00:53:14 -07:00
James R. Barlow	c14e10128a	Bump version to -rc9	2015-08-29 16:43:22 -07:00
James R. Barlow	2ce6834be4	Bump to -rc8	2015-08-24 01:25:01 -07:00
James R. Barlow	aab08bfcc7	Fix requirements.txt problem	2015-08-23 12:30:40 -07:00
James R. Barlow	ee7f008ff5	Require unpaper 6.1; no messing around with broken versions	2015-08-22 01:51:08 -07:00
James R. Barlow	4f3673d14d	Update notes for -rc6	2015-08-22 00:40:07 -07:00
James R. Barlow	9dad40b5a3	Major overhaul of the Dockerfile. Switched from Ubuntu to debian:stretch because stretch has more recent versions of our binary packages and starts smaller. In particular, stretch has both pillow==2.9.0 and reportlab==3.2.0 available as system packages, which saves the considerable hassle of installing a toolchain. Instead, a pyvenv is set up with access to the system's site-packages (note: needs two steps), making the binary-dependent packages available. Then the remaining packages are installed into the pyvenv with --no-cache-dir to avoid saving files. And there we are. Image is still very large (>500 MB), but programs like reportlab require font rendering capabilities so they pull in large portions of the Linux graphics stack. Not much will shrink that.	2015-08-20 01:25:31 -07:00
James R. Barlow	8e2d690cb0	Rework Dockerfile, setup.py to work with wheels for better cache use	2015-08-19 13:43:32 -07:00
James R. Barlow	2dff3e07ce	Drop libxml2 dependency. It seems that Python's internal XML parser is good enough to do the job.	2015-08-17 15:26:07 -07:00
James R. Barlow	53c88093ad	Bump to -rc5	2015-08-16 02:19:04 -07:00
James R. Barlow	30072e0c70	Pillow sucks. Far from being fluffy or friendly, Pillow silently allows installation of itself without support for major image types. Reportlab calls for pillow 2.4.0. On Ubuntu 14.04 LTS this will trigger an upgrade of pillow that will be built without JPEG or ZLIB, so it is effectively neutered, and unfortunately Pillow will not detect this situation at install time and guide users to a resolution. Instead, you see nasty stack traces. So add a run-time check to ensure that Pillow is sane and capable of JPEG and PNG support, since both may be used internally.	2015-08-16 00:54:03 -07:00
James R. Barlow	eb04a890b2	Relax Pillow requirement for Ubuntu 14.04 LTS	2015-08-15 15:55:56 -07:00
James R. Barlow	0c53adb04f	setup: rollback lxml version to 3.3.3 - that's the latest in Ubuntu 14.04	2015-08-15 15:25:58 -07:00
James R. Barlow	87aeeacb04	Fix erroneous instruction to "apt-get install tesseract". Should be tesseract-ocr.	2015-08-15 15:17:38 -07:00
James R. Barlow	f6f4705ea3	Remove Java from setup.py	2015-08-14 00:44:56 -07:00
James R. Barlow	11dd9f14c3	setup.py: block unsafe 'upload', say to use twine instead	2015-08-09 14:16:30 -07:00
James R. Barlow	16d24f1166	Bump version to -rc4	2015-08-05 23:26:38 -07:00
James R. Barlow	a036de318e	Replace mupdf and poppler with qpdf. Drop two dependencies and replace them with one that does the job of both. Smells like progress. mupdf does PDF file repair and rendering; poppler does rendering and page splitting; qpdf does PDF file repair and page splitting; ghostscript does PDF file repair, rendering, and page splitting (sort of). So we use qpdf. Ghostscript's page splitting is supposed to be less efficient because it reprints the page (PDF -> PostScript -> PDF) and possibly loses quality. qpdf's library could be used to improve performance. This causes a slight performance regression: py.test tests/test_main.py::test_maximum_options went from 187 seconds up to 192. This is likely due to O(n) serialized invocations of qpdf compared to a single serialized call to pdfseparate. Could improve on this situation by using the example code in qpdf (pdf-split-pages.cc), or by creating marker files in split_pages() and then writing a new @transform function that would split pages on each CPU. Probably not worth it, overall, unless this causes problems on files with hundreds of pages.	2015-07-30 04:16:35 -07:00
James R. Barlow	9918c4020e	Use img2pdf in test case because it does a better job	2015-07-30 03:35:56 -07:00
James R. Barlow	47e50f82c4	setup.py: allow mutool 1.7	2015-07-28 13:37:32 -07:00
James R. Barlow	27ecdfbba8	More fixes to error cases in setup.py	2015-07-28 13:05:23 -07:00
James R. Barlow	6901550065	Fix some installer issues	2015-07-28 12:41:24 -07:00
James R. Barlow	b9d7687fa0	Fixes: clarify install instructions and reactivate external program checks	2015-07-28 05:44:15 -07:00
James R. Barlow	9e0c443c2f	-rc2: because pypi won't accept -rc1	2015-07-28 04:55:10 -07:00
James R. Barlow	6a160d22fe	Update release notes, add copyrights	2015-07-28 04:36:58 -07:00
James R. Barlow	03f7c9bf07	setup.py: Only do program checks when installing	2015-07-27 02:14:51 -07:00
James R. Barlow	d5f4862749	setup.py: check for third party program requirements	2015-07-27 01:45:17 -07:00
James R. Barlow	2c1b5e100b	Test cases for pageinfo; complain about inline images	2015-07-26 18:18:41 -07:00
James R. Barlow	d3088829af	More packaging changes: move jhove, fix console script	2015-07-26 01:52:08 -07:00
James R. Barlow	9aaaba1714	Packaging stuff	2015-07-25 23:45:13 -07:00

85 Commits