2676 Commits

Author SHA1 Message Date
James R. Barlow
10aadefd6a Document return codes 2018-04-14 00:18:58 -07:00
James R. Barlow
e75b6280fd Try reading compressed data directly to see if Leptonica will add predictor
Turns out it does not transcode at all in this case, so I will probably
revert to transcoding PNG -> PNG. However, if pngquant or similar is
added, this API will still be useful.
2018-04-13 23:55:23 -07:00
James R. Barlow
8c4023165a Release L_COMP_DATA properly 2018-04-13 23:53:41 -07:00
James R. Barlow
b7d403f106 Deprecate Pix.read() behaving as an open function 2018-04-13 23:52:46 -07:00
James R. Barlow
b069de0caa Use Leptonica to rewrite all PNGs with predictor
Leptonica does a better job of encoding them than Ghostscript, about -15%.
For a test file, 450k worth of PNGs was reduced to 388k with no loss of quality.
2018-04-13 16:35:50 -07:00
James R. Barlow
136da74bfa Update branch with v6.1.4 2018-04-13 12:57:21 -07:00
James R. Barlow
7fc897e6dc Fix NameError 'ghostscript' v6.1.4 2018-04-12 21:24:05 -07:00
James R. Barlow
9b731d63b8 Set Ghostscript -sColorConversionStrategy the way old/new versions expect 2018-04-12 16:28:48 -07:00
James R. Barlow
10aa59f674 v6.1.4 fix test suite regression with Ghostscript 9.23 2018-04-12 15:16:54 -07:00
James R. Barlow
1f7837e7b1 v6.1.4 release notes update 2018-04-12 00:55:45 -07:00
James R. Barlow
ba0535e3fb Update test cache to account for unpaper --layout none change 2018-04-12 00:48:21 -07:00
James R. Barlow
49fa7f6b5c tesseract_cache: don't reveal host system file paths in manifest file 2018-04-12 00:47:28 -07:00
James R. Barlow
c95db246d4 v6.1.4 merge 2018-04-11 15:58:00 -07:00
James R. Barlow
1ba93371ce docs: Update installation to reflect qpdf 7.0.0 requirement 2018-04-11 15:40:50 -07:00
James R. Barlow
fedbbdb575 Travis: compile qpdf from source
The older version in Travis's Ubuntu 14.04 can't pass the test suite anymore.
2018-04-11 15:40:45 -07:00
James R. Barlow
85ebba72bc Fix setup.py syntax 2018-04-10 18:30:48 -07:00
James R. Barlow
b6cd436d5d setup: Blacklist Pillow 5.1.0 on macos
https://github.com/python-pillow/Pillow/issues/3068
2018-04-10 18:15:37 -07:00
James R. Barlow
ec170c7e1e Travis: use setup.py for requirements, don't override with .txt 2018-04-10 17:52:19 -07:00
James R. Barlow
f6399eb90f optimize: use Leptonica to compact JPEGs
Pillow could do it too, but Leptonica is somewhat more PDF aware.
2018-04-10 17:45:05 -07:00
James R. Barlow
77f2448e59 Leptonica: add L_COMP_DATA compressed data manager 2018-04-10 17:44:03 -07:00
James R. Barlow
3d69b46fca Release notes 2018-04-10 15:53:02 -07:00
James R. Barlow
4b6153ad18 Use defusedxml for XML parsing when reading XMP 2018-04-10 14:25:13 -07:00
James R. Barlow
75d37eb103 docs: expand ocr of image usage 2018-04-09 13:06:09 -07:00
James R. Barlow
11b6f77df0 unpaper: close images on error paths 2018-04-09 13:05:12 -07:00
James R. Barlow
db8b0319dd get_version: repeat system error messages if the process exits with a signal 2018-04-09 13:04:51 -07:00
James R. Barlow
c9dd330766 JBIG2: refactor, don't recompress existing JBIG2 2018-04-09 13:04:10 -07:00
James R. Barlow
e40228102c JBIG2: Streams created in this manner are already indirect objects 2018-04-06 17:11:17 -07:00
James R. Barlow
7889c6fb4c Parallelize JBIG2 execution with thread pools 2018-04-06 17:00:23 -07:00
James R. Barlow
6eb1773110 Fix JBIG2Globals included multiple times in output 2018-04-06 17:00:03 -07:00
James R. Barlow
1d25823746 Implement functional, single threaded optimize
Passes verapdf
2018-04-06 15:49:16 -07:00
James R. Barlow
d1d4f1e198 Add issue links to release notes 2018-04-06 14:52:40 -07:00
James R. Barlow
709c01c7a1 Regroup three merge steps into a single step
All take the same inputs and deliver similar outputs, so it makes sense.
2018-04-06 01:07:02 -07:00
James R. Barlow
4a341c9034 Merge branch 'master' into feature/jbig2-2018 2018-04-05 21:29:39 -07:00
James R. Barlow
be41ff6d54 Update flowchart
[ci skip]
2018-04-05 21:26:37 -07:00
James R. Barlow
1dbb6f1746 Notes on relevant envvars, repology 2018-04-05 02:15:01 -07:00
James R. Barlow
753e6274ab Tell unpaper to use --layout none so it won't blank out multi column text 2018-04-05 02:14:33 -07:00
James R. Barlow
7f462c618b v6.1.3 notes v6.1.3 2018-04-03 00:11:20 -07:00
James R. Barlow
d8ac6e28ab Convert monochrome images to JBIG2
Awkwardly using fitz and pikepdf, transcode monochrome to CCITT.
This requires _OCRMYPDF_NO_FITZ=1.

00000x.opt.pdf can be checked for JBIG2 to confirm, but this file is
not passed to the output since it's not all wired up yet.
2018-04-03 00:00:53 -07:00
James R. Barlow
a95ffcdc46 Experimental add jbig2
It appears that fitz forces conversion of jbig2 to ccitt no matter what,
so pikepdf will be needed to patch jbig2 images.
2018-04-03 00:00:53 -07:00
James R. Barlow
1b01d45dd2 Warn about Python 3.5 page count issue 2018-04-02 19:29:17 -07:00
James R. Barlow
7a1cd39b21 Fix creation date metadata lost from input
Closes #247
2018-04-02 17:53:39 -07:00
James R. Barlow
1c1fd9616a Don't depend on pytest-xdist in setup.cfg 2018-04-02 11:45:03 -07:00
Sean Whitton
11e19e4085 remove addopts key from tool:pytest section of setup.cfg (#246)
The '-n' command line argument is not supported by recent pytest.
2018-04-02 14:43:38 -04:00
James R. Barlow
2a43f73228 Update installation.rst, further info on fitz 2018-04-02 11:32:57 -07:00
James R. Barlow
b1d1310a75 Dockerfile: use fitz 2018-04-02 11:08:03 -07:00
James R. Barlow
0e7fa78e65 Remove inaccurate statement from setup.py 2018-04-01 13:20:17 -07:00
James R. Barlow
4032570d97 Change docs for fitz/PyMuPDF 2018-04-01 13:19:57 -07:00
James R. Barlow
90644a3017 pipeline: refactoring, use with block for images 2018-03-31 13:26:40 -07:00
James R. Barlow
4f6bffb477 Update copyrights 2018-03-31 11:54:38 -07:00
James R. Barlow
158f902c3b Fixed setup.py syntax error v6.1.2 2018-03-30 14:00:36 -07:00