2676 Commits

Author SHA1 Message Date
James R. Barlow
383e726d65 Expand size growth reasons to other arguments that trigger transcoding 2018-04-27 19:34:57 -07:00
James R. Barlow
e046f70642 Set OMP_THREAD_LIMIT unconditionally, for pngquant 2018-04-27 19:19:30 -07:00
James R. Barlow
2131ad4670 Fix --remove-background error on PDFs with colormapped images
It's unclear how exactly a colormapped image gets to this spot given the
tendency of other image processing tools to flatten such images, but
someone made it happen, so now we make sure the image is okay.

Closes #262
2018-04-27 17:21:01 -07:00
James R. Barlow
219fe2155b test_pageinfo: remove duplicate import 2018-04-27 17:16:42 -07:00
James R. Barlow
4209034d20 Add gpg key to issue template 2018-04-27 15:51:26 -07:00
James R. Barlow
abcae0c2a4 Fix helpers.py again 2018-04-25 22:10:51 -07:00
James R. Barlow
0934905493 Don't suppress error message from config_notfound
Since it showed up in s390x bionic
2018-04-25 21:58:18 -07:00
James R. Barlow
11cd6201d9 helpers: fix missing call to complain()
In practice this is probably unreachable.
2018-04-25 21:57:50 -07:00
James R. Barlow
8d2a917676 Page unsplit, development 2018-04-25 21:56:43 -07:00
James R. Barlow
44b4afa534 Begin conversion from page splitting to page markers 2018-04-23 22:57:50 -07:00
James R. Barlow
775be3933c Cherrypick merge_pages unification 2018-04-20 23:08:15 -07:00
James R. Barlow
df87e21c85 Add support for PDF/A-3
No ability to attach files however
2018-04-20 00:06:55 -07:00
Hugo
d761d80750 Use more standard __version__ rather than PILLOW_VERSION (#257) 2018-04-19 23:35:32 -07:00
James R. Barlow
8052019dde optimize: fix reporting of jbig2 groups 2018-04-19 01:54:44 -07:00
James R. Barlow
a3d8950088 optimize: Don't save JPEGs if larger 2018-04-19 01:25:49 -07:00
James R. Barlow
004f5d3bf1 optimize: further improve decodeparms handling 2018-04-18 15:52:25 -07:00
James R. Barlow
f5d308a156 optimize: refactor tricky /Filter and /DecodeParms handling 2018-04-18 15:30:21 -07:00
James R. Barlow
3869996758 optimize: jbig2 error 2018-04-18 01:36:38 -07:00
James R. Barlow
cdb2107c4e optimize: jbig2 fix 2018-04-18 01:31:35 -07:00
James R. Barlow
4db2b3413b optimize: more robustness 2018-04-18 01:25:34 -07:00
James R. Barlow
b2f31bec79 Make optimize a lot safer 2018-04-18 00:20:06 -07:00
James R. Barlow
78f9f4a266 Be more defensive about accessing 2018-04-18 00:11:39 -07:00
James R. Barlow
ad6087c342 optimize: more fixes 2018-04-17 23:58:10 -07:00
James R. Barlow
0d6ef430de optimize: fix "length not defined" 2018-04-17 23:38:00 -07:00
James R. Barlow
a5942209e8 optimize: fix error on missing /Filter 2018-04-17 23:27:56 -07:00
James R. Barlow
9a60694cfc optimize: ccitt header fixes
Changed to match TIFF spec's use of unsigned types, eliminated check for
/Columns.

There is some complex behavior for /Width != /Columns and
(/Width, /Columns) mod 8 != 0
that is not described well in the PDF spec.
2018-04-17 23:27:25 -07:00
James R. Barlow
4bf13f4737 optimize: be less chatty 2018-04-17 23:25:41 -07:00
James R. Barlow
9e89b75186 Merge v6.1.5 2018-04-17 22:51:13 -07:00
James R. Barlow
0b10db91be Fix regression: Disable Ghostscript JPEG passthrough entirely v6.1.5 2018-04-17 17:00:24 -07:00
James R. Barlow
1a516b2af9 Fix regression: time stamp test suite failures 2018-04-17 16:59:21 -07:00
James R. Barlow
076363d78e Disable JPEG passthrough for Ghostscript 9.23
Seems to corrupt JPEGs involved in image masks?
2018-04-17 16:31:03 -07:00
James R. Barlow
5fde214290 Update notes for v6.1.5 2018-04-17 15:23:35 -07:00
James R. Barlow
a620724d6a Fix PDF/A validation failure due to timezone being omitted from /ModDate 2018-04-17 15:16:48 -07:00
James R. Barlow
640b953ec7 Fix PDF/A validation failure due to timezone being omitted from /ModDate 2018-04-17 14:55:32 -07:00
James R. Barlow
a009ca7597 Disable JPEG passthrough for Ghostscript 9.23
Seems to corrupt JPEGs involved in image masks?
2018-04-17 13:54:34 -07:00
James R. Barlow
7368399f8b Clarify license of two test files - https://github.com/jbarlow83/OCRmyPDF/issues/254 2018-04-17 11:56:36 -07:00
James R. Barlow
c974aec934 Search for image masks too 2018-04-17 02:06:08 -07:00
James R. Barlow
3033f03f64 Iterate images with pikepdf / fix mono PNG corruption
To work around PNG corruption problem in PyMuPDF for monochrome images,
extract and save monochrome CCITT with synthetic TIFF header.

Works better but currently skips /ImageMask due to qpdf
implementation, which affects many files.
2018-04-17 01:50:37 -07:00
James R. Barlow
72723e0bb5 optimize: be quieter 2018-04-16 18:06:02 -07:00
James R. Barlow
2fb6ab3939 Trap writePNG error 2018-04-16 17:29:10 -07:00
James R. Barlow
25c1c160b8 Move optimize to new file 2018-04-16 17:22:06 -07:00
James R. Barlow
7e92895471 Parallelize pngquant 2018-04-16 12:37:51 -07:00
James R. Barlow
d291d48991 PNG palette: parse PDF string from leptonica instead
Seems better to accept whatever leptonica produces rather than make detailed
assumptions about how it encodes the palette.

Experimented with setting FlateDecode on the palette but it seems to
expand it.
2018-04-16 12:16:13 -07:00
James R. Barlow
0e6b8042b0 Implement PNG palettization 2018-04-16 11:18:52 -07:00
James R. Barlow
34c78a892a Fix list table for tests/resources
[ci skip]
2018-04-15 23:52:19 -07:00
James R. Barlow
9d28879505 Update Ubuntu 14.04 instructions
Closes #252
2018-04-14 17:30:33 -07:00
James R. Barlow
2482296e2b hocr: avoid division by zero
Issue #253 - PDF that produces the error is not available, but if font_width
is zero, chances are the text is nonprinting characters, so suppress it.
2018-04-14 17:24:21 -07:00
James R. Barlow
f755fb76ee Try pngquant 2018-04-14 01:37:14 -07:00
James R. Barlow
c61b5dcb62 Fix PDF/A validation error from setting /Predictor 0 2018-04-14 01:36:46 -07:00
James R. Barlow
fae893b9d9 Reinstate transcoding of PNG 2018-04-14 00:19:24 -07:00