James R. Barlow
9425506c2a
Use pikepdf to handle paletted images
...
Removes all use of PyMuPDF in optimize
2018-05-18 17:44:29 -07:00
James R. Barlow
93b858afd1
Remove qpdf appimage support for now, check for pngquant
2018-05-18 16:24:33 -07:00
James R. Barlow
7b0a3ec365
Add notes for v7
v7.0.0b2
2018-05-18 00:20:45 -07:00
James R. Barlow
083d442529
main: wording change
2018-05-18 00:20:24 -07:00
James R. Barlow
b52eb95cf8
optimize: use pikepdf to save PIL images
...
Eliminates another usage of PyMuPDF in the main path.
2018-05-18 00:18:44 -07:00
James R. Barlow
f4571e2508
Ensure we try to compress anything that's not compressed when saving
2018-05-17 22:05:01 -07:00
James R. Barlow
b06ef03aac
pipeline: use the resolution of the OCR image rather than recalculating
...
(Recalculating would fail if the image is not centered.)
2018-05-17 16:51:53 -07:00
James R. Barlow
1d1962a106
weave: fix rescaling logic
...
rotation % 90 == 0 is always true.
2018-05-17 16:50:01 -07:00
James R. Barlow
4b98e9ff08
weave: if we don't have textonly_pdf, delete instruction to draw image
2018-05-17 16:49:20 -07:00
James R. Barlow
f83ca5d8ac
weave: whitespace
2018-05-17 16:06:36 -07:00
James R. Barlow
95cb4d22d7
pipeline: make /Info from indirect object as required
2018-05-17 16:06:13 -07:00
James R. Barlow
0c279b01a4
Fix test failure on missing JobContext
v7.0.0b1
2018-05-17 01:16:58 -07:00
James R. Barlow
3b820ffa7b
test_metadata: change from xfail to skipif without fitz
2018-05-17 00:14:57 -07:00
James R. Barlow
35cb416563
pipeline: remove fitz-based attempt to repair table of contents
...
Prior to unsplit, if we were rebuilding the PDF we'd lose the
table of contents. With unsplit we keep the original file and patch
the table of contents as necessary, and that works fine.
This remaining bit of code from PyMuPDF actually damages the
table of contents and removing it fixes the test suite. G'bye.
2018-05-16 23:24:57 -07:00
James R. Barlow
cdb737259c
pipeline: remove old page merge strategies
2018-05-16 22:16:54 -07:00
James R. Barlow
0843b5939c
pipeline: Move weave* to its own file
2018-05-16 22:08:31 -07:00
James R. Barlow
2b5f23a2d1
Add code to repair ToC with pikepdf
2018-05-16 21:39:23 -07:00
James R. Barlow
5e20d1d554
metadata: Fix failing test on __getitem__['/CreationDate']
2018-05-16 13:46:07 -07:00
James R. Barlow
18595ca86a
Use pikepdf for get_pdfmark
...
It does fine.
2018-05-16 12:24:35 -07:00
James R. Barlow
3e269fa188
Ubuntu 14.04 has a qpdf 8.0.2 backport, making life easier
2018-05-15 21:43:19 -07:00
James R. Barlow
65405c2cb9
Try getting qpdf from Ubuntu 18.04
2018-05-15 21:27:27 -07:00
James R. Barlow
442cf8897a
Travis: maybe upgrading wheel?
2018-05-15 18:12:35 -07:00
James R. Barlow
d5fb275e9e
Travis: hack in qpdf appimage version
...
qpdf from appimage does not report its version with --version if renamed
or accessed via symlink. Use an environment variable to supply it
where needed.
2018-05-15 17:45:58 -07:00
James R. Barlow
e60aec81ca
Travis: why can't we use qpdf appimage?
2018-05-15 16:59:16 -07:00
James R. Barlow
398e9e535e
optimize: Changed pikepdf API
2018-05-15 16:29:57 -07:00
James R. Barlow
08bf651ef2
Refactor JBIG2 path for non-CCITT monochrome images
2018-05-15 15:32:15 -07:00
James R. Barlow
6171de41bf
optimize: move a lot of image scanning code to pikepdf
2018-05-14 22:21:53 -07:00
James R. Barlow
f0a56592e2
Pull JobContext out of pipeline.py to avoid circular reference
2018-05-14 14:01:25 -07:00
James R. Barlow
87a7d4d1a8
Another fitz failure - incorrect object reference introduced
...
MuPDF/fitz changed some font references to point to table of contents
entries, corrupting the page. It no longer gets to save.
2018-05-14 13:58:49 -07:00
James R. Barlow
05287902a2
Travis: again
2018-05-13 11:02:25 -07:00
James R. Barlow
96e453feb6
Travis: Tweak setup so it can run
2018-05-13 01:21:24 -07:00
James R. Barlow
9c0fa9fc04
Travis: again
2018-05-13 01:17:04 -07:00
James R. Barlow
3bde0715b0
Move qpdf to before_script
2018-05-13 01:01:48 -07:00
James R. Barlow
e2ec3d8b9b
Travis: adjust qpdf appimage
2018-05-13 00:53:31 -07:00
James R. Barlow
ad91eaf8a7
Travis: try using qpdf appimage to speed up build
2018-05-13 00:42:48 -07:00
James R. Barlow
b6d30214fd
PyMuPDF 1.13.4 looks good, use it
2018-05-12 12:35:46 -07:00
James R. Barlow
c4ab01d63d
Fix "AttributeError: 'ImageInfo' object has no attribute '_type'"
...
Also deal with 'fixme' imagemask comment.
Also fix bpc incorrectly set to 8 by default on stencil masks.
2018-05-12 12:14:57 -07:00
James R. Barlow
4ba3b3f55a
Fix rotate_pages_threshold test failure
2018-05-12 11:47:46 -07:00
James R. Barlow
52d2706a9e
optimize: Fix error causing many images to be skipped
2018-05-12 01:37:30 -07:00
James R. Barlow
964afc69f6
leptonica: ErrorTrap is an implementation detail
2018-05-12 01:21:45 -07:00
James R. Barlow
3ddf545ccd
optimize: leptonica can fail to open PNG
...
ERROR - Info in pixReadStreamPng: converting (cmap + alpha) ==> RGBA
Error in pixReadStreamPng: spp == 1, cmap, trans array, invalid depth: 4
To investigate later....
2018-05-12 01:21:19 -07:00
James R. Barlow
f9374733bb
optimize: process ICCBased images that declare an /Alternate we recognize
2018-05-12 00:43:36 -07:00
James R. Barlow
5930135f45
optimize: Refactor naming helpers
2018-05-12 00:42:24 -07:00
James R. Barlow
f03f6bc128
optimize: document problem with transcode free compressed image data
2018-05-11 23:43:06 -07:00
James R. Barlow
6c50c70235
Try to optimize paletted images
2018-05-11 23:42:26 -07:00
James R. Barlow
8790fc2c1b
optimize: add knobs to control image quality but don't show the user yet
2018-05-11 23:41:49 -07:00
James R. Barlow
f86c4fccf4
optimize: don't alter >8 bpc images
2018-05-11 22:31:24 -07:00
James R. Barlow
7d0785e9ed
main: do better parameter validation
2018-05-11 22:31:09 -07:00
James R. Barlow
2cac88162c
Ignore masks when deciding what color to rasterize at
2018-05-11 21:27:57 -07:00
James R. Barlow
4809627d8a
Fix jbig2enc name
2018-05-11 17:51:08 -07:00