James R. Barlow
6ac9e92f17
Fix PEP8 docstring convention misuse in a few places
2018-06-22 17:51:25 -07:00
James R. Barlow
faaa4a1def
Ghostscript, PDF/A: support pathlib
2018-06-22 17:45:10 -07:00
James R. Barlow
0aa51f0f3a
Remove fitz from Travis
2018-06-18 15:38:41 -07:00
James R. Barlow
73431d9761
Remove obsolete _naive_find_text
2018-06-13 14:00:50 -07:00
James R. Barlow
45cb4525cf
Remove other references to PyMuPDF
2018-06-13 01:02:53 -07:00
James R. Barlow
8c84c515b6
Use Ghostscript for text region detection
...
Ghostscript txtwrite seems to be quite effective at the task.
Eliminates dependency on fitz
2018-06-13 00:58:09 -07:00
James R. Barlow
1dfbbdebf4
Adjust for pikepdf API change
v7.0.0rc2
2018-06-08 22:47:56 -07:00
James R. Barlow
740918daee
Create debug envvar to override Creator or Producer
...
Note that Ghostscript always overrides Producer
2018-06-06 23:17:28 -07:00
jbarlow83
1d10eac764
Add wiki link to issue template
...
[ci skip]
2018-06-06 12:59:59 -07:00
jbarlow83
3f868118cd
Remove gpg
...
[ci skip]
2018-06-06 12:58:02 -07:00
James R. Barlow
04d79b15b4
optimize: fix error in Py3.5
v7.0.0rc1
2018-06-06 12:25:32 -07:00
James R. Barlow
a13c398c06
Suppress some spurious tesseract errors
2018-06-05 23:26:28 -07:00
James R. Barlow
e3b3f716ee
optimize: use tempdir for cmdline invocation
2018-06-05 21:20:54 -07:00
James R. Barlow
cf43c06f46
Use python-xmp-toolkit for xmp check
...
Eliminates PyPDF2 and defusedxml as dependencies.
2018-05-29 22:00:52 -07:00
James R. Barlow
74a5a18607
Tweak release notes
v7.0.0b4
2018-05-28 14:52:06 -07:00
James R. Barlow
44241c6dd5
Travis: remove deploy to testpypi since it's broken
2018-05-27 01:49:18 -07:00
James R. Barlow
8fff496ffd
Fix Py3.5 not understanding os.path.exists(Path(...))
v7.0.0b3
2018-05-26 22:55:22 -07:00
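Note on the Py3.5 fix above: `os.path.exists` only accepts path-like objects from Python 3.6 onward (PEP 519), so on 3.5 a `pathlib.Path` has to be converted to a string first. The sketch below shows that kind of workaround. The helper name `path_exists` is hypothetical and is not taken from the commit itself.

```python
import os
from pathlib import Path


def path_exists(path):
    """Return True if path exists, on Python 3.5 as well as 3.6+.

    Hypothetical helper for illustration; not code from the commit.
    """
    # str() turns a Path into a plain string, which os.path.exists
    # accepts on every Python 3 version.
    return os.path.exists(str(path))


if __name__ == "__main__":
    print(path_exists(Path(".")))
```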
James R. Barlow
edf75c519c
Update v7 release notes
2018-05-26 02:08:49 -07:00
James R. Barlow
9608b22d34
Remove all uses of PyPDF2 except PDF/A check
...
Leave PDF/A check alone for now, since pikepdf has no equivalent.
2018-05-26 02:07:18 -07:00
James R. Barlow
8ba4968c48
pdfinfo: more robustness
2018-05-26 01:54:25 -07:00
James R. Barlow
ffdd78f1a5
pdfinfo: Fix text_operators type not changed in related commit
2018-05-25 02:10:39 -07:00
James R. Barlow
ad9f8ca78e
pdfinfo: reinstate stack normalization for q/Q
2018-05-25 01:28:26 -07:00
James R. Barlow
78a686ecb4
Consider qpdf behavior on algo4 a pass
...
qpdf opens files with null user password, so do the same.
2018-05-25 00:33:31 -07:00
James R. Barlow
59e786eb3c
Remove old code to deal with single page only things
2018-05-25 00:32:55 -07:00
James R. Barlow
6d0461435f
Use OperandGrouper whitelist
2018-05-24 22:52:33 -07:00
James R. Barlow
0a04a60f69
Document need for pdfinfo to be pickleable
2018-05-24 22:24:13 -07:00
James R. Barlow
68d8642988
Found out this test was extremely slow - no reason to actually use a large file
2018-05-24 22:22:51 -07:00
James R. Barlow
16f70ff054
Main changeset for pikepdf-based refactor of pdfinfo
2018-05-24 22:22:01 -07:00
James R. Barlow
c00aeafff0
Add scratch file
2018-05-24 22:20:15 -07:00
James R. Barlow
83f35e00f3
Start removing PyPDF2
2018-05-21 01:28:21 -07:00
James R. Barlow
786a2ad65a
Make optimize test do a little more
2018-05-18 17:50:39 -07:00
James R. Barlow
9425506c2a
Use pikepdf to handle paletted images
...
Removes all use of PyMuPDF in optimize
2018-05-18 17:44:29 -07:00
James R. Barlow
93b858afd1
Remove qpdf appimage support for now, check for pngquant
2018-05-18 16:24:33 -07:00
James R. Barlow
7b0a3ec365
Add notes for v7
v7.0.0b2
2018-05-18 00:20:45 -07:00
James R. Barlow
083d442529
main: wording change
2018-05-18 00:20:24 -07:00
James R. Barlow
b52eb95cf8
optimize: use pikepdf to save PIL images
...
Eliminates another usage of PyMuPDF in the main path.
2018-05-18 00:18:44 -07:00
James R. Barlow
f4571e2508
Ensure we try to compress anything that's not compressed when saving
2018-05-17 22:05:01 -07:00
James R. Barlow
b06ef03aac
pipeline: use the resolution of the OCR image rather than recalculating
...
(Recalculating would fail if the image is not centered.)
2018-05-17 16:51:53 -07:00
James R. Barlow
1d1962a106
weave: fix rescaling logic
...
rotation % 90 == 0 is always true.
2018-05-17 16:50:01 -07:00
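Note on the weave fix above: PDF page rotations are multiples of 90 degrees, so `rotation % 90 == 0` holds for every valid value and cannot tell a quarter turn from a half turn. The sketch below checks that claim. The alternative test `rotation % 180 == 90` (true only when width and height swap) is an assumption about what such a check might be meant to detect; the commit listing does not show the corrected code.

```python
# Rotation values a PDF page may carry (multiples of 90 degrees).
VALID_ROTATIONS = (0, 90, 180, 270)

# The condition named in the commit message: true for every valid rotation,
# so it carries no information.
always_true = [r % 90 == 0 for r in VALID_ROTATIONS]

# Assumed alternative (not from the commit): true only for quarter turns,
# i.e. when the page's width and height are exchanged.
swaps_axes = [r % 180 == 90 for r in VALID_ROTATIONS]

if __name__ == "__main__":
    print(always_true)
    print(swaps_axes)
```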
James R. Barlow
4b98e9ff08
weave: if we don't have textonly_pdf, delete instruction to draw image
2018-05-17 16:49:20 -07:00
James R. Barlow
f83ca5d8ac
weave: whitespace
2018-05-17 16:06:36 -07:00
James R. Barlow
95cb4d22d7
pipeline: make /Info from indirect object as required
2018-05-17 16:06:13 -07:00
James R. Barlow
0c279b01a4
Fix test failure on missing JobContext
v7.0.0b1
2018-05-17 01:16:58 -07:00
James R. Barlow
3b820ffa7b
test_metadata: change from xfail to skipif without fitz
2018-05-17 00:14:57 -07:00
James R. Barlow
35cb416563
pipeline: remove fitz-based attempt to repair table of contents
...
Prior to unsplit, if we were rebuilding the PDF we'd lose the
table of contents. With unsplit we keep the original file and patch
the table of contents as necessary, and that works fine.
This remaining bit of code from PyMuPDF actually damages the
table of contents and removing it fixes the test suite. G'bye.
2018-05-16 23:24:57 -07:00
James R. Barlow
cdb737259c
pipeline: remove old page merge strategies
2018-05-16 22:16:54 -07:00
James R. Barlow
0843b5939c
pipeline: Move weave* to its own file
2018-05-16 22:08:31 -07:00
James R. Barlow
2b5f23a2d1
Add code to repair ToC with pikepdf
2018-05-16 21:39:23 -07:00
James R. Barlow
5e20d1d554
metadata: Fix failing test on __getitem__['/CreationDate']
2018-05-16 13:46:07 -07:00
James R. Barlow
18595ca86a
Use pikepdf for get_pdfmark
...
It does fine.
2018-05-16 12:24:35 -07:00