191 Commits

Author SHA1 Message Date
James R. Barlow
f34b3015b2 Prevent Ghostscript from generating invalid XMP metadata
If DocumentInfo contains NULs, Ghostscript will generate XMP with
NULs, which is not allowed. Repair DocumentInfo before Ghostscript sees it.
2019-01-04 13:20:41 -08:00
James R. Barlow
089ece2715 use pikepdf 0.10.2 2019-01-03 12:08:43 -08:00
James R. Barlow
7d330afd81 Delinting 2019-01-02 13:34:45 -08:00
James R. Barlow
68fbd9fcc9 pikepdf: version bump 2018-12-31 15:37:31 -08:00
James R. Barlow
c771938907 Convert to f-strings where it makes sense 2018-12-31 15:01:19 -08:00
James R. Barlow
8c0009c5c8 Make pdfminer.six optional
Mainly since the current release of pdfminer.six lacks an sdist, blocking
homebrew packaging. Also in case other distros don't accept pdfminer.six.
2018-12-31 01:08:43 -08:00
James R. Barlow
8b90c45437 Drop support for Tesseract 3 2018-12-30 00:47:12 -08:00
James R. Barlow
72b920eb16 Drop support for Python 3.5 2018-12-30 00:23:26 -08:00
James R. Barlow
ab632f57cd v7.4.0 release notes 2018-12-15 15:27:23 -08:00
James R. Barlow
b973208137 Require pikepdf 0.9.1 2018-12-15 14:23:10 -08:00
James R. Barlow
5a7a8e573b Require pikepdf 0.9.0 2018-12-14 23:06:57 -08:00
James R. Barlow
632dab2cc0 Replace Ghostscript DOCINFO and fix 9.25 metadata date regression
We no longer use Ghostscript to manage PDF metadata, instead
omitting the DOCINFO segment from the pdfmark file we generate.

Instead all of the relevant metadata code has been migrated to pikepdf,
and we use that API. This should be more consistent and fixes the
Ghostscript version-dependent quirks.

Also removes our python-xmp-toolkit dependency, except for
testing.
2018-12-13 18:13:30 -08:00
James R. Barlow
9593aa4fb9 Merge v7.3.0 development 2018-11-11 01:38:42 -08:00
James R. Barlow
0f5c484b62 Travis: only need to specify chardet because we use pip install --no-deps 2018-11-10 13:57:04 -08:00
James R. Barlow
755b5d87e3 Add missing chardet, implied by pdfminer.six? 2018-11-10 01:50:51 -08:00
James R. Barlow
eed0424390 Update requirements 2018-11-10 00:56:04 -08:00
James R. Barlow
600d31a907 Require pikepdf 0.3.7 2018-10-30 16:22:05 -07:00
James R. Barlow
05aa43c856 Require pdfminer 2018-10-29 12:45:15 -07:00
Stefan Weil
a873278c2a Fix some recommendations from LGTM (#309)
* Fix unreachable code

This fixes an issue reported by LGTM.

Signed-off-by: Stefan Weil <sw@weilnetz.de>

* Remove unused imports

This fixes several recommendations from LGTM.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-28 13:59:58 -07:00
James R. Barlow
f5807a2053 Require pikepdf 0.3.5 2018-10-21 21:37:15 -07:00
James R. Barlow
5650eba848 Cleanup MANIFEST.in, reorg requirements/*.txt, fix non-Unicode readme 2018-10-10 23:53:08 -07:00
James R. Barlow
29116e1dec Change to README.md 2018-09-19 21:01:24 -07:00
James R. Barlow
eaa324939f Upgrade to pikepdf 0.3.3
Closes #231
2018-09-19 15:30:54 -07:00
James R. Barlow
e0599fe8d7 Require pikepdf 0.3.2 2018-08-24 12:41:43 -07:00
James R. Barlow
6decdaa062 Try setuptools_scm_git_archive again 2018-08-20 15:45:51 -07:00
James R. Barlow
cf9a8a91b5 Require pikepdf 0.3.1 2018-08-10 16:59:08 -07:00
James R. Barlow
4181a712d1 Update setup.py for version changes 2018-08-01 15:17:49 -07:00
James R. Barlow
11fbd32e6e Merge v6.2.2 (mainly to get release notes) 2018-07-13 12:38:15 -07:00
James R. Barlow
4650074428 Cherrypick Python 3.7 documentation updates from v7.0.0
From b0eacd6
2018-07-12 02:45:51 -07:00
James R. Barlow
70aa644c10 Backport Python 3.7 fix for ruffus 2.7.0 from ocrmypdf v7.0.0 2018-07-12 02:45:51 -07:00
James R. Barlow
d6eb1f9578 Remove dependency on private fork of ruffus, change to official 2.7 2018-07-09 12:51:56 -07:00
James R. Barlow
5f99f7f6ca Upgrade to Py3.7 locally and resolve a few issues 2018-07-02 23:47:51 -07:00
James R. Barlow
7200623007 Fix installation for Python 3.7
Need to use private fork of ruffus for Python 3.7. Backward compatible with Python 3.6 for ruffus 2.6.3

Disable locale checking for 3.7 since the various fixes in that release should make it unnecessary.
2018-07-02 16:47:14 -07:00
James R. Barlow
b0eacd6586 Add Python 3.7 support 2018-06-28 13:57:45 -07:00
James R. Barlow
bf214eecb3 Use newer pikepdf API for objgen 2018-06-28 12:59:01 -07:00
James R. Barlow
b9dc109892 optimize: use new pikepdf api for objgen 2018-06-24 00:16:28 -07:00
James R. Barlow
8c84c515b6 Use Ghostscript for text region detection
Ghostscript txtwrite seems to be quite effective at the task.

Eliminates dependency on fitz
2018-06-13 00:58:09 -07:00
James R. Barlow
1dfbbdebf4 Adjust for pikepdf API change 2018-06-08 22:47:56 -07:00
James R. Barlow
cf43c06f46 Use python-xmp-toolkit for xmp check
Eliminates PyPDF2 and defusedxml as dependencies.
2018-05-29 22:00:52 -07:00
James R. Barlow
93b858afd1 Remove qpdf appimage support for now, check for pngquant 2018-05-18 16:24:33 -07:00
James R. Barlow
d5fb275e9e Travis: hack in qpdf appimage version
qpdf from appimage does not report its version with --version if renamed
or accessed via symlink. Use an environment variable to supply it
where needed.
2018-05-15 17:45:58 -07:00
James R. Barlow
96e453feb6 Travis: Tweak setup so it can run 2018-05-13 01:21:24 -07:00
James R. Barlow
b6d30214fd PyMuPDF 1.13.4 looks good, use it 2018-05-12 12:35:46 -07:00
James R. Barlow
f00183115d Update our dependencies 2018-05-11 02:11:55 -07:00
James R. Barlow
601863f9e9 Return to PyMuPDF 1.12.5 2018-05-10 18:47:10 -07:00
James R. Barlow
63032d304d Revert "Since PyMuPDF 1.13.3 corrupts text, pin 1.12.5 and work around it"
This reverts commit b0ce7c63dd27257d9c979fde9013243b8ae38c98.
2018-05-10 16:27:17 -07:00
James R. Barlow
b0ce7c63dd Since PyMuPDF 1.13.3 corrupts text, pin 1.12.5 and work around it 2018-05-10 16:10:24 -07:00
James R. Barlow
001c8d7678 Upgrade PyMuPDF version 2018-05-07 16:24:26 -07:00
James R. Barlow
fedbbdb575 Travis: compile qpdf from source
The older version in Travis's Ubuntu 14.04 can't pass the test suite anymore.
2018-04-11 15:40:45 -07:00
James R. Barlow
85ebba72bc Fix setup.py syntax 2018-04-10 18:30:48 -07:00