147 Commits

Author SHA1 Message Date
James R. Barlow
58c29ffb5c weave: use explicit pdf.close(), drastically reduce open file handles
With the new pikepdf 1.2.0 we no longer need to hold file handles
open because of the "copy to memory" functionality. We retain
the behavior of closing/reopening the output PDF every 100 pages as
a way to limit memory usage.
2019-04-18 15:12:48 -07:00
James R. Barlow
67a405c6b7 Move install-time external program checks out of setup.py
We already ran runtime tests for several of them anyway, and it's better to
check at runtime since the configuration may change after installation.
2019-03-03 03:26:56 -08:00
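A runtime check of the kind this commit describes can be sketched as follows. This is a minimal illustration and not the project's actual code: the function name `find_missing_programs` and the example program names are assumptions made for the sketch.

```python
import shutil


def find_missing_programs(programs):
    """Return the names in `programs` that cannot be found on PATH.

    Hypothetical helper: runs when the application starts rather than
    at install time, so it reflects the current system configuration.
    """
    return [name for name in programs if shutil.which(name) is None]


# Example (program names are illustrative assumptions):
# missing = find_missing_programs(["tesseract", "gs", "qpdf"])
# if missing:
#     raise SystemExit(f"Missing external programs: {', '.join(missing)}")
```

Checking at startup means a program installed or removed after `setup.py` ran is still detected correctly.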
James R. Barlow
602570fcf9 Update requirements 2019-03-03 02:27:56 -08:00
James R. Barlow
92c8a5885e Declare build system in pyproject.toml 2019-02-26 12:23:33 -08:00
James R. Barlow
b8cd3acd9e v8.0.1 notes 2019-01-17 00:57:28 -08:00
James R. Barlow
f472587d22 Bump pikepdf version, point to release notes 2019-01-05 16:48:13 -08:00
James R. Barlow
f34b3015b2 Prevent Ghostscript from generating invalid XMP metadata
If DocumentInfo contains NULs Ghostscript will generate XMP with
NULs which is not allowed. Repair DocumentInfo before Ghostscript sees it.
2019-01-04 13:20:41 -08:00
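The repair this commit describes amounts to removing NUL characters from metadata strings before they reach Ghostscript. A minimal sketch, assuming the DocumentInfo entries are held in a plain Python dict (the real code works on the PDF's own metadata object, and `strip_nuls` is a hypothetical name):

```python
def strip_nuls(docinfo):
    """Return a copy of a metadata mapping with NULs removed from string values.

    Assumption: `docinfo` is a plain dict standing in for a PDF's
    DocumentInfo dictionary. Non-string values are passed through unchanged.
    """
    return {
        key: value.replace("\x00", "") if isinstance(value, str) else value
        for key, value in docinfo.items()
    }


cleaned = strip_nuls({"/Title": "Sc\x00an", "/Author": "A. Writer"})
```

NUL is not a legal character in XML, which is why XMP generated from unrepaired values would be invalid.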
James R. Barlow
089ece2715 use pikepdf 0.10.2 2019-01-03 12:08:43 -08:00
James R. Barlow
7d330afd81 Delinting 2019-01-02 13:34:45 -08:00
James R. Barlow
68fbd9fcc9 pikepdf: version bump 2018-12-31 15:37:31 -08:00
James R. Barlow
c771938907 Convert to f-strings where it makes sense 2018-12-31 15:01:19 -08:00
James R. Barlow
8c0009c5c8 Make pdfminer.six optional
Mainly because the current release of pdfminer.six lacks an sdist, blocking
Homebrew packaging. Also in case other distros don't accept pdfminer.six.
2018-12-31 01:08:43 -08:00
James R. Barlow
8b90c45437 Drop support for Tesseract 3 2018-12-30 00:47:12 -08:00
James R. Barlow
72b920eb16 Drop support for Python 3.5 2018-12-30 00:23:26 -08:00
James R. Barlow
ab632f57cd v7.4.0 release notes 2018-12-15 15:27:23 -08:00
James R. Barlow
b973208137 Require pikepdf 0.9.1 2018-12-15 14:23:10 -08:00
James R. Barlow
5a7a8e573b Require pikepdf 0.9.0 2018-12-14 23:06:57 -08:00
James R. Barlow
632dab2cc0 Replace Ghostscript DOCINFO and fix 9.25 metadata date regression
We no longer use Ghostscript to manage PDF metadata, instead
omitting the DOCINFO segment from the pdfmark file we generate.

Instead all of the relevant metadata code has been migrated to pikepdf,
and we use that API. This should be more consistent and fixes the
Ghostscript version-dependent quirks.

Also removes our python-xmp-toolkit dependency, except for
testing.
2018-12-13 18:13:30 -08:00
James R. Barlow
9593aa4fb9 Merge v7.3.0 development 2018-11-11 01:38:42 -08:00
James R. Barlow
0f5c484b62 Travis: only need to specify chardet because we use pip install --no-deps 2018-11-10 13:57:04 -08:00
James R. Barlow
755b5d87e3 Add missing chardet, implied by pdfminer.six? 2018-11-10 01:50:51 -08:00
James R. Barlow
eed0424390 Update requirements 2018-11-10 00:56:04 -08:00
James R. Barlow
600d31a907 Require pikepdf 0.3.7 2018-10-30 16:22:05 -07:00
James R. Barlow
05aa43c856 Require pdfminer 2018-10-29 12:45:15 -07:00
Stefan Weil
a873278c2a Fix some recommendations from LGTM (#309)
* Fix unreachable code

This fixes an issue reported by LGTM.

Signed-off-by: Stefan Weil <sw@weilnetz.de>

* Remove unused imports

This fixes several recommendations from LGTM.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-28 13:59:58 -07:00
James R. Barlow
f5807a2053 Require pikepdf 0.3.5 2018-10-21 21:37:15 -07:00
James R. Barlow
5650eba848 Cleanup MANIFEST.in, reorg requirements/*.txt, fix non-Unicode readme 2018-10-10 23:53:08 -07:00
James R. Barlow
29116e1dec Change to README.md 2018-09-19 21:01:24 -07:00
James R. Barlow
eaa324939f Upgrade to pikepdf 0.3.3
Closes #231
2018-09-19 15:30:54 -07:00
James R. Barlow
e0599fe8d7 Require pikepdf 0.3.2 2018-08-24 12:41:43 -07:00
James R. Barlow
6decdaa062 Try setuptools_scm_git_archive again 2018-08-20 15:45:51 -07:00
James R. Barlow
cf9a8a91b5 Require pikepdf 0.3.1 2018-08-10 16:59:08 -07:00
James R. Barlow
4181a712d1 Update setup.py for version changes 2018-08-01 15:17:49 -07:00
James R. Barlow
11fbd32e6e Merge v6.2.2 (mainly to get release notes) 2018-07-13 12:38:15 -07:00
James R. Barlow
4650074428 Cherrypick Python 3.7 documentation updates from v7.0.0
From b0eacd6
2018-07-12 02:45:51 -07:00
James R. Barlow
70aa644c10 Backport Python 3.7 fix for ruffus 2.7.0 from ocrmypdf v7.0.0 2018-07-12 02:45:51 -07:00
James R. Barlow
d6eb1f9578 Remove dependency on private fork of ruffus, change to official 2.7 2018-07-09 12:51:56 -07:00
James R. Barlow
5f99f7f6ca Upgrade to Py3.7 locally and resolve a few issues 2018-07-02 23:47:51 -07:00
James R. Barlow
7200623007 Fix installation for Python 3.7
Need to use private fork of ruffus for Python 3.7. Backward compatible with Python 3.6 for ruffus 2.6.3

Disable locale checking for 3.7 since the various fixes in that release should make it unnecessary.
2018-07-02 16:47:14 -07:00
James R. Barlow
b0eacd6586 Add Python 3.7 support 2018-06-28 13:57:45 -07:00
James R. Barlow
bf214eecb3 Use newer pikepdf API for objgen 2018-06-28 12:59:01 -07:00
James R. Barlow
b9dc109892 optimize: use new pikepdf api for objgen 2018-06-24 00:16:28 -07:00
James R. Barlow
8c84c515b6 Use Ghostscript for text region detection
Ghostscript txtwrite seems to be quite effective at the task.

Eliminates dependency on fitz
2018-06-13 00:58:09 -07:00
James R. Barlow
1dfbbdebf4 Adjust for pikepdf API change 2018-06-08 22:47:56 -07:00
James R. Barlow
cf43c06f46 Use python-xmp-toolkit for xmp check
Eliminates PyPDF2 and defusedxml as dependencies.
2018-05-29 22:00:52 -07:00
James R. Barlow
93b858afd1 Remove qpdf appimage support for now, check for pngquant 2018-05-18 16:24:33 -07:00
James R. Barlow
d5fb275e9e Travis: hack in qpdf appimage version
qpdf from appimage does not report its version with --version if renamed
or accessed via symlink. Use an environment variable to supply it
where needed.
2018-05-15 17:45:58 -07:00
James R. Barlow
96e453feb6 Travis: Tweak setup so it can run 2018-05-13 01:21:24 -07:00
James R. Barlow
b6d30214fd PyMuPDF 1.13.4 looks good, use it 2018-05-12 12:35:46 -07:00
James R. Barlow
f00183115d Update our dependencies 2018-05-11 02:11:55 -07:00