2676 Commits

Author SHA1 Message Date
James R. Barlow
8f5c95f0f4
Remove last vestiges of command line usage of qpdf - change to check_pdf 2020-04-26 05:33:26 -07:00
James R. Barlow
168fc60774
Update release notes with v10 changes 2020-04-26 05:14:59 -07:00
James R. Barlow
c84d0f606d
ghostscript: remove deprecated argument from generate_pdfa 2020-04-26 05:11:11 -07:00
James R. Barlow
8b54ce338f
setup: remove deprecated message about removal of --force parameter 2020-04-26 05:09:42 -07:00
James R. Barlow
18c4aa10bf
Adjust number of workers for concurrent page scanning 2020-04-26 04:21:15 -07:00
James R. Barlow
991db17fde
Remove Ghostscript-based text extraction
While faster than Python based methods, we've outgrown the limited
amount of information Ghostscript provides with this feature, and it
repeats an analysis we have to do anyway to learn what images are
present.
2020-04-26 04:02:07 -07:00
James R. Barlow
2c07515907 macOS - use spawn for multiprocessing
See bpo-33725. This is the default for 3.8, opt-in for 3.7 and older.
2020-04-26 03:49:40 -07:00
James R. Barlow
27a3b80376 Use once-per-worker pikepdf init 2020-04-26 03:49:20 -07:00
James R. Barlow
8c381a0227 Replace task_initargs with use of partial() 2020-04-26 03:49:20 -07:00
James R. Barlow
86145a8c76 Something wrong with forking worker_pdf, just open it once per page for now 2020-04-26 03:49:20 -07:00
James R. Barlow
7513f5425c Fix some broken tests 2020-04-26 03:49:20 -07:00
James R. Barlow
af3c3c6466 Further refactoring of concurrency concerns 2020-04-26 03:49:20 -07:00
James R. Barlow
db3e75e33e Refactor multiprocessing pool 2020-04-26 03:49:13 -07:00
James R. Barlow
ce49fc26dd Do pikepdf.open() once instead of per worker 2020-04-26 03:42:13 -07:00
James R. Barlow
d0d0a98dca First cut at concurrent page scan
Improvement appears on 168 page file. Needs refactoring
2020-04-26 03:42:13 -07:00
James R. Barlow
3834d1a0bf
azure: use brew python instead 2020-04-26 00:58:38 -07:00
James R. Barlow
33e982b3fd
azure: add certifi, openssl for macOS 2020-04-26 00:37:14 -07:00
James R. Barlow
43d650e78c
Fix issue where only first PNG-style image would be optimized 2020-04-25 03:50:11 -07:00
James R. Barlow
b4c65c5781
Update requirements 2020-04-25 03:49:34 -07:00
James R. Barlow
d96867e6ab watcher: add polling and log level adjustment 2020-04-24 04:14:44 -07:00
James R. Barlow
0a5108e704 install: clarify that old ocrmypdf should be removed from Ubuntu 18.04
Closes #526
2020-04-24 04:14:19 -07:00
James R. Barlow
94c52a6fa3
Refactor 'xyres' into Resolution 2020-04-24 04:12:05 -07:00
James R. Barlow
57771f06a3
Refactor xy-pair for resolution to tuple 2020-04-16 15:38:33 -07:00
James R. Barlow
58abb5785c
pytest picky about list vs tuple (tag: v9.7.2) 2020-04-15 03:16:51 -07:00
James R. Barlow
509e75eaff
v9.7.2 release notes 2020-04-15 02:56:46 -07:00
James R. Barlow
0c50eedb2a Support pdfminer.six 20200402 2020-04-15 02:55:22 -07:00
James R. Barlow
4581027246 Drop support for pdfminer.six 20181108
This version required a patch that has since been mainlined, and also did not
declare its dependency on chardet correctly. We can remove both hacks now.
2020-04-15 02:50:36 -07:00
James R. Barlow
31b5f63f85 hocrtransform: cleanup/PEP8
Some API breaking changes.
2020-04-15 02:48:56 -07:00
James R. Barlow
957fb1494e
pytest picky about list vs tuple 2020-04-15 02:26:20 -07:00
James R. Barlow
9e3e4f2687
Improve help text about aborting due to text 2020-04-15 02:17:55 -07:00
James R. Barlow
2155bcacb4
Loosen test language requirements - eng/deu 2020-04-15 00:30:38 -07:00
James R. Barlow
346da95899 Suppress loglevel since we have color now 2020-04-15 00:09:36 -07:00
James R. Barlow
f4f7946a0c Add colored logs 2020-04-15 00:05:38 -07:00
James R. Barlow
c2919f2e1c Reinstate logging of page numbers 2020-04-15 00:05:23 -07:00
James R. Barlow
a63d624052 Improve logging of subprocess output 2020-04-15 00:04:43 -07:00
James R. Barlow
af91489376 Remove safe_symlink log= warning 2020-04-14 23:59:33 -07:00
James R. Barlow
d146d2b65c The Great Logging Refactor
Remove all instances of logger object being passed as parameters.
This was a holdover from ruffus, and complicated a lot of simple things.
2020-04-14 23:59:33 -07:00
James R. Barlow
4ff4ed24a8 Refactor Windows executable shims 2020-04-14 23:59:33 -07:00
James R. Barlow
c38ff90081 Merge branch 'master' of github.com:jbarlow83/OCRmyPDF 2020-04-14 23:55:01 -07:00
James R. Barlow
4c029e973f
Fix isinstance(..,str) 2020-04-14 23:53:52 -07:00
Lars K.W. Gohlke
21cf9029e8
docs: Set ownership when using docker image (#518) 2020-04-14 23:32:01 -07:00
James R. Barlow
4a640b8dcd
Fix language argument not working as list
Fixes #523
2020-04-14 23:18:52 -07:00
James R. Barlow
9471bc8921
Fix versions with leading v, e.g. v5.0 (tag: v9.7.1) 2020-04-10 13:42:33 -07:00
James R. Barlow
7fe06c64fc v9.7.1 release notes 2020-04-10 13:00:19 -07:00
James R. Barlow
d13d70fd56 Fix version checker failing for qpdf 10.0.0
Fixes #527
2020-04-10 13:00:19 -07:00
James R. Barlow
58ec56180a Add a few more type annotations to public APIs 2020-04-10 13:00:19 -07:00
James R. Barlow
32a88f1bad docs: warn that AWS Lambda doesn't work 2020-04-10 13:00:19 -07:00
James R. Barlow
99ef42940c docs: warn that Windows users should use an ifmain guard 2020-04-10 13:00:19 -07:00
jbarlow83
c152710617 Update issue templates 2020-04-04 15:41:53 -07:00
James R. Barlow
8de0f9b86f
v9.7.0 release notes (tag: v9.7.0) 2020-03-29 22:45:25 -07:00
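
Several of the concurrency commits listed above (2c07515907 "use spawn for multiprocessing", 8c381a0227 "Replace task_initargs with use of partial()", and 99ef42940c on the ifmain guard for Windows) describe patterns that may be easier to follow with a small example. The following is a minimal Python sketch of those three ideas together, not OCRmyPDF's actual code: the names scan_page, make_task and scan_all, and the dpi parameter, are hypothetical and chosen only for illustration.

```python
import multiprocessing as mp
from functools import partial


def scan_page(page_number, dpi):
    """Hypothetical per-page task; a real one would inspect the page."""
    return page_number, dpi


def make_task(dpi):
    # Bind the shared argument once with partial() rather than passing
    # it through pool initializer arguments. The partial wraps a
    # module-level function, so it can be pickled and sent to workers.
    return partial(scan_page, dpi=dpi)


def scan_all(page_count, dpi=300, workers=2):
    # Request the "spawn" start method explicitly. Per the commit
    # message, it is the macOS default from Python 3.8 and opt-in for
    # 3.7 and older.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(make_task(dpi), range(page_count))


if __name__ == "__main__":
    # With "spawn" (always the case on Windows), each worker re-imports
    # this module. Without the guard, every worker would try to start
    # its own pool on import.
    print(scan_all(3))
```

Because spawned workers import the main module afresh, anything that starts processes has to sit behind the `if __name__ == "__main__":` guard, which is the situation the Windows documentation warning refers to.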