3031 Commits

Author SHA1 Message Date
James R. Barlow
0885799010 Update docs for conda
Closes #743
2021-03-03 00:43:45 -08:00
James R. Barlow
8ffc99f648 optimize: log errors more loudly 2021-03-03 00:43:40 -08:00
James R. Barlow
2261c51eff
Reactivate pngquant on windows v11.7.0 2021-02-26 01:19:03 -08:00
James R. Barlow
5c470778a3
v11.7.0 release notes 2021-02-26 00:29:52 -08:00
James R. Barlow
4124889f36
Don't generate PDF/A-1b with object streams
Acrobat insists that PDF/A-1b should not have object streams.
Other programs like veraPDF disagree with this restriction, but
we can accommodate Acrobat so we will.

Also add more tests around this.
2021-02-26 00:23:57 -08:00
James R. Barlow
a23c22b0e8
helpers: tidy check_pdf 2021-02-25 22:51:53 -08:00
James R. Barlow
dd1f5f7215
pyproject: black doesn't like py39 yet 2021-02-25 16:10:20 -08:00
Dima Kuznetsov
5e2206bae7
Allow --sidecar along with --pages (#735) 2021-02-19 16:55:35 -08:00
James R. Barlow
079ee86d43
pyproject: also target py39 2021-02-18 01:48:56 -08:00
James R. Barlow
3692868004
v11.6.2 release notes v11.6.2 2021-02-15 01:48:14 -08:00
James R. Barlow
064f935699
Fix page rotation regression
Page size fixes in commit b26749 accounted for a "kept" rotation,
but not a corrected rotation.

Fixes #730.
2021-02-15 01:47:09 -08:00
James R. Barlow
8770fff968
tests: remove unreliable/incomplete test 2021-02-15 01:05:08 -08:00
James R. Barlow
82de78b6b0
v11.6.1 release notes v11.6.1 2021-02-14 01:51:26 -08:00
James R. Barlow
2a52c6dec2 optimize: skip images with unusually small dimensions
They're unlikely to be handled well by our recompressors. It seems
that JBIG2 cannot handle very small widths.

Fixes #732
2021-02-14 01:43:25 -08:00
James R. Barlow
2898879be7 docker-compose: fix typo 2021-02-14 01:43:06 -08:00
James R. Barlow
18e613657c
docker-compose: fix typo 2021-02-14 01:23:01 -08:00
James R. Barlow
a48ca556c7
Add filter_pdf_page hook 2021-02-14 01:22:33 -08:00
James R. Barlow
9cba738b48 Remove deprecated code 2021-01-31 19:27:59 -08:00
James R. Barlow
bccf2f423f Stricter parameter checking for many public functions 2021-01-31 19:27:25 -08:00
James R. Barlow
390fdf8c05 Package OCR in Form XObject
Should improve results in some situations where the initial content
stream is messy or not well-formed.
2021-01-31 19:27:25 -08:00
James R. Barlow
166de3086b Merge branch 'feature/colorstrategy' 2021-01-31 19:26:59 -08:00
James R. Barlow
206c675df6
docs: api 2021-01-31 19:26:35 -08:00
James R. Barlow
6c8f9223e9
Update awslambda to new pluginspec 2021-01-31 03:00:05 -08:00
James R. Barlow
85c6a974ca
Fix calls to hook.get_executor 2021-01-31 02:46:44 -08:00
James R. Barlow
dccdcfaa91
leptonica: tidy 2021-01-31 02:46:09 -08:00
James R. Barlow
b1da09f141 Add plugin for setting logging console
So that we are not tied to tqdm.
2021-01-31 02:29:11 -08:00
James R. Barlow
42c84531e4
optimize: rewrite JPEG optimize to avoid use of tqdm and parallelize
For some reason JPEG optimization was not done in parallel, and was
perhaps never done in parallel. Strange oversight.
2021-01-31 02:18:46 -08:00
James R. Barlow
a9ad805347
optimize: Remove shim for unsupported pikepdf version 2021-01-31 00:08:20 -08:00
James R. Barlow
16bda74974
Refactor - decouple progressbar from executor 2021-01-30 20:42:00 -08:00
James R. Barlow
d274d88929
Refactor to eliminate global state in _concurrent 2021-01-30 17:36:30 -08:00
James R. Barlow
327df5cbbc Use ColorConversionStrategy "LeaveColorUnchanged"
Faster, still produces PDF/A
2021-01-26 15:44:41 -08:00
James R. Barlow
46d0632fe2
v11.6.0 release notes v11.6.0 2021-01-26 01:47:49 -08:00
James R. Barlow
ef1e7a814e
Delinting 2021-01-26 01:45:04 -08:00
James R. Barlow
1084724937
docs: improve API docs 2021-01-26 01:40:40 -08:00
James R. Barlow
ecb0109d79
docs: fix rst formatting error 2021-01-26 01:29:45 -08:00
James R. Barlow
386cabff00
Make progress pool common rather than plugin-specific 2021-01-24 23:56:09 -08:00
James R. Barlow
3bd5054634 lambda: move to extra_plugins folder 2021-01-24 23:47:26 -08:00
James R. Barlow
6a8dd65aa2 lambda: more issues related to new executor semantics
Now all tests pass, except for:
- tests that check the progress bar
- tests where xdist may or may not load a _lambda_plugin by running
  some other test first before a test in optimize
2021-01-24 23:46:40 -08:00
James R. Barlow
6083b4f0a7 lambda: don't overrun number of workers needed 2021-01-24 23:46:40 -08:00
James R. Barlow
1a3ce59476 lambda: Don't be paranoid about exception marshalling
It works
2021-01-24 23:46:40 -08:00
James R. Barlow
c395436ba3 lambda: tidying, special casing use_threads 2021-01-24 23:46:40 -08:00
James R. Barlow
8d23d0b441 Operational lambda executor 2021-01-24 23:46:40 -08:00
James R. Barlow
c6a2716cdb Temporary move into package 2021-01-24 23:46:40 -08:00
James R. Barlow
5545bae76f lambda_plugin.py: doesn't work since entry point needs to be in package 2021-01-24 23:46:33 -08:00
James R. Barlow
7bccb8c748 tests: fix concurrency 2021-01-24 23:46:33 -08:00
James R. Barlow
173c0d1274 concurrency: lock progress pool
For API sanity and to communicate expectations. One progress pool at
a time is plenty of complexity.
2021-01-24 23:46:33 -08:00
James R. Barlow
6953f32465 pdfinfo: remove some messy concurrency handling
We can cut down on the use of global variables and save opening
an extra copy of the Pdf when threaded.
2021-01-24 23:46:33 -08:00
James R. Barlow
26b4d9bb4b Refactor concurrency so that it is pluggable
However, this may not be the best idea because it involves global
state that could be overridden by a parallel call to ocrmypdf.ocr.
2021-01-24 23:46:29 -08:00
James R. Barlow
34e564cd7d Use queue.Queue instead of multiprocessing.Queue in threaded mode 2021-01-24 23:45:26 -08:00
James R. Barlow
504d5776d2 Refactor plugin manager to eliminate callback 2021-01-24 23:42:40 -08:00