2895 Commits

Author SHA1 Message Date
James R. Barlow
eeae6f8292 test: Add syntax checks for shell completions 2019-07-02 13:49:17 -07:00
James R. Barlow
4dab299619 Fix parameterization of --verbose 2019-07-02 13:27:07 -07:00
James R. Barlow
340e2bbac6 Drop --mask-barcodes from completions 2019-07-02 13:10:05 -07:00
James R. Barlow
187283192b Fix reporting output file size skipped
Due to change to using finally for clean up
2019-06-30 15:08:10 -07:00
James R. Barlow
f855bdd36b Docker: Ubuntu image should be manylinux1 compatible (tag: v9.0.0b1) 2019-06-24 01:32:42 -07:00
James R. Barlow
9873d51f58 release notes: add next 2019-06-23 16:54:53 -07:00
James R. Barlow
11a57c7a17 Drop --mask-barcodes feature 2019-06-23 16:54:43 -07:00
James R. Barlow
8aa678859d Use pandoc to rewrite .rst files
Fixes all of the long lines, mainly.
2019-06-22 17:29:26 -07:00
James R. Barlow
1beb7dfd37 helpers: don't expect psutil will be installed
It's not in stdlib
2019-06-22 02:36:06 -07:00
James R. Barlow
9b60d3e285 Improve testing of _validation.py 2019-06-22 02:33:04 -07:00
James R. Barlow
3331a686fa Fix tess_threads clamped to 1 2019-06-22 00:59:33 -07:00
James R. Barlow
c32ea3b374 If a page has vector content, promote to full color 2019-06-22 00:59:04 -07:00
James R. Barlow
c357d4146e Restructure ocrmypdf.pdfinfo 2019-06-20 03:10:41 -07:00
James R. Barlow
f47cb2fade docs: update ocrmypdf.ocrmypdf to .run 2019-06-20 02:45:14 -07:00
James R. Barlow
9c4b1aeb8d docs: plugin; renaming 2019-06-20 02:44:54 -07:00
James R. Barlow
51ed381bfc Rename weave -> graft 2019-06-13 01:16:56 -07:00
James R. Barlow
5ee45411c9 Decide on OMP_THREAD_LIMIT more intelligently 2019-06-13 01:02:07 -07:00
James R. Barlow
16990890d8 Remove "from ocrmypdf import ocrmypdf"
Messes up future imports from ocrmypdf, so don't do it.
2019-06-12 17:52:25 -07:00
James R. Barlow
cfb11559d5 logging: capture warnings too 2019-06-12 17:28:02 -07:00
James R. Barlow
8b8de7cc1d Add new --pages feature to limit OCR to only specific pages 2019-06-12 17:27:47 -07:00
James R. Barlow
aba293fd80 Change "Temporary working files" output message 2019-06-12 13:56:02 -07:00
James R. Barlow
066a293462 If verbose, print stacktrace on KeyboardInterrupt 2019-06-12 13:55:43 -07:00
James R. Barlow
0bbd6885e2 Make the go/no-go decision pluggable (tag: v8.4.0b1) 2019-06-06 23:07:46 -07:00
James R. Barlow
5dd10c961c Docker: prefer streaming 2019-06-05 03:14:36 -07:00
James R. Barlow
81fc95556c Add progress bar for PdfInfo step 2019-06-05 03:08:04 -07:00
James R. Barlow
20ad032977 Fix some error messages that printed directly to sys.stderr instead of logging 2019-06-05 03:07:48 -07:00
James R. Barlow
93f1b73579 Fix --remove-vectors which was broken in API migration
It got dropped during the change. This feature has also been altered so that
the final visual appearance of the file is not affected, only the OCR image.
2019-06-05 02:04:45 -07:00
James R. Barlow
fd427a8ec1 plugins: replace path manipulation 2019-06-05 01:46:56 -07:00
James R. Barlow
9444cf357b optimize: add divide by zero check 2019-06-04 02:01:53 -07:00
James R. Barlow
5ab69153ee Fix .coveragerc 2019-06-03 02:26:49 -07:00
James R. Barlow
eb5200d26a Change most tests to use ocrmypdf API instead of subprocess
The main benefit of this is code coverage gains can actually follow it.
Also removes most ugly os.environ hacks.
2019-06-03 01:45:27 -07:00
James R. Barlow
98a3fda1f5 Drop support for Tesseract 4 alpha releases without textonly_pdf (mostly)
hocr renderer can still be used
2019-06-03 01:39:41 -07:00
James R. Barlow
e73740ae9d test: remove test code that supports tess3 or tess4 testing 2019-06-03 01:33:24 -07:00
James R. Barlow
fb933edc0f Use newer pytest tmp_path API 2019-06-01 01:55:51 -07:00
James R. Barlow
ba41ccae1b conftest: don't modify PYTEST_CURRENT_TEST when manipulating os.environ
It confuses pytest.
2019-06-01 01:41:39 -07:00
James R. Barlow
df9e286e9c Make bypassed exception clearer 2019-06-01 01:35:15 -07:00
James R. Barlow
b9d6e46572 shutil.rmtree: use builtin error suppression 2019-05-31 15:12:46 -07:00
James R. Barlow
8347c0d662 validation: remove dead code check_input_file 2019-05-31 01:57:08 -07:00
James R. Barlow
45a361d112 Add option to use threads instead of processes
Mainly since they are more convenient for debugging
2019-05-31 01:56:16 -07:00
James R. Barlow
522e1e948b ghostscript: don't use threads= for generate_pdfa
Not supported for pdfwrite
2019-05-31 01:55:29 -07:00
James R. Barlow
8ed4e229f3 ghostscript: avoid log=None construct 2019-05-30 13:57:38 -07:00
James R. Barlow
db29cae177 Docker docs: Remove legacy images, revive Ubuntu 2019-05-28 21:36:45 -07:00
James R. Barlow
d5b6cbb95e Update Ubuntu dockerfile 2019-05-28 15:36:50 -07:00
James R. Barlow
396c39978a Reorganize .docker folder so we don't have to rebuild as much 2019-05-28 14:18:54 -07:00
James R. Barlow
9d5f23e961 Rename filters to plugins 2019-05-28 02:39:25 -07:00
James R. Barlow
26a6232e1c Ignore DSStore 2019-05-28 02:33:35 -07:00
James R. Barlow
7566d4b768 Introduce plugins/filters 2019-05-27 16:55:04 -07:00
James R. Barlow
5c4c32ab3c Remove multiprocessing tests - no longer valid 2019-05-27 12:07:20 -07:00
James R. Barlow
692f7b3151 Dockerfile: with newer pip
Newer pip seems to install ocrmypdf-*.dist-info and has no problem reporting
the installed version, unlike -egg-info, so skip copying.

Also move WORKDIR
2019-05-26 04:31:53 -07:00
James R. Barlow
8d0958d7ea Dockerfile: qpdf-dev needs to be requested explicitly 2019-05-26 04:30:34 -07:00