James R. Barlow
8b8de7cc1d
Add new --pages feature to limit OCR to only specific pages
2019-06-12 17:27:47 -07:00
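As a rough illustration of what limiting OCR to specific pages involves, the sketch below parses a page-range string such as "1,3-5" into a set of page numbers. The function name `parse_page_ranges` and the accepted syntax are assumptions made for this example; they are not taken from the commit itself.

```python
from typing import Set


def parse_page_ranges(spec: str) -> Set[int]:
    """Parse a string like "1,3-5" into a set of 1-based page numbers.

    Hypothetical helper: the syntax accepted here is an assumption.
    """
    pages: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,2"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return pages
```

A caller could then skip OCR for any page whose number is not in the returned set.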
James R. Barlow
aba293fd80
Change "Temporary working files" output message
2019-06-12 13:56:02 -07:00
James R. Barlow
066a293462
If verbose, print stacktrace on KeyboardInterrupt
2019-06-12 13:55:43 -07:00
James R. Barlow
0bbd6885e2
Make the go/no-go decision pluggable
v8.4.0b1
2019-06-06 23:07:46 -07:00
James R. Barlow
5dd10c961c
Docker: prefer streaming
2019-06-05 03:14:36 -07:00
James R. Barlow
81fc95556c
Add progress bar for PdfInfo step
2019-06-05 03:08:04 -07:00
James R. Barlow
20ad032977
Fix some error messages that printed directly to sys.stderr instead of logging
2019-06-05 03:07:48 -07:00
James R. Barlow
93f1b73579
Fix --remove-vectors which was broken in API migration
It got dropped during the change. This feature has also been altered so that
the final visual appearance of the file is not affected, only the OCR image.
2019-06-05 02:04:45 -07:00
James R. Barlow
fd427a8ec1
plugins: replace path manipulation
2019-06-05 01:46:56 -07:00
James R. Barlow
9444cf357b
optimize: add divide by zero check
2019-06-04 02:01:53 -07:00
James R. Barlow
5ab69153ee
Fix .coveragerc
2019-06-03 02:26:49 -07:00
James R. Barlow
eb5200d26a
Change most tests to use ocrmypdf API instead of subprocess
The main benefit is that code coverage measurement can now follow the code under test.
Also removes most ugly os.environ hacks.
2019-06-03 01:45:27 -07:00
James R. Barlow
98a3fda1f5
Drop support for Tesseract 4 alpha releases without textonly_pdf (mostly)
hocr renderer can still be used
2019-06-03 01:39:41 -07:00
James R. Barlow
e73740ae9d
test: remove test code that supports tess3 or tess4 testing
2019-06-03 01:33:24 -07:00
James R. Barlow
fb933edc0f
Use newer pytest tmp_path API
2019-06-01 01:55:51 -07:00
James R. Barlow
ba41ccae1b
conftest: don't modify PYTEST_CURRENT_TEST when manipulating os.environ
It confuses pytest.
2019-06-01 01:41:39 -07:00
James R. Barlow
df9e286e9c
Make bypassed exception clearer
2019-06-01 01:35:15 -07:00
James R. Barlow
b9d6e46572
shutil.rmtree: use builtin error suppression
2019-05-31 15:12:46 -07:00
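The built-in error suppression referred to here is presumably the `ignore_errors` parameter of `shutil.rmtree`. A minimal sketch, using a throwaway temporary directory that is an assumption of this example:

```python
import os
import shutil
import tempfile

# Create a scratch directory with one file in it (example data only).
workdir = tempfile.mkdtemp(prefix="example-")
with open(os.path.join(workdir, "page.txt"), "w") as f:
    f.write("scratch data")

# ignore_errors=True makes rmtree swallow removal failures instead of raising,
# so no surrounding try/except is needed.
shutil.rmtree(workdir, ignore_errors=True)

# A second call on the now-missing directory does not raise either.
shutil.rmtree(workdir, ignore_errors=True)

removed = not os.path.exists(workdir)
```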
James R. Barlow
8347c0d662
validation: remove dead code check_input_file
2019-05-31 01:57:08 -07:00
James R. Barlow
45a361d112
Add option to use threads instead of processes
Mainly since they are more convenient for debugging
2019-05-31 01:56:16 -07:00
James R. Barlow
522e1e948b
ghostscript: don't use threads= for generate_pdfa
Not supported for pdfwrite
2019-05-31 01:55:29 -07:00
James R. Barlow
8ed4e229f3
ghostscript: avoid log=None construct
2019-05-30 13:57:38 -07:00
James R. Barlow
db29cae177
Docker docs: Remove legacy images, revive Ubuntu
2019-05-28 21:36:45 -07:00
James R. Barlow
d5b6cbb95e
Update Ubuntu dockerfile
2019-05-28 15:36:50 -07:00
James R. Barlow
396c39978a
Reorganize .docker folder so we don't have to rebuild as much
2019-05-28 14:18:54 -07:00
James R. Barlow
9d5f23e961
Rename filters to plugins
2019-05-28 02:39:25 -07:00
James R. Barlow
26a6232e1c
Ignore .DS_Store
2019-05-28 02:33:35 -07:00
James R. Barlow
7566d4b768
Introduce plugins/filters
2019-05-27 16:55:04 -07:00
James R. Barlow
5c4c32ab3c
Remove multiprocessing tests - no longer valid
2019-05-27 12:07:20 -07:00
James R. Barlow
692f7b3151
Dockerfile: with newer pip
Newer pip seems to install ocrmypdf-*.dist-info and has no problem reporting
installed version unlike -egg-info, so skip copying.
Also move WORKDIR
2019-05-26 04:31:53 -07:00
James R. Barlow
8d0958d7ea
Dockerfile: qpdf-dev needs to be requested explicitly
2019-05-26 04:30:34 -07:00
James R. Barlow
e9731b6bac
Docker: upgrade pip, temporarily enable community repository for qpdf
2019-05-26 04:00:24 -07:00
James R. Barlow
0628a89041
docs: mention how to use Docker image shell
2019-05-26 00:20:40 -07:00
James R. Barlow
c14f62752b
Tests: add an API test
2019-05-25 16:24:09 -07:00
James R. Barlow
24855045e1
Provisionally add filters
2019-05-25 16:23:39 -07:00
James R. Barlow
ed236e0c27
Begin API documentation
2019-05-24 01:05:32 -07:00
James R. Barlow
db6aa22eae
Progress bar: unit types
2019-05-23 02:00:47 -07:00
James R. Barlow
805aa776ad
Re-disable progress bar when not connected to tty
2019-05-23 02:00:35 -07:00
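One common way to decide whether a progress bar should be shown is to ask the output stream whether it is a terminal. The sketch below shows that check; the function name `progress_enabled` is hypothetical and the commit may implement the check differently.

```python
import io
import sys


def progress_enabled(stream=None) -> bool:
    """Return True only if the stream is attached to a terminal.

    Hypothetical helper: defaults to sys.stderr when no stream is given.
    """
    stream = sys.stderr if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    # Streams without isatty(), or whose isatty() is False, get no progress bar.
    return bool(isatty and isatty())
```

When output is piped to a file or captured by another program, `isatty()` is false and the progress bar stays off, which keeps control characters out of logs.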
James R. Barlow
d0efdf643c
Cleanup working files when done with a particular file, rather than end of process
2019-05-23 01:25:08 -07:00
James R. Barlow
22298b31be
Fix distinction between clean and clean_final lost in API refactor
2019-05-23 01:19:58 -07:00
James R. Barlow
5cecb3ecb4
Convert one test to use API
2019-05-22 23:53:48 -07:00
James R. Barlow
a139e64c67
api: short-circuit exception handler, as caller should provide their own
2019-05-22 18:30:30 -07:00
James R. Barlow
db69b4d11a
Improve argparse behavior for its role in making the API work
2019-05-22 15:55:48 -07:00
James R. Barlow
8bcb85720c
release notes: clarify
2019-05-22 15:34:23 -07:00
James R. Barlow
09ca1bee97
Add progress bar to optimize and add option to disable it
2019-05-22 15:31:48 -07:00
James R. Barlow
23dd77ce0f
api: fix progress_bar_friendly=False
2019-05-22 15:31:03 -07:00
James R. Barlow
32a076c039
Refactor validation and exceptions
CLI now tracks check_options exceptions. API now works more like an API,
without an exception handler, because the caller should provide one.
2019-05-20 18:01:17 -07:00
James R. Barlow
e4baa8c0dd
Remove sys.exit() calls so we don't terminate caller application
2019-05-20 15:08:20 -07:00
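The general pattern behind removing `sys.exit()` from library code can be sketched as follows: the API raises exceptions, and only the command-line wrapper turns them into exit codes. All names here (`ExampleError`, `run_api`, `main`) are hypothetical and do not reflect the project's actual classes or functions.

```python
import sys


class ExampleError(Exception):
    """Hypothetical error carrying the exit code the CLI should return."""

    exit_code = 2


def run_api(input_file: str) -> str:
    """Library entry point: raises on failure and never calls sys.exit()."""
    if not input_file.endswith(".pdf"):
        raise ExampleError(f"not a PDF: {input_file}")
    return f"processed {input_file}"


def main(argv) -> int:
    """CLI wrapper: the only place that maps exceptions to exit codes."""
    try:
        print(run_api(argv[0]))
    except ExampleError as e:
        print(e, file=sys.stderr)
        return e.exit_code
    return 0


# A real script would finish with: sys.exit(main(sys.argv[1:]))
```

An application that imports `run_api` directly can catch `ExampleError` itself and keep running, which is the point of keeping `sys.exit()` out of the library path.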
James R. Barlow
2fdaa76a0d
Refactor configure_logging
2019-05-20 14:54:34 -07:00
James R. Barlow
7ee0c52a57
Refactor cli into basic high level api
2019-05-19 22:34:45 -07:00