83 Commits

Author SHA1 Message Date
James R. Barlow
18b59c57b4
Refactor our tests that check if we are in a container 2024-10-27 11:55:22 -07:00
James R. Barlow
580252a1a0
Merge branch 'feature/gscan2pdf'
Reconcile release notes and copy_final() with new pipeline.
2023-10-30 00:01:28 -07:00
James R. Barlow
b5e73ac4e4
Drop check for obsolete .dockerinit file 2023-10-24 13:49:46 -07:00
James R. Barlow
e8ae370ceb
Eliminate api= kwarg and implicit creation of pluginmanager 2023-10-24 00:54:30 -07:00
James R. Barlow
9b8d14d16e
Accept most of ruff's delinting 2023-04-14 00:45:34 -07:00
James R. Barlow
f4155dca77
tests: convert all uses of multipage.pdf to fixture 2022-08-11 01:13:10 -07:00
James R. Barlow
80ed2117cc
Change to SPDX license tracking 2022-07-28 01:10:07 -07:00
James R. Barlow
dc6f1a266a
Modernize type annotations 2022-07-23 00:39:24 -07:00
James Barlow
776ada6713 Upgrade pre-commit and associated tools; various lints 2022-04-03 20:53:01 -07:00
James R. Barlow
13af3252ff tests: simplify run_ocrmypdf API 2021-12-06 17:00:25 -08:00
James R. Barlow
9de06f62ee Use Python executors instead of pools
ProcessPool/ThreadPool don't have the ability to notice when a child worker
was terminated. ProcessPoolExecutor and ThreadPoolExecutor do notice and
provide better error messages.

Add tests to check.
2021-12-06 15:38:27 -08:00
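The behaviour this commit message describes can be reproduced in isolation. The following is a minimal sketch, not code from the repository: `crash_worker` and `run_with_executor` are hypothetical names, and the "fork" start method is an assumption made to keep the example self-contained (POSIX only).

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def crash_worker() -> None:
    """Hypothetical worker that dies abruptly, as if it had been killed."""
    os._exit(1)


def run_with_executor() -> str:
    # Assumption: use the "fork" start method so the sketch does not depend
    # on how the script is launched. This is not available on Windows.
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
        future = executor.submit(crash_worker)
        try:
            future.result(timeout=30)
        except BrokenProcessPool:
            # The executor notices the dead worker and fails the pending future.
            return "worker death detected"
    return "worker finished normally"


if __name__ == "__main__":
    print(run_with_executor())
```

With `multiprocessing.Pool`, a task whose worker was killed this way may never complete, which is the gap the commit message refers to.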
James R. Barlow
8fdcb15b4e tests: improve typing and remove some legacy code 2021-12-06 15:38:27 -08:00
James R. Barlow
380b981763 Remove most Python 3.6 special casing 2021-11-13 00:27:48 -08:00
James R. Barlow
790d3022f6 Implement --output-type=none to skip producing the PDF and use only the sidecar
Closes #787
2021-09-26 01:07:34 -07:00
James R. Barlow
906d77b389
tests: remove obsolete running_in_travis() 2021-04-07 02:25:10 -07:00
James R. Barlow
9416e850ff
Remove another instance of helpers_namespace 2021-04-07 02:23:04 -07:00
James R. Barlow
aa115a8be3
Remove pytest_helpers_namespace 2021-04-07 01:56:51 -07:00
James R. Barlow
2846d46bb8
Remove .coveragerc and fold into setup.cfg 2021-01-06 03:58:18 -08:00
James R. Barlow
895fddd85e
Replace most uses of universal_newlines with text
The parameters are equivalent but the latter is better named. Since
Python 3.6 doesn't support text= we use our wrapper to add it in that
place.

This is for subprocess.run.
2020-11-07 00:48:08 -08:00
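As a short illustration of the equivalence mentioned in this commit message, the sketch below runs the same command with each keyword. It is not code from the repository, and it assumes Python 3.7 or later, where `text=` is available.

```python
import subprocess
import sys

# A trivial child process that prints one line.
cmd = [sys.executable, "-c", "print('hello')"]

# Older spelling: universal_newlines=True decodes output to str.
old_style = subprocess.run(
    cmd, stdout=subprocess.PIPE, universal_newlines=True, check=True
)

# Newer spelling (Python 3.7+): text=True is an alias with a clearer name.
new_style = subprocess.run(cmd, stdout=subprocess.PIPE, text=True, check=True)

# Both calls return decoded str output rather than bytes.
assert isinstance(old_style.stdout, str)
assert old_style.stdout == new_style.stdout
```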
James R. Barlow
aa0ec40102
Change license of all GPLv3 files to MPL-2.0
https://github.com/jbarlow83/OCRmyPDF/issues/600
2020-08-05 00:44:42 -07:00
James R. Barlow
48e2750551
Fix some tests that were failing in Docker 2020-06-21 01:48:13 -07:00
James R. Barlow
64891c2fc3
Pre-release delinting 2020-06-09 15:27:14 -07:00
James R. Barlow
0f942fb714 Rename ocrmypdf.exec -> ocrmypdf._exec 2020-06-09 14:59:09 -07:00
James R. Barlow
3b6f6782f0
Remove tesseract_env, --tesseract-env 2020-06-09 00:39:53 -07:00
James R. Barlow
21c0e045cb
Remove _OCRMYPDF_TEST_PATH environment variable 2020-06-09 00:30:13 -07:00
James R. Barlow
ebbf68bd08
The big payoff: abolishing spoofing machinery 2020-06-09 00:08:20 -07:00
James R. Barlow
a9a473f2e5 Convert all tesseract cache usages to plugin 2020-06-05 17:55:18 -07:00
James R. Barlow
1598f2f0e5 Abolish spoof_tesseract_noop 2020-06-01 03:07:53 -07:00
James R. Barlow
2b23f7ec73
tesseract_noop: begin implementing with plugin 2020-06-01 02:45:49 -07:00
James R. Barlow
9bccff4f88
Move Tesseract specific arguments to plugin 2020-05-16 03:24:31 -07:00
James R. Barlow
2bd586e093
Compare requested languages to OCR engine instead of tesseract directly
Also refactor to facilitate validation that needs the plugin manager.
2020-05-16 01:50:37 -07:00
James R. Barlow
41eb54cc0a
Standardize tesseract.generate_hocr and _pdf parameters 2020-05-14 03:23:25 -07:00
James R. Barlow
12a2f78c4d
Fix validation of languages not using tesseract_env
And some related issues.
2020-05-14 03:19:22 -07:00
James R. Barlow
85cbf94a6e
Convert many uses of str paths to Path 2020-05-06 02:53:47 -07:00
James R. Barlow
c85278b31d
Delinting 2020-05-03 00:53:29 -07:00
James R. Barlow
e02f6c1e97
Support plugin invocation with API 2020-05-02 03:34:31 -07:00
James R. Barlow
378e4dae3b
Expand documentation for subprocess.run() from test 2020-03-04 13:37:44 -08:00
James R. Barlow
422ea9777e Remove session scope from fixtures
pytest seems to prepare os.environ in complex ways, so we want to ensure
these fixtures are not reused.
2019-12-31 17:09:23 -08:00
James R. Barlow
2f1c743227 Rewrite main pool loop
pytest-cov documentation recommends using explicit
management of multiprocessing.Pool rather than the context manager.
This is supposed to work better for collecting coverage data, particularly
on Windows.
2019-12-31 16:23:41 -08:00
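The explicit pool management this commit message refers to can be sketched as follows. This is an illustration under stated assumptions, not the project's actual pool loop: `square` and `run_pool` are hypothetical names, and the "fork" start method is chosen only to keep the example self-contained (POSIX only).

```python
import multiprocessing


def square(n: int) -> int:
    """Hypothetical unit of work."""
    return n * n


def run_pool(values):
    # Assumption: "fork" start method, which is not available on Windows.
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(processes=2)
    try:
        results = pool.map(square, values)
    finally:
        # close() + join() lets workers exit normally; the context manager
        # calls terminate() instead, which stops workers immediately.
        pool.close()
        pool.join()
    return results


if __name__ == "__main__":
    print(run_pool([1, 2, 3]))
```

Letting workers exit normally gives them the chance to run their exit handlers, which is what coverage collection in subprocesses relies on.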
James R. Barlow
96ee21aee9 Try to set up subprocess coverage better 2019-12-31 15:39:45 -08:00
James R. Barlow
25d2b0cda4 test: environment warnings/cleanup 2019-12-30 22:38:50 -08:00
James R. Barlow
c5edff2c2f Sort imports 2019-12-19 15:31:18 -08:00
James R. Barlow
f6510e2b15 Document function of symlink shim 2019-12-06 15:00:12 -08:00
James R. Barlow
06a1f987d4 Use _OCRMYPDF_TEST_PATH for testing and .py stubs to simulate symlinks 2019-12-04 21:01:06 -08:00
James R. Barlow
43ab7c88d7 Remove os_environ() context manager 2019-12-04 17:37:38 -08:00
James R. Barlow
0cd424ffcb Enforce str-only environment for Windows since it's more strict 2019-12-04 17:14:27 -08:00
James R. Barlow
fde550f9a7 test: Replace many instances of run_ocrmypdf in subprocess with inline 2019-12-04 17:14:27 -08:00
James R. Barlow
3f92867ae6 Fix TypeError "environment can only contain strings"
Apparently Windows Python doesn't coerce pathlib.Path to str.
2019-12-04 17:13:51 -08:00
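One way to avoid this class of error is to convert every environment value to `str` before handing it to `subprocess`. The sketch below is an assumption about how such a fix could look, not the code from this commit; `str_only_env` and the `EXAMPLE_TMP` variable are hypothetical.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import Dict, Mapping, Union


def str_only_env(env: Mapping[str, Union[str, os.PathLike]]) -> Dict[str, str]:
    """Return a copy of env in which path-like values are converted to str."""
    return {str(key): os.fspath(value) for key, value in env.items()}


# Hypothetical variable holding a pathlib.Path value.
raw_env = {**os.environ, "EXAMPLE_TMP": Path("some") / "dir"}
safe_env = str_only_env(raw_env)

# Every value is now a plain str, so the stricter Windows check passes too.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['EXAMPLE_TMP'])"],
    env=safe_env,
    stdout=subprocess.PIPE,
    text=True,
    check=True,
)
```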
James R. Barlow
7755c5c5a7 tests: fix interpretation of None as omitted argument 2019-08-11 16:58:22 -07:00
James R. Barlow
6fbeb6347d Merge api (without plugins) 2019-07-27 02:04:01 -07:00