James R. Barlow
380b981763
Remove most Python 3.6 special casing
2021-11-13 00:27:48 -08:00
James R. Barlow
790d3022f6
Implement --output-type=none to skip producing the PDF and use only the sidecar
...
Closes #787
2021-09-26 01:07:34 -07:00
James R. Barlow
906d77b389
tests: remove obsolete running_in_travis()
2021-04-07 02:25:10 -07:00
James R. Barlow
9416e850ff
Remove another instance of helpers_namespace
2021-04-07 02:23:04 -07:00
James R. Barlow
aa115a8be3
Remove pytest_helpers_namespace
2021-04-07 01:56:51 -07:00
James R. Barlow
2846d46bb8
Remove .coveragerc and fold into setup.cfg
2021-01-06 03:58:18 -08:00
James R. Barlow
895fddd85e
Replace most uses of universal_newlines with text
...
The parameters are equivalent, but the latter is better named. Since
Python 3.6 doesn't support text=, we use our wrapper to add it in that
case.
This applies to subprocess.run.
2020-11-07 00:48:08 -08:00
James R. Barlow
aa0ec40102
Change license of all GPLv3 files to MPL-2.0
...
https://github.com/jbarlow83/OCRmyPDF/issues/600
2020-08-05 00:44:42 -07:00
James R. Barlow
48e2750551
Fix some tests that were failing in Docker
2020-06-21 01:48:13 -07:00
James R. Barlow
64891c2fc3
Pre-release delinting
2020-06-09 15:27:14 -07:00
James R. Barlow
0f942fb714
Rename ocrmypdf.exec -> ocrmypdf._exec
2020-06-09 14:59:09 -07:00
James R. Barlow
3b6f6782f0
Remove tesseract_env, --tesseract-env
2020-06-09 00:39:53 -07:00
James R. Barlow
21c0e045cb
Remove _OCRMYPDF_TEST_PATH environment variable
2020-06-09 00:30:13 -07:00
James R. Barlow
ebbf68bd08
The big payoff: abolishing spoofing machinery
2020-06-09 00:08:20 -07:00
James R. Barlow
a9a473f2e5
Convert all tesseract cache usages to plugin
2020-06-05 17:55:18 -07:00
James R. Barlow
1598f2f0e5
Abolish spoof_tesseract_noop
2020-06-01 03:07:53 -07:00
James R. Barlow
2b23f7ec73
tesseract_noop: begin implementing with plugin
2020-06-01 02:45:49 -07:00
James R. Barlow
9bccff4f88
Move Tesseract specific arguments to plugin
2020-05-16 03:24:31 -07:00
James R. Barlow
2bd586e093
Compare requested languages to OCR engine instead of tesseract directly
...
Also refactoring to facilitate validation that needs the plugin manager.
2020-05-16 01:50:37 -07:00
James R. Barlow
41eb54cc0a
Standardize tesseract.generate_hocr and _pdf parameters
2020-05-14 03:23:25 -07:00
James R. Barlow
12a2f78c4d
Fix validation of languages not using tesseract_env
...
And some related issues.
2020-05-14 03:19:22 -07:00
James R. Barlow
85cbf94a6e
Convert many uses of str paths to Path
2020-05-06 02:53:47 -07:00
James R. Barlow
c85278b31d
Delinting
2020-05-03 00:53:29 -07:00
James R. Barlow
e02f6c1e97
Support plugin invocation with API
2020-05-02 03:34:31 -07:00
James R. Barlow
378e4dae3b
Expand documentation for subprocess.run() from test
2020-03-04 13:37:44 -08:00
James R. Barlow
422ea9777e
Remove session scope from fixtures
...
pytest seems to prepare os.environ in complex ways, so we want to ensure
these fixtures are not reused.
2019-12-31 17:09:23 -08:00
James R. Barlow
2f1c743227
Rewrite main pool loop
...
pytest-cov documentation recommends using explicit
management of multiprocessing.Pool rather than the context manager.
This is supposed to work better for collecting coverage data, particularly
on Windows.
2019-12-31 16:23:41 -08:00
James R. Barlow
96ee21aee9
Try to set up subprocess coverage better
2019-12-31 15:39:45 -08:00
James R. Barlow
25d2b0cda4
test: environment warnings/cleanup
2019-12-30 22:38:50 -08:00
James R. Barlow
c5edff2c2f
Sort imports
2019-12-19 15:31:18 -08:00
James R. Barlow
f6510e2b15
Document function of symlink shim
2019-12-06 15:00:12 -08:00
James R. Barlow
06a1f987d4
Use _OCRMYPDF_TEST_PATH for testing and .py stubs to simulate symlinks
2019-12-04 21:01:06 -08:00
James R. Barlow
43ab7c88d7
Remove os_environ() context manager
2019-12-04 17:37:38 -08:00
James R. Barlow
0cd424ffcb
Enforce str-only environment for Windows since it's more strict
2019-12-04 17:14:27 -08:00
James R. Barlow
fde550f9a7
test: Replace many instances of run_ocrmypdf in subprocess with inline
2019-12-04 17:14:27 -08:00
James R. Barlow
3f92867ae6
Fix TypeError "environment can only contain strings"
...
Apparently Windows Python doesn't coerce pathlib.Path to str.
2019-12-04 17:13:51 -08:00
James R. Barlow
7755c5c5a7
tests: fix interpretation of None as omitted argument
2019-08-11 16:58:22 -07:00
James R. Barlow
6fbeb6347d
Merge api (without plugins)
2019-07-27 02:04:01 -07:00
James R. Barlow
12769b96e5
Drop support for omitting pdfminer.six
2019-07-10 13:37:01 -07:00
James R. Barlow
20ad032977
Fix some error messages that printed directly to sys.stderr instead of logging
2019-06-05 03:07:48 -07:00
James R. Barlow
eb5200d26a
Change most tests to use ocrmypdf API instead of subprocess
...
The main benefit of this is that code coverage gains can actually follow it.
Also removes most ugly os.environ hacks.
2019-06-03 01:45:27 -07:00
James R. Barlow
fb933edc0f
Use newer pytest tmp_path API
2019-06-01 01:55:51 -07:00
James R. Barlow
ba41ccae1b
conftest: don't modify PYTEST_CURRENT_TEST when manipulating os.environ
...
It confuses pytest.
2019-06-01 01:41:39 -07:00
James R. Barlow
5cecb3ecb4
Convert one test to use API
2019-05-22 23:53:48 -07:00
James R. Barlow
dc616bb507
Fix test suite so --clean is not requested when unpaper is not installed
2019-03-05 22:33:13 -08:00
James R. Barlow
5da26e4c9c
Convert most uses of subprocess.Popen to subprocess.run in test suite
2019-03-05 22:25:22 -08:00
James R. Barlow
f095e91cb4
unpaper-args: add test case and harden feature
2019-02-07 16:21:02 -08:00
James R. Barlow
8c0009c5c8
Make pdfminer.six optional
...
Mainly since the current release of pdfminer.six lacks a sdist, blocking
homebrew packaging. Also in case other distros don't accept pdfminer.six.
2018-12-31 01:08:43 -08:00
James R. Barlow
0880b16491
Sort imports with isort
2018-12-30 01:28:15 -08:00
James R. Barlow
06308a22ce
Reformat with black
2018-12-30 01:27:49 -08:00