295 Commits

Author SHA1 Message Date
James R. Barlow
aa115a8be3
Remove pytest_helpers_namespace 2021-04-07 01:56:51 -07:00
James R. Barlow
bb258fc99c
pdfinfo: Refactor pageinfo dictionary into a class 2020-12-24 01:47:53 -08:00
James R. Barlow
895fddd85e
Replace most uses of universal_newlines with text
The parameters are equivalent but the latter is better named. Since
Python 3.6 doesn't support text= we use our wrapper to add it in that
place.

This is for subprocess.run.
2020-11-07 00:48:08 -08:00
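The commit message above describes routing `text=` through a wrapper because `subprocess.run` only accepts that keyword from Python 3.7 onward. A minimal sketch of such a wrapper follows. The function name `run` and its error handling are assumptions made for illustration, and this is not the project's actual implementation. It relies only on the documented fact that `text` and `universal_newlines` are equivalent.

```python
import subprocess
import sys


def run(args, *, text=None, **kwargs):
    """Hypothetical wrapper that accepts text= even where subprocess.run lacks it."""
    if text is not None:
        if "universal_newlines" in kwargs:
            raise TypeError("pass either text or universal_newlines, not both")
        # universal_newlines is the older spelling of the same option,
        # so forwarding it works on Python 3.6 and on later versions.
        kwargs["universal_newlines"] = text
    return subprocess.run(args, **kwargs)


# Usage: capture the child's stdout as str rather than bytes.
result = run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    text=True,
    check=True,
)
```

Callers then write `text=True` everywhere, and the translation to the older keyword lives in one place that can be deleted once Python 3.6 support is dropped.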
James R. Barlow
e6a7b58863 Merge branch 'de-gpl' 2020-08-12 12:20:38 -07:00
James R. Barlow
9b641055e1
Fix KeyError: 'dpi' when using --threshold on image to PDF
Fixes #607
2020-08-07 02:21:02 -07:00
James R. Barlow
aa0ec40102
Change license of all GPLv3 files to MPL-2.0
https://github.com/jbarlow83/OCRmyPDF/issues/600
2020-08-05 00:44:42 -07:00
James R. Barlow
86a73191b0
Plugin manager: accept Path(plugin) 2020-06-30 04:17:30 -07:00
James R. Barlow
48e2750551
Fix some tests that were failing in Docker 2020-06-21 01:48:13 -07:00
James R. Barlow
892db88f0e
test_two_languages: use narrower test 2020-06-12 14:33:02 -07:00
James R. Barlow
393c5a9ea4 Fix error on -l lang1+lang2 2020-06-12 12:10:29 -07:00
James R. Barlow
0f942fb714 Rename ocrmypdf.exec -> ocrmypdf._exec 2020-06-09 14:59:09 -07:00
James R. Barlow
be8ca589d4
Move ocrmypdf.exec.run and friends to ocrmypdf.subprocess 2020-06-09 14:53:10 -07:00
James R. Barlow
b109445215
Move Ghostscript rasterize_pdf to plugin 2020-06-08 17:10:27 -07:00
James R. Barlow
a9a473f2e5 Convert all tesseract cache usages to plugin 2020-06-05 17:55:18 -07:00
James R. Barlow
6268e2faff
Begin replacing tests/spoof/tesseract_cache with plugin 2020-06-05 17:27:10 -07:00
James R. Barlow
1b92f447c3
Convert tesseract_crash to plugin 2020-06-02 02:36:41 -07:00
James R. Barlow
4f4ad0fb76
Convert tesseract_big_image_error to plugin 2020-06-02 01:49:47 -07:00
James R. Barlow
1598f2f0e5 Abolish spoof_tesseract_noop 2020-06-01 03:07:53 -07:00
James R. Barlow
2b23f7ec73
tesseract_noop: begin implementing with plugin 2020-06-01 02:45:49 -07:00
James R. Barlow
41eb54cc0a
Standardize tesseract.generate_hocr and _pdf parameters 2020-05-14 03:23:25 -07:00
James R. Barlow
12a2f78c4d
Fix validation of languages not using tesseract_env
And some related issues.
2020-05-14 03:19:22 -07:00
James R. Barlow
85cbf94a6e
Convert many uses of str paths to Path 2020-05-06 02:53:47 -07:00
James R. Barlow
c85278b31d
Delinting 2020-05-03 00:53:29 -07:00
James R. Barlow
8f5c95f0f4
Remove last vestiges of command line usage of qpdf - change to check_pdf 2020-04-26 05:33:26 -07:00
James R. Barlow
991db17fde
Remove Ghostscript-based text extraction
While faster than Python based methods, we've outgrown the limited
amount of information Ghostscript provides with this feature, and it
repeats an analysis we have to do anyway to learn what images are
present.
2020-04-26 04:02:07 -07:00
James R. Barlow
94c52a6fa3
Refactor 'xyres' into Resolution 2020-04-24 04:12:05 -07:00
James R. Barlow
57771f06a3
Refactor xy-pair for resolution to tuple 2020-04-16 15:38:33 -07:00
James R. Barlow
346da95899 Suppress loglevel since we have color now 2020-04-15 00:09:36 -07:00
James R. Barlow
61a2674317 Skip test that needs chmod when on Windows 2020-01-06 02:36:04 -08:00
James R. Barlow
422ea9777e Remove session scope from fixtures
pytest seems to prepare os.environ in complex ways, so we want to ensure
these fixtures are not reused.
2019-12-31 17:09:23 -08:00
James R. Barlow
0c0d53b10f tests: AcroForm test case did not work correctly; fixed 2019-12-30 17:50:32 -08:00
James R. Barlow
63de7e1677 Improve error message for unreadable input files 2019-12-30 16:14:52 -08:00
James R. Barlow
c5571388e2 Improve test coverage of _sync.py 2019-12-10 01:06:27 -08:00
James R. Barlow
607eee198d tests: split out preprocessing tests 2019-12-09 16:18:01 -08:00
James R. Barlow
5e2a7f8a56 tests: speed up several slow tests 2019-12-09 16:17:57 -08:00
James R. Barlow
51abd79136 Tesseract no longer posts an error message if config file not found 2019-12-04 21:35:28 -08:00
James R. Barlow
5607429d9a tests: error message from tesseract change 2019-12-04 21:31:01 -08:00
James R. Barlow
cff37bf681 Make test_german more Windows-friendly 2019-12-04 21:01:09 -08:00
James R. Barlow
66d04dd6e3 Don't expect filenames to be replicated on NT 2019-12-04 21:01:09 -08:00
James R. Barlow
06a1f987d4 Use _OCRMYPDF_TEST_PATH for testing and .py stubs to simulate symlinks 2019-12-04 21:01:06 -08:00
James R. Barlow
ca9669742d Move gs tests to test_ghostscript 2019-12-04 17:14:27 -08:00
James R. Barlow
0cd424ffcb Enforce str-only environment for Windows since it's more strict 2019-12-04 17:14:27 -08:00
James R. Barlow
fde550f9a7 test: Replace many instances of run_ocrmypdf in subprocess with inline 2019-12-04 17:14:27 -08:00
James R. Barlow
37f6f72df3 tests: a few Windows fixes 2019-12-04 17:13:51 -08:00
James R. Barlow
72d3ee3a87 Refactor symlink usage to support Windows 2019-12-04 17:13:51 -08:00
James R. Barlow
4d26867dee Delinting 2019-09-20 17:17:11 -07:00
James R. Barlow
68c852acec Remove test_tesseract_config_invalid from suite
Also causes problems in CI
2019-09-18 13:28:02 -07:00
James R. Barlow
d7b7ca0574 v9.0.3 notes; Remove test_tesseract_config_notfound from suite 2019-09-05 13:39:43 -07:00
James R. Barlow
19ba3ae011 Allow test_german to xfail if deu language is not installed 2019-09-03 17:38:54 -07:00
James R. Barlow
feff1e38bb Use context managers to ensure Pillow images are closed 2019-09-03 17:19:12 -07:00