James R. Barlow
e6a7b58863
Merge branch 'de-gpl'
2020-08-12 12:20:38 -07:00
James R. Barlow
9b641055e1
Fix KeyError: 'dpi' when using --threshold on image to PDF
...
Fixes #607
2020-08-07 02:21:02 -07:00
James R. Barlow
bed74501fc
Fix test breakage in validation
...
Broken in commit 4cc0dc
2020-08-05 01:35:26 -07:00
James R. Barlow
aa0ec40102
Change license of all GPLv3 files to MPL-2.0
...
https://github.com/jbarlow83/OCRmyPDF/issues/600
2020-08-05 00:44:42 -07:00
James R. Barlow
7263702de9
Remove gs.py (spoofers entirely removed) and update copyright
2020-07-29 16:31:47 -07:00
James R. Barlow
44149ad319
Disable test_error_trap for Leptonica < 1.79
...
The old error trap seems unreliable in the first place, so it is difficult
to set up a test.
2020-07-20 21:12:00 -07:00
James R. Barlow
5cbbff8472
For Leptonica 1.79+ use leptSetStderrHandler
...
Lock free and considerably less dangerous to stderr messages.
2020-07-19 03:40:33 -07:00
James R. Barlow
86a73191b0
Plugin manager: accept Path(plugin)
2020-06-30 04:17:30 -07:00
James R. Barlow
66337813e6
Spell runslow correctly
2020-06-22 23:32:09 -07:00
James R. Barlow
eb5a211e72
New hocrtransform test isn't platform stable - mark runslow
2020-06-22 16:59:59 -07:00
James R. Barlow
06ab114aa8
Update test cache
2020-06-22 16:31:34 -07:00
James R. Barlow
1257419465
test_hocrtransform: this test is worth not caching
2020-06-22 16:31:06 -07:00
James R. Barlow
30404f53f0
Add test to sanity check our pdf renderers
2020-06-22 16:18:38 -07:00
James R. Barlow
f4cb424451
Support input/output streams at API level
2020-06-22 02:02:18 -07:00
James R. Barlow
fef14778d5
Fix missing f-string in log message
2020-06-22 01:17:16 -07:00
James R. Barlow
48e2750551
Fix some tests that were failing in Docker
2020-06-21 01:48:13 -07:00
James R. Barlow
ebfe4f0d29
Fix issue #582 - PDF/A acquires title "Untitled" after conversion
2020-06-20 02:01:16 -07:00
James R. Barlow
892db88f0e
test_two_languages: use narrower test
2020-06-12 14:33:02 -07:00
James R. Barlow
eeb44f78cc
Fix tests that failed on other platforms from previous fix
2020-06-12 12:59:46 -07:00
James R. Barlow
393c5a9ea4
Fix error on -l lang1+lang2
2020-06-12 12:10:29 -07:00
James R. Barlow
c6b9a49cbb
Fix tests that fail in CI
2020-06-10 17:08:00 -07:00
James R. Barlow
872bafad4b
Reinstate quick test for text/no text
...
Partial revert of commit 991db17
2020-06-10 12:00:52 -07:00
James R. Barlow
64891c2fc3
Pre-release delinting
2020-06-09 15:27:14 -07:00
James R. Barlow
fe156db41d
Merge branch 'release/v10' into trialmerge
2020-06-09 15:12:56 -07:00
James R. Barlow
0f942fb714
Rename ocrmypdf.exec -> ocrmypdf._exec
2020-06-09 14:59:09 -07:00
James R. Barlow
be8ca589d4
Move ocrmypdf.exec.run and friends to ocrmypdf.subprocess
2020-06-09 14:53:10 -07:00
James R. Barlow
3b6f6782f0
Remove tesseract_env, --tesseract-env
2020-06-09 00:39:53 -07:00
James R. Barlow
21c0e045cb
Remove _OCRMYPDF_TEST_PATH environment variable
2020-06-09 00:30:13 -07:00
James R. Barlow
ebbf68bd08
The big payoff: abolishing spoofing machinery
2020-06-09 00:08:20 -07:00
James R. Barlow
2059e916da
Convert all ghostscript spoofs to test plugins
2020-06-09 00:00:25 -07:00
James R. Barlow
7b9025f397
Convert generate_pdfa to plugin
2020-06-08 22:28:38 -07:00
James R. Barlow
b109445215
Move Ghostscript rasterize_pdf to plugin
2020-06-08 17:10:27 -07:00
James R. Barlow
a9a473f2e5
Convert all tesseract cache usages to plugin
2020-06-05 17:55:18 -07:00
James R. Barlow
6268e2faff
Begin replacing tests/spoof/tesseract_cache with plugin
2020-06-05 17:27:10 -07:00
James R. Barlow
ec3f506500
Convert tesseract_badutf8 to plugin
2020-06-05 16:38:19 -07:00
James R. Barlow
5e14d5b0dd
Fix test_report_file_size
...
Use more realistic test data
2020-06-03 13:24:55 -07:00
James R. Barlow
c6b2fa8851
Remove unpaper spoof; no plugin needed
2020-06-02 02:42:14 -07:00
James R. Barlow
1b92f447c3
Convert tesseract_crash to plugin
2020-06-02 02:36:41 -07:00
James R. Barlow
82e7eb91d2
Tidy tesseract_noop
2020-06-02 01:50:02 -07:00
James R. Barlow
4f4ad0fb76
Convert tesseract_big_image_error to plugin
2020-06-02 01:49:47 -07:00
James R. Barlow
1598f2f0e5
Abolish spoof_tesseract_noop
2020-06-01 03:07:53 -07:00
James R. Barlow
2b23f7ec73
tesseract_noop: begin implementing with plugin
2020-06-01 02:45:49 -07:00
James R. Barlow
642ebc6098
Fix test that failed on Windows
2020-05-28 15:52:00 -07:00
James R. Barlow
df9f5157bd
Fix shim_paths to account for unexpected files in Program Files\gs
...
Fixes #565
2020-05-28 14:58:41 -07:00
James R. Barlow
aa060db5bc
Refactor tesseract_env variable into the plugin
...
Removed all cases except one in api.py, which isn't worth solving because
it should be removed anyway.
This also fixes a logic error in the OMP_THREAD_LIMIT decision: api.py
did not pass kwargs correctly, so they never worked before.
2020-05-26 02:14:06 -07:00
James R. Barlow
d43212d30b
Refactor --language argument into set
2020-05-25 03:20:10 -07:00
James R. Barlow
a0f9ca3a30
Move Tesseract options validation into plugin
2020-05-25 01:31:46 -07:00
James R. Barlow
9bccff4f88
Move Tesseract specific arguments to plugin
2020-05-16 03:24:31 -07:00
James R. Barlow
2bd586e093
Compare requested languages to OCR engine instead of tesseract directly
...
Also refactoring to facilitate validation that needs the plugin manager.
2020-05-16 01:50:37 -07:00
James R. Barlow
9af94ac9b7
pipeline: use OCR engine abstraction instead of Tesseract
2020-05-16 01:28:56 -07:00