James R. Barlow
8c17c9918e
Add documentation and test cases for --tesseract-config
This parameter has existed for a long time but never really got any
attention.
2017-01-28 22:06:51 -08:00
James R. Barlow
a4f07756a5
tesseract caching: don't transcode tesseract's output, hash source file
For sanity's sake, deal with tesseract streams in binary without
transcoding (via universal_newlines, etc.). The only differences are
printing messages regarding spoofing.
Also hash the source file so that changes to the cache mechanism
invalidate old cache automatically. That is probably too aggressive,
but simple and safer than the previous approach.
2016-10-28 16:44:12 -07:00
James R. Barlow
cc7e328358
Improve some documentation for tests
2016-08-26 15:04:08 -07:00
James R. Barlow
8246cc0538
Gracefully recover from tesseract's failure to process very large images
And test cases to check this
2016-02-20 04:53:23 -08:00
James R. Barlow
b907234d5c
Update tesseract spoofing to cache orientation and script detection checks
No cache: 269 s
With cache: 144 s
test_oversample[tesseract] now fails, all others good
2016-02-08 02:21:56 -08:00
James R. Barlow
3b53e9adac
Use tesseract cache for -psm
2016-01-11 17:22:50 -08:00
James R. Barlow
09782242c8
Adjust test cases to use cache and noop more effectively
This reduces total execution time to 164s on my machine, down from
about double that.
2015-12-17 14:00:17 -08:00
James R. Barlow
9ec4aa039d
Add tesseract caching to speed up tests
2015-12-17 12:52:12 -08:00