2676 Commits

Author SHA1 Message Date
James R. Barlow
37dc03eec6 Add workaround for tess4 form feed behavior change 2017-10-10 23:33:54 -07:00
James R. Barlow
6b478172f6 Try using plain tessdata instead of tessdata_best 2017-10-10 20:52:03 -07:00
James R. Barlow
c6e73bcfd6 Still failing, did the cp work at all? 2017-10-10 15:15:26 -07:00
James R. Barlow
a2d62938ce cp overwrite needs sudo 2017-10-10 14:54:52 -07:00
James R. Barlow
d7ae1f3cca travis: Replace problematic traineddata file 2017-10-10 14:39:05 -07:00
James R. Barlow
70219581c4 Disable tesseract 4 so tests can succeed
tess4 -psm 0 is broken right now
2017-10-10 14:18:02 -07:00
James R. Barlow
f70ac9fb89 Workaround travis issues in build stages... maybe 2017-10-10 12:36:50 -07:00
James R. Barlow
235b9fbaf0 Resolve merge conflicts 2017-10-10 12:24:10 -07:00
James R. Barlow
ebda7f42db travis: need script for each stage 2017-10-10 12:22:23 -07:00
James R. Barlow
0b04e4b977 Try out travis build matrix 2017-10-10 12:14:50 -07:00
James R. Barlow
9498601a37 Add docs on adding to docker image 2017-10-10 12:13:20 -07:00
James R. Barlow
ef5d320e06 Ignore .vscode too 2017-10-09 16:19:41 -07:00
James R. Barlow
b00c9a562d Remove meaningless version from Dockerfile.polyglot 2017-10-09 16:18:02 -07:00
James R. Barlow
5372656893 Don't say tess4 support is experimental - it's pretty good now 2017-10-09 16:17:42 -07:00
James R. Barlow
571de0e368 Update release notes 2017-10-08 12:41:03 -07:00
James R. Barlow
82cea2fd85 Update batch processing docs to include Synology script 2017-10-08 12:34:36 -07:00
James R. Barlow
aed9814345 Use Ubuntu 17.04 instead of 16.10 for Docker image (issue #191)
Due to 16.10 PPAs no longer being generated by alex-p
2017-10-08 12:13:20 -07:00
James R. Barlow
34fc1f5fd7 Add reminder that blank.pdf is not trivial 2017-09-13 01:19:18 -07:00
James R. Barlow
87c2ed8b27 Improve clarity of --pdf-renderer=tesseract deprecation warning 2017-09-12 14:34:53 -07:00
James R. Barlow
1467d118ab Add more leptonica functions 2017-09-06 00:27:02 -07:00
James R. Barlow
922dbe83c3 Update MANIFEST rules 2017-09-02 20:05:57 -07:00
James R. Barlow
6af7d61ee5 Fix CI failure due to spoofers not being updated to Tesseract 3.05 strings (tag: v5.3.3) 2017-09-01 16:17:26 -07:00
James R. Barlow
bafd08391d Update release notes 2017-09-01 12:50:45 -07:00
James R. Barlow
82ebd8ef1a Fix missing error message about trying to use sandwich on old tesseract 2017-09-01 12:50:36 -07:00
James R. Barlow
4ed1aa4d23 Release notes: fix indentation 2017-09-01 12:47:22 -07:00
James R. Barlow
d04e43d46d Update copyright info for test files
[ci skip]
2017-09-01 01:00:32 -07:00
James R. Barlow
952f0cca15 Dockerfiles: set LANG=C.UTF-8
Issue #184 to avoid issue with printing UTF-8 text to sidecar
2017-08-30 13:25:54 -07:00
James R. Barlow
f6a4d8f1f8 Fix Ubuntu 14.04 install instructions to account for dropping Py3.4 support
[ci skip]
2017-08-27 13:53:36 -07:00
James R. Barlow
b3097a2384 Fix broken test case related to language packs (tag: v5.3.2) 2017-08-24 13:01:02 -07:00
James R. Barlow
6d9ddbe98b v5.3.1 notes (tag: v5.3.1) 2017-08-24 01:09:19 -07:00
James R. Barlow
9bb42c0229 Wrong error type used for missing language 2017-08-24 01:07:23 -07:00
James R. Barlow
bd7226b27a Merge branch 'master' of github.com:jbarlow83/OCRmyPDF 2017-08-23 23:30:19 -07:00
James R. Barlow
5b413e3873 Cookbook: add "don't OCR" examples 2017-08-23 23:29:41 -07:00
James R. Barlow
be5831a629 Offer the readme as a long description for new PyPI 2017-08-23 23:29:21 -07:00
jbarlow83
084d2bf8e2 More badges 2017-08-23 23:19:29 -07:00
James R. Barlow
da79e6bac7 macos: Skip brew audit because it seems to crash ruby on travis 2017-07-27 16:00:41 -07:00
James R. Barlow
c4831ac00c v5.3 release notes (tag: v5.3) 2017-07-27 00:11:12 -07:00
James R. Barlow
93a954ef9f Fix missing import for Py3.5 2017-07-26 23:40:01 -07:00
James R. Barlow
f7ce8f44e9 Weaken the --user-words test so it will pass on Travis 2017-07-26 21:03:51 -07:00
James R. Barlow
0b012697e5 Whitelist the Latin-1 languages that work with HOCR
Omitted French because the rare 'oe' and 'ÿ' glyphs are not in Latin-1.
Basically steer people away from HOCR renderer but avoid a potential
disruptive behavior change.
2017-07-26 21:03:18 -07:00
James R. Barlow
58e357c992 Report location of attempted output_file that fails to write 2017-07-22 17:49:56 -07:00
James R. Barlow
71fbad83ad Fix py3.5 test 2017-07-21 17:01:06 -07:00
James R. Barlow
52483072dc Add a differential test that checks tesseract uses supplied word list 2017-07-21 16:40:20 -07:00
James R. Barlow
7f0b8621f3 Tests: accept rich path objects without having to str() everything 2017-07-21 16:39:22 -07:00
James R. Barlow
cd8db60b06 Crash test all renderers, not just two 2017-07-21 14:10:02 -07:00
James R. Barlow
1aa34f5d2e Make some interfaces accepting of both str-paths and Path objects 2017-07-21 13:28:30 -07:00
James R. Barlow
dfa1d88ce9 Fix missing user_words/user_patterns from textonly_pdf case 2017-07-20 17:14:04 -07:00
James R. Barlow
dd38519f07 Merge branch 'feature/user-words' into develop
# Conflicts:
#	ocrmypdf/exec/tesseract.py
2017-07-20 16:25:20 -07:00
James R. Barlow
098f5d4f0b docs: remove deprecated example of pdftotext 2017-07-20 16:20:17 -07:00
James R. Barlow
ffc685d536 docs: envvar markup 2017-07-20 16:19:57 -07:00