689 Commits

Author SHA1 Message Date
James R. Barlow
313bbbb94c setup_scm_git_archive: add additional files 2016-02-29 12:46:27 -08:00
James R. Barlow
0360f078de get_postscript_icc_path: don't check the same path multiple times 2016-02-29 12:45:58 -08:00
James R. Barlow
c8901666c4 Merge branch 'master' of https://github.com/jbarlow83/OCRmyPDF 2016-02-29 00:06:07 -08:00
James R. Barlow
7430006596 Improve install instructions for OS X (unpaper) 2016-02-29 00:05:31 -08:00
James R. Barlow
f3e06b2dbd Add bookmarks to file for more testing 2016-02-29 00:05:07 -08:00
jbarlow83
e97df307ff Merge pull request #54 from stweil/master
Replace broken link to c't article by permalink
2016-02-28 07:18:40 -08:00
Stefan Weil
1443354aa2 Replace broken link to c't article by permalink
Also update the 2nd article link to use a permalink.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-02-28 13:57:42 +01:00
James R. Barlow
250e68c1cd v4.0.5 release notes v4.0.5 2016-02-27 01:01:38 -08:00
James R. Barlow
6a380ee99c Fix temporary file placed in wrong folder 2016-02-27 00:51:47 -08:00
James R. Barlow
3c90bd96a9 Remove extraneous debug print() messages 2016-02-27 00:50:58 -08:00
James R. Barlow
06a7ceb25a v4.0.4 Updates release notes v4.0.4 2016-02-27 00:22:37 -08:00
James R. Barlow
733a8e7d58 Merge branch 'feature/parsecontent' 2016-02-27 00:19:19 -08:00
James R. Barlow
570bbe9a05 Add comments and remove debugging, improve inline handling
Squashed commits:
[bfff3c9] pageinfo, have a main()
2016-02-27 00:18:36 -08:00
James R. Barlow
5cc3adb39a Add support for inline images 2016-02-27 00:18:36 -08:00
James R. Barlow
3957a0606c Compute image pixel density without performing rectangle intersection (+5 squashed commits)
Squashed commits:
[0e27904] Partially implement DPI calculation with rotation of the image

Fixes test suite
[a64f662] pageinfo: all tests pass
[c5b811a] Fix typos
[cdd2286] Can now find inline images more efficiently
[60dde8d] First cut at implementing intelligent DPI detection based on content stream

Broke many of the test cases
2016-02-27 00:18:36 -08:00
James R. Barlow
11a561dbce v4.0.3 release notes v4.0.3 2016-02-26 01:12:15 -08:00
James R. Barlow
dad2198394 Log information about detected page orientations in a summary line 2016-02-26 01:07:59 -08:00
James R. Barlow
e40fdc502d Always dump stack trace for unexpected errors 2016-02-26 01:06:59 -08:00
James R. Barlow
d446fe5922 Fix "too few characters" reported as error by tesseract -psm 0 2016-02-21 08:53:34 -08:00
James R. Barlow
4ca90c106d Docker: fix blank JPEG2000 PDF issue 2016-02-21 04:24:21 -08:00
James R. Barlow
7c5e58a497 Fix test cases that break in Docker, improve test for running in Docker v4.0.2 2016-02-20 23:47:37 -08:00
James R. Barlow
323b9a5f8e Add other missing files v4.0.2rc1 2016-02-20 05:34:21 -08:00
James R. Barlow
cab381a339 Add JPEG 2000 test case 2016-02-20 05:13:19 -08:00
James R. Barlow
fe4d4c39cd Merge commit '6f3ac46b1c176d48782347cfa14d9ef6ce773f37' into develop 2016-02-20 04:56:12 -08:00
James R. Barlow
ad188d7ae1 Docker: supply openjpeg to address JPXDecode errors 2016-02-20 04:54:55 -08:00
James R. Barlow
8246cc0538 Gracefully recover from tesseract's failure to process very large images
And test cases to check this
2016-02-20 04:53:23 -08:00
James R. Barlow
6f3ac46b1c Gracefully recover from tesseract's failure to process very large images
And test cases to check this
2016-02-20 04:53:02 -08:00
James R. Barlow
ac71c3be63 4.0.2rc1 - release notes, add missing file caught by Travis 2016-02-20 03:36:37 -08:00
James R. Barlow
ecc0ac9b19 Fix error on --tesseract-timeout timing out 2016-02-20 03:13:23 -08:00
James R. Barlow
ea4e6bf67d leptonica: serialization tweaks, memory handling 2016-02-20 02:54:53 -08:00
James R. Barlow
46c204f533 Fix leptonica pickling 2016-02-20 02:35:34 -08:00
James R. Barlow
71fbda8bf6 Adjust page orientation parsing to deal with change in Tess 3.04.01 2016-02-20 01:32:56 -08:00
James R. Barlow
9b79b4a7c8 Leptonica: documentation, helper functions 2016-02-20 01:20:06 -08:00
James R. Barlow
c04cc853d7 leptonica: remove special PNM handling
We no longer use PNM as an intermediate format, so there's no need to
handle leptonica's PNM quirks.
2016-02-19 15:13:14 -08:00
James R. Barlow
dd41e70ccc leptonica: nit 2016-02-19 15:11:48 -08:00
James R. Barlow
4206e74f42 tests: also check that monochrome correlation correctly detects matches 2016-02-19 14:35:31 -08:00
James R. Barlow
68c3ce56a9 Don't do chmod unless necessary (breaks py.test on Docker) 2016-02-19 14:09:56 -08:00
James R. Barlow
ab0e5fa425 Improve error checking for tesseract -psm 0 (orientation) errors 2016-02-19 03:58:39 -08:00
James R. Barlow
f3b0434a87 Improve ability to capture error messages from tesseract on a crash 2016-02-19 03:48:49 -08:00
James R. Barlow
aa394440db Just use the PyPI version of ocrmypdf in dockerfile
Apparently setuptools_scm_git_archive is ineffective on hub.docker.com
automatic build, it still can't find a version.
v4.0.1
2016-02-17 15:14:23 -08:00
James R. Barlow
3b98a1a04b Fix KeyError on unexpected tess output 2016-02-17 06:05:27 -08:00
James R. Barlow
fcb89b0c58 Forgot to save release notes 2016-02-17 01:48:25 -08:00
James R. Barlow
ac65d6a03a v4.0: release notes v4.0 2016-02-17 01:21:17 -08:00
James R. Barlow
2103f60906 Merge branch 'release/v4.0.0' 2016-02-17 01:13:24 -08:00
James R. Barlow
e3c3d848c1 Save Dockerfile comment 2016-02-17 01:11:41 -08:00
James R. Barlow
d4ef3411e0 Suppress --pdf-renderer tesseract warning in Docker image
Since the corrected font is provided in the Docker image, there's no
reason to show the warning.
2016-02-17 01:03:20 -08:00
James R. Barlow
71d616e413 Restore Dockerfile on local and probably on automated build as well 2016-02-17 00:13:45 -08:00
James R. Barlow
fe651d1bf5 Overwrite Tesseract 3.04 default pdf font with better pdf font 2016-02-16 21:45:44 -08:00
James R. Barlow
582ba8cfad Provide sharp2.ttf for Docker images 2016-02-16 21:45:17 -08:00
James R. Barlow
d23291650a Remove duplicate line from documentation 2016-02-16 14:30:15 -08:00