James R. Barlow
cd1a99a0de
Refactor int(os.path.basename(s)[0:6]) -> page_number(s)
2017-06-26 13:29:40 -07:00
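As an illustration of the refactor named in this commit, here is a minimal sketch of such a helper. It is not the project's actual code: only the expression `int(os.path.basename(s)[0:6])` and the name `page_number` come from the commit subject, and the six-digit zero-padded filename prefix in the usage line is an assumption inferred from that expression.

```python
import os


def page_number(input_file):
    """Return the page number encoded in the first six characters of the basename.

    Sketch only: assumes names such as '000012.page.png' (hypothetical example).
    """
    return int(os.path.basename(str(input_file))[0:6])


# Usage with a hypothetical work-file path
print(page_number("/tmp/work/000012.page.png"))
```

Naming the expression keeps the filename convention in one place, so a later change to the prefix length only touches this function.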
James R. Barlow
48e3b267fc
Accept PDFs with whitespace ahead of %PDF marker
...
Noticed in @aagahi's fork
2017-06-26 13:17:47 -07:00
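A check along the lines this commit describes might look like the sketch below. The function name `looks_like_pdf` is hypothetical and this is not the project's implementation; it only shows the idea of tolerating leading whitespace before the `%PDF` marker.

```python
def looks_like_pdf(header: bytes) -> bool:
    """Return True if the data starts with %PDF, ignoring leading ASCII whitespace.

    Sketch only: `header` is assumed to be the first bytes read from the file.
    """
    return header.lstrip().startswith(b"%PDF")


# Usage: a header with a stray newline and spaces ahead of the marker
print(looks_like_pdf(b"\n  %PDF-1.4\n"))
```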
James R. Barlow
3a7c3417bb
Don’t check tags and branch at the same time as Travis doesn’t get this
...
Travis is weird
v5.2
2017-06-13 13:14:34 -07:00
James R. Barlow
d792ef7222
Give the ‘auto’ renderer setting more test covfefe
2017-06-13 13:13:58 -07:00
James R. Barlow
2c24f67deb
Rename “tess4” renderer to “sandwich” and make it default in Tess 3.05.01
...
Tesseract 3.05.01 backported the textonly_pdf=1 which allows the use
of this superior PDF renderer prior to 4.00 alpha. This means that
the tess4 name is no longer accurate, so call it a sandwich because of
its merge-preserve characteristic. Preserve the tess4 name. Fix the
documentation and tests to reflect this.
Make it the default, because it’s better. It does not have the issues
the “tesseract” renderer does prior to Tess 3.05.00 with rendering
PDFs that Ghostscript corrupts, and it produces better output without
re-rastering.
Deprecate some old stuff to avoid the test suite growing obscenely
large.
2017-06-13 13:09:12 -07:00
James R. Barlow
9e75e28d0c
Homebrew needs x11 to compile Pillow
2017-06-13 11:03:26 -07:00
James R. Barlow
3232643809
Support “textonly PDF” renderer in Tesseract 3.05.01
2017-06-13 10:18:08 -07:00
James R. Barlow
f7ee9e90ce
Document what is meant by the ocrmypdf “API”
2017-06-13 10:15:11 -07:00
James R. Barlow
47298be132
Remove Python <3.5 test
2017-06-13 10:14:28 -07:00
James R. Barlow
a88fa83515
Travis: fix deploy conditions for homebrew autobrew
2017-05-31 02:29:32 -07:00
James R. Barlow
12bfe20385
v5.1 release notes
v5.1
2017-05-29 14:36:50 -07:00
James R. Barlow
3d2f6f0772
Fix tess4 test using old-style pageinfo API
2017-05-29 13:51:21 -07:00
James R. Barlow
1cb607f64b
Merge UserUnit
2017-05-29 13:22:55 -07:00
James R. Barlow
d3c54fbbde
For --rotate-pages, rasterize preview at half DPI instead of 200 DPI
...
Ensures that time is not wasted on previews at higher resolution than
the input as was sometimes the case
2017-05-29 13:01:18 -07:00
James R. Barlow
28341b755f
Refactor common test fixtures
2017-05-29 12:47:55 -07:00
James R. Barlow
4b5cd420e1
Add new test file
2017-05-29 12:16:08 -07:00
James R. Barlow
1d57bcc99e
Fix Ghostscript rasterizing of UserUnit pages and related sizing issues
2017-05-29 12:14:10 -07:00
James R. Barlow
facdd13879
Ghostscript: refactor image output resizing
2017-05-29 11:42:27 -07:00
James R. Barlow
6e891f91d3
ghostscript, qpdf: Restore API backward compatibility
2017-05-29 11:13:06 -07:00
James R. Barlow
9b50ede977
Partially solve ghostscript rasterize_pdf producing wrong file size
...
Kludge. Assumes JPEG for now. Messy.
2017-05-25 01:17:43 -07:00
James R. Barlow
82cf010333
Error out if trying to produce PDF/A >200" due to Ghostscript limitation
2017-05-25 00:07:29 -07:00
James R. Barlow
6ff6c8614f
--output-type=pdf now outputs /UserUnit PDFs at the correct size
...
This currently distorts the output size because Tesseract assumes it
knows the DPI better than we do.
Does not work for Ghostscript, because it emerges that Ghostscript
honors /UserUnit for rasterizing but not in pdfwrite (resolve/wontfix).
https://bugs.ghostscript.com/show_bug.cgi?id=690781
Ghostscript’s output would need to be patched in a PDF/A safe way for
this to work. Temporary route may be to block Ghostscript if
/UserUnit.
2017-05-24 23:26:07 -07:00
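For context on why /UserUnit affects output size: in PDF, page coordinates are measured in units of 1/72 inch by default, and /UserUnit is a scalar multiplier on that unit. The sketch below shows the arithmetic only. The function name `page_size_inches` and the MediaBox tuple layout are assumptions for illustration, not the project's API.

```python
def page_size_inches(mediabox, user_unit=1.0):
    """Return (width, height) in inches for a page.

    Sketch only: `mediabox` is assumed to be (x0, y0, x1, y1) in PDF units,
    where one unit is user_unit / 72 inch.
    """
    x0, y0, x1, y1 = mediabox
    width = abs(x1 - x0) * user_unit / 72.0
    height = abs(y1 - y0) * user_unit / 72.0
    return width, height


# Usage: the same MediaBox with and without a /UserUnit of 2.0
print(page_size_inches((0, 0, 612, 792)))
print(page_size_inches((0, 0, 612, 792), user_unit=2.0))
```

A tool that ignores /UserUnit treats the second page as the same physical size as the first, which is the kind of sizing error the commit message describes.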
James R. Barlow
eb1cd38f6c
Add an open helper that is compatible with pathlib
2017-05-24 16:19:15 -07:00
James R. Barlow
148b632b4f
Prove multiprocessing works, although it is still racy in some places
2017-05-23 16:32:13 -07:00
James R. Barlow
591e213713
Add more dependencies for autobrew
2017-05-23 13:52:28 -07:00
James R. Barlow
75f2262659
Ensure JobContext stuff is actually tested for IPC consistency
2017-05-19 17:57:07 -07:00
James R. Barlow
d9005a1074
pdfinfo: replace most remaining dict-style access
2017-05-19 16:17:36 -07:00
James R. Barlow
3e73fa81bf
pageinfo: deprecation warning
2017-05-19 16:17:07 -07:00
James R. Barlow
ba6e290231
Restore old pageinfo.py to avoid breaking compatibility
2017-05-19 15:49:23 -07:00
James R. Barlow
08e47117a3
Rename pageinfo to pdfinfo
2017-05-19 15:48:23 -07:00
James R. Barlow
532ef38157
/UserUnit is a scalar, not an array
2017-05-19 14:19:50 -07:00
James R. Barlow
4c09875890
docs: upload unpaper Dropbox link, .rst typo blocking macOS install
...
[ci skip]
2017-05-19 12:18:09 -07:00
James R. Barlow
0e98139712
Upload to upload.pypi.org/legacy as recommended by PyPA
...
https://github.com/pypa/warehouse/issues/1996#issuecomment-302784126
2017-05-19 12:06:24 -07:00
James R. Barlow
4c04d802d7
Introduce /UserUnit checking
2017-05-19 12:01:19 -07:00
James R. Barlow
b3dc404571
Update unpaper.deb link (fixes #171)
...
*Shakes fist at Dropbox*
2017-05-19 11:28:45 -07:00
James R. Barlow
8694f8d2eb
Replace magic strings colorspace and encoding with Enums
2017-05-18 22:32:27 -07:00
James R. Barlow
263f9b79f4
pageinfo: debug stuff
2017-05-18 21:52:55 -07:00
James R. Barlow
56d2aae963
Refactor from ImageInfo index to attribute accessing
2017-05-18 18:39:14 -07:00
James R. Barlow
127706153d
Refactor dictionary based image info to ImageInfo
2017-05-18 18:26:31 -07:00
James R. Barlow
caee5b1428
Access PageInfo instance variables instead of dictionary
2017-05-18 17:12:04 -07:00
James R. Barlow
6c12e7e944
Refactor pageinfo dictionary to PageInfo()
2017-05-18 16:53:38 -07:00
James R. Barlow
cd04ae6949
Refactor PdfInfo(str(filename)) -> PdfInfo(filename)
2017-05-18 16:43:50 -07:00
James R. Barlow
6a0b68298f
Refactor pdf_get_all_pageinfo to PdfInfo
2017-05-18 16:31:18 -07:00
James R. Barlow
0a2f732267
docs: Fix restructured text typos
2017-05-16 23:27:10 -07:00
James R. Barlow
4bade99f27
docs: Remark that someone got bash on Windows working
2017-05-16 23:24:34 -07:00
James R. Barlow
0b048cd24e
Join the build badge club
2017-05-16 23:24:05 -07:00
James R. Barlow
c69ee63d82
Travis, true is a program, not a keyword
v5.0.1
2017-05-15 15:12:14 -07:00
James R. Barlow
744fa104d7
v5.0.1 release notes (anticipating)
2017-05-14 23:59:09 -07:00
James R. Barlow
e24ff0fd64
Travis: don’t update the homebrew version because we pushed to testpypi
2017-05-14 23:55:40 -07:00
James R. Barlow
5de107d44c
tesseract_cache: update explanatory notes
2017-05-14 23:54:09 -07:00