James R. Barlow
76276f61e5
Split out rotation related tests
2018-05-01 23:51:35 -07:00
James R. Barlow
bfd26e6ec6
Tests: confirm OCR layer copied
2018-05-01 23:16:41 -07:00
James R. Barlow
d787e1ea0f
ghostscript.py not saved in last commit
...
Given the importance of the last commit, confirmed that all tests also pass once the file is saved.
Test results are unchanged by this change.
2018-05-01 22:59:22 -07:00
James R. Barlow
b5d7e9cbb0
Fix all issues with rotations
...
All tests now pass
2018-05-01 22:50:20 -07:00
James R. Barlow
f3b6d9dcdf
Fix a comment about Tesseract behavior in certain versions
2018-05-01 21:31:09 -07:00
James R. Barlow
a9abe13185
Remove the old tesseract pdf_renderer
2018-05-01 17:31:34 -07:00
James R. Barlow
6b315e8315
Add ability to disable cache
2018-05-01 15:52:00 -07:00
James R. Barlow
37677de884
Fix regressions: pdfa.ps not used, PDF/A failures, handling of text layers with no font
2018-05-01 15:51:46 -07:00
James R. Barlow
c7387de325
Fix auto rotate
2018-05-01 15:18:28 -07:00
James R. Barlow
2495b1e038
Refactor find font, get test cases working again
2018-05-01 14:48:41 -07:00
James R. Barlow
073ee52ce7
Use hocr and weave; eliminate old combine layers and merge pages
2018-05-01 14:21:53 -07:00
James R. Barlow
54150a14e9
Further elimination of tesseract renderer special casing
...
We don't need to keep a "skip page" around anymore since
skipping means just not grafting on the text layer.
2018-05-01 13:36:20 -07:00
James R. Barlow
88ff091cce
Unify tesseract and sandwich renderer paths
...
Since the new weaving method copies the font and content
stream from the Tesseract PDF, it doesn't matter if Tesseract
happens to have an image or not.
If Tesseract is text-only capable we use that feature for efficiency,
but ignore the image either way.
2018-05-01 13:24:20 -07:00
James R. Barlow
e87a5776f1
Remove now-unnecessary code to rotate pages
...
Track only the decision to change rotation.
2018-05-01 13:01:25 -07:00
James R. Barlow
0806ce6406
Fix rotation for unsplit (modulo --rotate-pages)
2018-04-30 20:58:42 -07:00
James R. Barlow
6409894a71
feature/unsplit-try-imagerotate
2018-04-30 20:48:59 -07:00
James R. Barlow
e7286f6129
Unsplit now works with multipage, --force-ocr
2018-04-30 14:46:20 -07:00
James R. Barlow
2ab94b3151
unsplit: it's alive
...
First successful file output.
2018-04-28 01:57:41 -07:00
James R. Barlow
7ee90890ec
Add copying of essential information from Tesseract textonly
2018-04-27 23:19:08 -07:00
James R. Barlow
383e726d65
Expand size growth reasons to other arguments that trigger transcoding
2018-04-27 19:34:57 -07:00
James R. Barlow
e046f70642
Set OMP_THREAD_LIMIT unconditionally, for pngquant
2018-04-27 19:19:30 -07:00
James R. Barlow
2131ad4670
Fix --remove-background error on PDFs with colormapped images
...
It's unclear how exactly a colormapped image gets to this spot given the
tendency of other image processing tools to flatten such images, but someone
made it happen, so now we make sure the image is okay.
Closes #262
2018-04-27 17:21:01 -07:00
James R. Barlow
219fe2155b
test_pageinfo: remove duplicate import
2018-04-27 17:16:42 -07:00
James R. Barlow
4209034d20
Add gpg key to issue template
2018-04-27 15:51:26 -07:00
James R. Barlow
abcae0c2a4
Fix helpers.py again
2018-04-25 22:10:51 -07:00
James R. Barlow
0934905493
Don't suppress error message from config_notfound
...
Since it showed up in s390x bionic
2018-04-25 21:58:18 -07:00
James R. Barlow
11cd6201d9
helpers: fix missing call to complain()
...
In practice this is probably unreachable.
2018-04-25 21:57:50 -07:00
James R. Barlow
8d2a917676
Page unsplit, development
2018-04-25 21:56:43 -07:00
James R. Barlow
44b4afa534
Begin conversion from page splitting to page markers
2018-04-23 22:57:50 -07:00
James R. Barlow
775be3933c
Cherrypick merge_pages unification
2018-04-20 23:08:15 -07:00
James R. Barlow
df87e21c85
Add support for PDF/A-3
...
No ability to attach files however
2018-04-20 00:06:55 -07:00
Hugo
d761d80750
Use more standard __version__ rather than PILLOW_VERSION (#257)
2018-04-19 23:35:32 -07:00
James R. Barlow
8052019dde
optimize: fix reporting of jbig2 groups
2018-04-19 01:54:44 -07:00
James R. Barlow
a3d8950088
optimize: Don't save JPEGs if larger
2018-04-19 01:25:49 -07:00
James R. Barlow
004f5d3bf1
optimize: further improve decodeparms handling
2018-04-18 15:52:25 -07:00
James R. Barlow
f5d308a156
optimize: refactor tricky /Filter and /DecodeParms handling
2018-04-18 15:30:21 -07:00
James R. Barlow
3869996758
optimize: jbig2 error
2018-04-18 01:36:38 -07:00
James R. Barlow
cdb2107c4e
optimize: jbig2 fix
2018-04-18 01:31:35 -07:00
James R. Barlow
4db2b3413b
optimize: more robustness
2018-04-18 01:25:34 -07:00
James R. Barlow
b2f31bec79
Make optimize a lot safer
2018-04-18 00:20:06 -07:00
James R. Barlow
78f9f4a266
Be more defensive about accessing
2018-04-18 00:11:39 -07:00
James R. Barlow
ad6087c342
optimize: more fixes
2018-04-17 23:58:10 -07:00
James R. Barlow
0d6ef430de
optimize: fix "length not defined"
2018-04-17 23:38:00 -07:00
James R. Barlow
a5942209e8
optimize: fix error on missing /Filter
2018-04-17 23:27:56 -07:00
James R. Barlow
9a60694cfc
optimize: ccitt header fixes
...
Changed to match TIFF spec's use of unsigned types, eliminated check for /Columns.
There is some complex behavior for /Width != /Columns and
(/Width, /Columns) mod 8 != 0 that is not described well in the PDF spec.
2018-04-17 23:27:25 -07:00
James R. Barlow
4bf13f4737
optimize: be less chatty
2018-04-17 23:25:41 -07:00
James R. Barlow
9e89b75186
Merge v6.1.5
2018-04-17 22:51:13 -07:00
James R. Barlow
0b10db91be
Fix regression: Disable Ghostscript JPEG passthrough entirely
v6.1.5
2018-04-17 17:00:24 -07:00
James R. Barlow
1a516b2af9
Fix regression: time stamp test suite failures
2018-04-17 16:59:21 -07:00
James R. Barlow
076363d78e
Disable JPEG passthrough for Ghostscript 9.23
...
Seems to corrupt JPEGs involved in image masks?
2018-04-17 16:31:03 -07:00