2895 Commits

Author SHA1 Message Date
James R. Barlow
709c01c7a1 Regroup three merge steps into a single step
All take the same inputs and deliver similar outputs, so it makes sense.
2018-04-06 01:07:02 -07:00
James R. Barlow
4a341c9034 Merge branch 'master' into feature/jbig2-2018 2018-04-05 21:29:39 -07:00
James R. Barlow
be41ff6d54 Update flowchart
[ci skip]
2018-04-05 21:26:37 -07:00
James R. Barlow
1dbb6f1746 Notes on relevant envvars, repology 2018-04-05 02:15:01 -07:00
James R. Barlow
753e6274ab Tell unpaper to use --layout none so it won't blank out multi column text 2018-04-05 02:14:33 -07:00
James R. Barlow
7f462c618b v6.1.3 notes v6.1.3 2018-04-03 00:11:20 -07:00
James R. Barlow
d8ac6e28ab Convert monochrome images to JBIG2
Awkwardly using fitz and pikepdf, transcode monochrome to CCITT.
This requires _OCRMYPDF_NO_FITZ=1.

00000x.opt.pdf can be checked for JBIG2 to confirm, but this file is
not passed to the output since it's not all wired up yet.
2018-04-03 00:00:53 -07:00
James R. Barlow
a95ffcdc46 Experimental add jbig2
It appears that fitz forces conversion of jbig2 to ccitt no matter what,
so pikepdf will be needed to patch jbig2 images.
2018-04-03 00:00:53 -07:00
James R. Barlow
1b01d45dd2 Warn about Python 3.5 page count issue 2018-04-02 19:29:17 -07:00
James R. Barlow
7a1cd39b21 Fix creation date metadata lost from input
Closes #247
2018-04-02 17:53:39 -07:00
James R. Barlow
1c1fd9616a Don't depend on pytest-xdist in setup.cfg 2018-04-02 11:45:03 -07:00
Sean Whitton
11e19e4085 remove addopts key from tool:pytest section of setup.cfg (#246)
The '-n' command line argument is not supported by recent pytest.
2018-04-02 14:43:38 -04:00
James R. Barlow
2a43f73228 Update installation.rst, further info on fitz 2018-04-02 11:32:57 -07:00
James R. Barlow
b1d1310a75 Dockerfile: use fitz 2018-04-02 11:08:03 -07:00
James R. Barlow
0e7fa78e65 Remove inaccurate statement from setup.py 2018-04-01 13:20:17 -07:00
James R. Barlow
4032570d97 Change docs for fitz/PyMuPDF 2018-04-01 13:19:57 -07:00
James R. Barlow
90644a3017 pipeline: refactoring, use with block for images 2018-03-31 13:26:40 -07:00
James R. Barlow
4f6bffb477 Update copyrights 2018-03-31 11:54:38 -07:00
James R. Barlow
158f902c3b Fixed setup.py syntax error v6.1.2 2018-03-30 14:00:36 -07:00
James R. Barlow
6dc25ddc6e v6.1.2: add license to wheels, depend on defusedxml 2018-03-30 13:22:35 -07:00
James R. Barlow
7f6aaeaecf v6.1.2 2018-03-30 12:39:33 -07:00
James R. Barlow
ace439910e Remove PyMuPDF 1.12.4 shim 2018-03-30 12:33:27 -07:00
James R. Barlow
7f038568de Add envvar to ease testing without PyMuPDF 2018-03-30 12:32:48 -07:00
James R. Barlow
af777c0b6a Test macos without fitz too v6.1.1 2018-03-30 00:13:09 -07:00
James R. Barlow
fc299032a4 v6.1.1 release notes
Better get the last one out
2018-03-30 00:11:52 -07:00
James R. Barlow
e0f3f07907 Fix text reported as found on all pages when PyMuPDF is not available 2018-03-30 00:10:53 -07:00
James R. Barlow
b36df9cf9e pdfa: codecs.encode -> hexlify (simpler) 2018-03-29 22:17:23 -07:00
James R. Barlow
81c3f780d4 Travis: Should test 3.6 Linux without fitz too v6.1.0 2018-03-28 23:54:43 -07:00
James R. Barlow
b51efdd3e3 Travis: don't upload to legacy PyPI anymore, it will stop working soon 2018-03-28 23:40:29 -07:00
James R. Barlow
610b769df9 Update release notes 2018-03-28 23:33:34 -07:00
James R. Barlow
527f4d0101 Work around fitz not escaping parentheses
Closes #239
2018-03-28 23:23:34 -07:00
James R. Barlow
8d9be43c60 test_bookmarks_preserved won't raise ImportError any more
Due to trapping this in ocrmypdf.lib
2018-03-28 23:22:55 -07:00
James R. Barlow
40ef4f0bbe Add new argument --skip-repair to skip the repair step 2018-03-28 00:54:58 -07:00
James R. Barlow
d0271d5049 More debug messages on repair; update notes 2018-03-28 00:39:38 -07:00
James R. Barlow
5becfcf8ea Refactor fitz ImportError trap 2018-03-27 21:38:02 -07:00
James R. Barlow
112e8d6c18 Fix regression: PDF/A broken without fitz 2018-03-27 21:33:10 -07:00
James R. Barlow
1d8d49a01d Add PyMuPDF to preamble 2018-03-27 21:32:38 -07:00
James R. Barlow
5050155685 Add warning for large file size increases 2018-03-27 15:49:16 -07:00
James R. Barlow
a9bd494cc0 Merge branch 'optional-fitz' 2018-03-27 13:36:33 -07:00
James R. Barlow
6a4df78bc0 Add _naive_find_text to search for text when fitz is not available 2018-03-27 13:36:17 -07:00
James R. Barlow
530eae3898 Fix test_main missing file_claims_pdfa 2018-03-26 15:33:53 -07:00
James R. Barlow
3e444f6a90 Make fitz optional 2018-03-26 13:22:09 -07:00
James R. Barlow
45dbff6401 Fix table of contents not preserved in PDF/A 2018-03-26 02:23:19 -07:00
James R. Barlow
bc56b8e058 Move metadata tests to new test_metadata 2018-03-26 01:49:25 -07:00
James R. Barlow
d86e315c48 v6.0.1 start release notes 2018-03-26 01:44:01 -07:00
James R. Barlow
746969207a Remove deprecated --pdf-renderer tess4, which was renamed to sandwich
Should have been cut in v6.0.0
2018-03-26 01:17:22 -07:00
James R. Barlow
1caebaefb5 tesseract: Fix FileExistsError if output file was created at timeout 2018-03-25 21:38:20 -07:00
James R. Barlow
2d10fdcf0f Fix typo in release notes 2018-03-25 21:37:06 -07:00
James R. Barlow
355ec70a80 Note other web frontends 2018-03-25 21:36:39 -07:00
James R. Barlow
a2f499de01 Remove pageinfo.py which release notes said was gone for v6 2018-03-25 12:16:56 -07:00