2895 Commits

Author SHA1 Message Date
James R. Barlow
15a988b999 weave: use emplacement method, scrap TOC repair
The new emplacement method updates page objects in place without
generating new objgen numbers, meaning we no longer need to update the table
of contents to preserve links.
2019-05-11 12:40:25 -07:00
James R. Barlow
83398e54ea weave: fix corruption of certain high page count files
Corruption occurred when replacements was not incremented for multiple
consecutive pages.
2019-05-11 12:22:21 -07:00
James R. Barlow
bcdd196699 ghostscript: remove unnecessary post-render resizing step 2019-05-11 12:10:50 -07:00
James R. Barlow
0cd576e701 Rename bash completions file 2019-05-11 10:52:42 -07:00
James R. Barlow
4d5e0eb749 docs: mention completions 2019-05-06 18:07:41 -07:00
Frank
7ed0f8f50e Add bash completion (#384)
* Add bash completion

The file must be copied to the completion folder, e.g. /usr/share/bash-completion/completions
2019-05-06 15:18:15 -07:00
James R. Barlow
79c84eefa3 Fix main.txt v8.2.4 2019-04-23 02:21:31 -07:00
James R. Barlow
5398003160 Fix test.txt 2019-04-23 00:42:40 -07:00
James R. Barlow
58b2bed99d v8.2.4 notes 2019-04-23 00:07:12 -07:00
James R. Barlow
58c29ffb5c weave: use explicit pdf.close(), drastically reduce open file handles
With the new pikepdf 1.2.0 we no longer need to hold file handles
open because of the "copy to memory" functionality. We retain
the behavior of closing/reopening the output PDF every 100 pages as
a way to limit memory usage.
2019-04-18 15:12:48 -07:00
James R. Barlow
f615b6f0e8 pdfinfo: be more specific about detecting XFA we can't render 2019-04-18 15:07:25 -07:00
James R. Barlow
e0c8dadcce Explicitly close most pikepdf.Pdf when done with them 2019-04-18 15:02:12 -07:00
James R. Barlow
9a86f53109 Ignore pip-wheel-metadata folder
https://github.com/pypa/pip/issues/6213
2019-04-18 10:42:10 -07:00
James R. Barlow
91cb092aa0 Remove PyCharm debugger hack 2019-04-18 10:15:02 -07:00
James R. Barlow
f4b87915df Fix --redo-ocr 2019-04-15 13:11:26 -07:00
James R. Barlow
922a107b7f Remove safety traversal of PDF table of contents
qpdf fixed the dangling reference issue (qpdf #240) in 8.3.0, which is
required by pikepdf 1.1.0. We no longer need the workaround.
2019-04-13 00:24:03 -07:00
mawi
1c44fd4f3b fix: typo 2019-04-08 15:01:04 +02:00
mawi
c92ccc6134 fix: tests 2019-04-08 14:57:42 +02:00
mawi
1137534e97 fix: update pytest version
Solves install error: pkg_resources.ContextualVersionConflict: (pytest 4.3.0 (/app/.eggs/pytest-4.3.0-py3.6.egg), Requirement.parse('pytest>=4.4.0'), {'pytest-xdist'})
2019-04-08 11:08:29 +02:00
mawi
39617dd739 fix: remove ruffus 2019-04-08 11:07:32 +02:00
mawi
6590875756 feat: add triage step
remove tqdm demo
2019-04-08 10:26:56 +02:00
mawi
01bbf064e0 feat: add tqdm progress bar
This is just a POC. Will be removed.
2019-04-05 19:52:38 +02:00
mawi
fc1c4f12f5 feat: add concurrent.futures pipeline 2019-04-05 18:48:34 +02:00
mawi
2647382cf6 fix: most of the tests (37 failed, 133 passed, 28 skipped) 2019-04-05 14:06:07 +02:00
mawi
783a128bd1 feat: move to sync (non-ETL) implementation - remove ruffus 2019-04-04 21:02:38 +02:00
Martin Wind
b214aa5b38 feat: move to sync (non-ETL) implementation 2019-04-03 19:59:43 +02:00
James R. Barlow
6e49bb3588 v8.2.3 notes v8.2.3 2019-04-03 01:19:12 -07:00
Martin Wind
aa512b6181 feat: move to sync (non-ETL) implementation (WIP) 2019-04-02 20:03:09 +02:00
Martin Wind
a4667b5656 refactor: move ruffus related code to one file 2019-03-28 20:16:10 +01:00
Martin Wind
f65a3d3762 fix import in unpaper test 2019-03-26 10:04:26 +01:00
Martin Wind
2fa43ecf09 refactor: split argparse and run_pipline 2019-03-26 08:10:20 +01:00
James R. Barlow
427afc0616 Fix LeptonicaErrorTrap when a sys.stderr.fileno() is not available
The LeptonicaErrorTrap was problematic for Celery and other
libraries that mess with stderr.

Closes #359
2019-03-17 14:22:36 -07:00
James R. Barlow
9c7ee2bf23 Better help text for --verbose 2019-03-17 13:29:25 -07:00
James R. Barlow
c5cfaa950b readme: tweaks 2019-03-16 14:09:19 -07:00
James R. Barlow
4e2a98ead4 leptonica: fix junkpixt harder 2019-03-16 14:08:58 -07:00
James R. Barlow
210f134b5b Merge branch 'master' of github.com:jbarlow83/OCRmyPDF 2019-03-08 15:38:31 -08:00
James R. Barlow
696c0721a0 docs: fix broken sphinx ref
[ci skip]
2019-03-08 15:38:26 -08:00
James R. Barlow
aabab95418 docs: use images folder 2019-03-08 15:38:01 -08:00
James R. Barlow
7d614dd68b docs: explain Automator workflow 2019-03-08 15:37:42 -08:00
jumblies
f57dda7939 Update batch.rst (#362)
Added Docker instructions for passing "find" filenames into the container. Obviates the prior incorrect flag fix.
2019-03-08 12:46:50 -08:00
James R. Barlow
1b4542aa77 Further fixes to external program version testing v8.2.2 2019-03-07 14:27:16 -08:00
James R. Barlow
6c7fca57ec v8.2.1 notes 2019-03-06 22:22:50 -08:00
James R. Barlow
486dc7e22c Fix some test failures missed in prev commit 2019-03-06 13:28:50 -08:00
James R. Barlow
dc616bb507 Fix test suite so --clean is not requested when unpaper is not installed 2019-03-05 22:33:13 -08:00
James R. Barlow
902bda43e3 main: fix version testing unnecessarily throwing exception to itself 2019-03-05 22:32:06 -08:00
James R. Barlow
f7da63f68b main: fix redundant argument test 2019-03-05 22:29:29 -08:00
James R. Barlow
5da26e4c9c Convert most uses of subprocess.Popen to subprocess.run in test suite 2019-03-05 22:25:22 -08:00
James R. Barlow
c19c852705 Fix exception while attempting to print error message for missing program 2019-03-05 16:32:48 -08:00
James R. Barlow
a27ee3ee8c optimize: use Decode to invert 1bpp PNGs for now v8.2.0 2019-03-03 17:50:12 -08:00
James R. Barlow
c2f316c2c5 v8.2.0 release notes: optimizer 2019-03-03 15:26:01 -08:00