mawi
39617dd739
fix: remove ruffus
2019-04-08 11:07:32 +02:00
mawi
6590875756
feat: add triage step
...
remove tqdm demo
2019-04-08 10:26:56 +02:00
mawi
01bbf064e0
feat: add tqdm progress bar
...
This is just a POC. Will be removed.
2019-04-05 19:52:38 +02:00
mawi
fc1c4f12f5
feat: add concurrent.futures pipeline
2019-04-05 18:48:34 +02:00
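Note: the commit above only names the approach. As a minimal sketch of what a concurrent.futures-based page pipeline can look like (this is not the code from the commit; `process_page` and `run_pages` are hypothetical names introduced for illustration), per-page work can be fanned out to a thread pool and collected in page order:

```python
from concurrent.futures import ThreadPoolExecutor


def process_page(page_number: int) -> str:
    # Hypothetical stand-in for the real per-page work (rasterize, OCR, ...).
    return f"page-{page_number:03d}"


def run_pages(page_numbers, max_workers: int = 4):
    # executor.map yields results in input order, even if pages
    # finish out of order inside the pool.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(process_page, page_numbers))
```

Because `executor.map` preserves input order, the caller can merge the page results without sorting them again.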
mawi
2647382cf6
fix: most of the tests (37 failed, 133 passed, 28 skipped)
2019-04-05 14:06:07 +02:00
mawi
783a128bd1
feat: move to sync (non-ETL) implementation - remove ruffus
2019-04-04 21:02:38 +02:00
Martin Wind
b214aa5b38
feat: move to sync (non-ETL) implementation
2019-04-03 19:59:43 +02:00
James R. Barlow
6e49bb3588
v8.2.3 notes
v8.2.3
2019-04-03 01:19:12 -07:00
Martin Wind
aa512b6181
feat: move to sync (non-ETL) implementation (WIP)
2019-04-02 20:03:09 +02:00
Martin Wind
a4667b5656
refactor: move ruffus related code to one file
2019-03-28 20:16:10 +01:00
Martin Wind
f65a3d3762
fix import in unpaper test
2019-03-26 10:04:26 +01:00
Martin Wind
2fa43ecf09
refactor: split argparse and run_pipeline
2019-03-26 08:10:20 +01:00
James R. Barlow
427afc0616
Fix LeptonicaErrorTrap when sys.stderr.fileno() is not available
...
The LeptonicaErrorTrap was problematic for Celery and other
libraries that mess with stderr.
Closes #359
2019-03-17 14:22:36 -07:00
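Note: the failure mode described above arises when something replaces sys.stderr with an object that has no real file descriptor. A minimal sketch of a defensive lookup, assuming the goal is simply to detect that case (the helper name `stderr_fileno_or_none` is hypothetical and is not taken from the LeptonicaErrorTrap code):

```python
import io
import sys


def stderr_fileno_or_none(stream=None):
    """Return the stream's file descriptor, or None if it has none."""
    stream = sys.stderr if stream is None else stream
    try:
        return stream.fileno()
    except (AttributeError, ValueError, io.UnsupportedOperation):
        # Replaced or closed streams (for example io.StringIO) expose
        # no usable descriptor, so fd-level redirection must be skipped.
        return None
```

A caller that redirects output at the descriptor level can fall back to doing nothing when this returns None.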
James R. Barlow
9c7ee2bf23
Better help text for --verbose
2019-03-17 13:29:25 -07:00
James R. Barlow
c5cfaa950b
readme: tweaks
2019-03-16 14:09:19 -07:00
James R. Barlow
4e2a98ead4
leptonica: fix junkpixt harder
2019-03-16 14:08:58 -07:00
James R. Barlow
210f134b5b
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
2019-03-08 15:38:31 -08:00
James R. Barlow
696c0721a0
docs: fix broken sphinx ref
...
[ci skip]
2019-03-08 15:38:26 -08:00
James R. Barlow
aabab95418
docs: use images folder
2019-03-08 15:38:01 -08:00
James R. Barlow
7d614dd68b
docs: explain Automator workflow
2019-03-08 15:37:42 -08:00
jumblies
f57dda7939
Update batch.rst (#362)
...
Added docker instructions for passing "find" filenames into container. Obviates prior incorrect flag fix.
2019-03-08 12:46:50 -08:00
James R. Barlow
1b4542aa77
Further fixes to external program version testing
v8.2.2
2019-03-07 14:27:16 -08:00
James R. Barlow
6c7fca57ec
v8.2.1 notes
2019-03-06 22:22:50 -08:00
James R. Barlow
486dc7e22c
Fix some test failures missed in prev commit
2019-03-06 13:28:50 -08:00
James R. Barlow
dc616bb507
Fix test suite so --clean is not requested when unpaper is not installed
2019-03-05 22:33:13 -08:00
James R. Barlow
902bda43e3
main: fix version testing unnecessarily throwing exception to itself
2019-03-05 22:32:06 -08:00
James R. Barlow
f7da63f68b
main: fix redundant argument test
2019-03-05 22:29:29 -08:00
James R. Barlow
5da26e4c9c
Convert most uses of subprocess.Popen to subprocess.run in test suite
2019-03-05 22:25:22 -08:00
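Note: as an illustration of the kind of conversion this commit describes (a sketch only; `run_and_capture` is a hypothetical helper, not a function from the test suite), subprocess.run replaces the Popen plus communicate() pattern with a single call that returns a CompletedProcess:

```python
import subprocess
import sys


def run_and_capture(args):
    # subprocess.run starts the process, waits for it, and collects
    # its output, which Popen required communicate() to do.
    proc = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text; text=True is the newer alias
        check=False,  # caller inspects returncode instead of getting an exception
    )
    return proc.returncode, proc.stdout, proc.stderr
```

For example, `run_and_capture([sys.executable, "-c", "print('ok')"])` returns the exit code together with the captured stdout and stderr strings.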
James R. Barlow
c19c852705
Fix exception while attempting to print error message for missing program
2019-03-05 16:32:48 -08:00
James R. Barlow
a27ee3ee8c
optimize: use Decode to invert 1bpp PNGs for now
v8.2.0
2019-03-03 17:50:12 -08:00
James R. Barlow
c2f316c2c5
v8.2.0 release notes: optimizer
2019-03-03 15:26:01 -08:00
James R. Barlow
974979b0a0
Merge branch 'feature/optimization-fixes'
2019-03-03 15:00:20 -08:00
James R. Barlow
66586bdaab
optimize: Disable jpg->png migration
...
Needs more testing before release
2019-03-03 14:59:59 -08:00
James R. Barlow
01d2ea309f
Fix Predictor name and photometric flip
2019-03-03 14:57:15 -08:00
James R. Barlow
e918480351
v8.2.0 release notes
2019-03-03 14:15:20 -08:00
James R. Barlow
52fd84fa95
Remove debug message
2019-03-03 13:31:10 -08:00
James R. Barlow
2c56b0935c
docs: minor
2019-03-03 03:28:17 -08:00
James R. Barlow
4f69ace868
optimize: fix all JBIG2 images binned on last page
...
During some past refactor it appears we now end up treating
all JBIG2 images as if they appeared on the last page in the
file. This bug had no visual side effects but probably led to
suboptimal JBIG2 encoding.
2019-03-03 03:28:17 -08:00
James R. Barlow
497c531112
optimize: update comments
2019-03-03 03:28:17 -08:00
James R. Barlow
b27b92fbf3
optimize: on aggressive settings try JPG to PNG transcoding
...
If the color count of an image is low such as when black and white
documents are scanned in color, PNG with lossy quantization may
produce a superior encoding to JPEG. This is expensive to test however.
2019-03-03 03:28:17 -08:00
James R. Barlow
2e6ba2df8c
optimize: fix recoding of PNGs
...
Previously we opened pngquant-compressed PNGs with transcoding
because the transcode free function in Leptonica didn't seem to
work. This meant Leptonica may have thrown away the hard work of
pngquant if it didn't understand the encoding.
This change resolves the issue and allows us to open PNG encoded
data and insert it into a PDF without transcoding. Should improve
encoding quality.
2019-03-03 03:28:17 -08:00
James R. Barlow
67a405c6b7
Move install-time external program checks out of setup.py
...
We did runtime tests for several of them anyway, and it's better to do
at runtime since config may change after installation.
2019-03-03 03:26:56 -08:00
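Note: a runtime check for external programs can be as small as a PATH lookup. The sketch below assumes that checking for presence is enough (the actual checks may also test versions); `missing_programs` is a hypothetical name used only for illustration:

```python
import shutil


def missing_programs(names):
    # shutil.which returns None when a program is not found on PATH.
    return [name for name in names if shutil.which(name) is None]
```

Running this at startup rather than at install time reflects the reasoning in the commit message: the set of installed programs may change after installation.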
James R. Barlow
58e6663806
Update test cache for french->german change
2019-03-03 03:23:59 -08:00
James R. Barlow
602570fcf9
Update requirements
2019-03-03 02:27:56 -08:00
James R. Barlow
691f8ce254
Docs: reorganize for new docker-alpine image
2019-03-01 23:15:32 -08:00
James R. Barlow
22812e74b9
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
2019-02-26 13:01:59 -08:00
Martin Wind
9d824e723d
Add Dockerfile based on alpine:3.9 (#354)
...
* Do not exclude .git from docker build
* Use multi-stage builds to keep the image size down
* Copy project files to get the test suite.
* Add webservice
* Add tesseract language data for German and Chinese Simplified
2019-02-26 13:01:38 -08:00
James R. Barlow
5dad800d85
Add version to build-system declaration
2019-02-26 12:58:44 -08:00
James R. Barlow
56a56a4dcb
docs: avoid importing ocrmypdf
2019-02-26 12:57:50 -08:00
James R. Barlow
3f1d9ef99c
Fix tests for move to Alpine dockerfile
2019-02-26 12:30:21 -08:00