2676 Commits

Author SHA1 Message Date
fritz-hh
463b04e795 typo 2014-01-06 20:09:12 +01:00
fritz-hh
c0d8508264 minor change in log msg 2014-01-06 19:30:19 +01:00
fritz-hh
6ef4ba31e2 help and documentation improved 2014-01-05 22:02:12 +01:00
fritz-hh
10a3d26291 default PDI definition moved to cfg file 2014-01-05 22:01:45 +01:00
fritz-hh
ab994b32ee explanations added for no_ligature cfg file 2014-01-05 22:01:10 +01:00
fritz-hh
9352b71d78 Copyright added 2014-01-05 22:00:12 +01:00
fritz-hh
71593421ed minor change 2014-01-05 21:22:31 +01:00
fritz-hh
2754970f37 Echo arguments of script in debug mode 2014-01-04 21:43:41 +01:00
fritz-hh
5945454597 Support for -f option
Fixes #16
2014-01-04 21:24:33 +01:00
fritz-hh
884dbce712 copyright years updated 2014-01-04 21:20:29 +01:00
fritz-hh
8ee1bc6598 Minor change 2014-01-04 21:19:51 +01:00
fritz-hh
7d76c46731 Check if page already contains a font 2014-01-04 18:05:21 +01:00
fritz-hh
f8ccf42c06 path to tmp folder now defined in config.sh 2014-01-04 17:24:35 +01:00
fritz-hh
0abe0f1f10 minor change 2014-01-03 17:00:35 +01:00
fritz-hh
ee8a5d80ff echo also java version in debug mode 2014-01-03 16:27:11 +01:00
fritz-hh
f08893b5c8 Support section added 2014-01-03 16:13:17 +01:00
fritz-hh
41cd88506e Echo version of the used tools
Fixes #35
2014-01-03 15:59:51 +01:00
fritz-hh
081223b138 Delete 2013_09_LED_und_Energiesparlampen.pdf
file committed by mistake... So deleting it now
2013-12-31 23:38:38 +01:00
fritz-hh
4e60c9ba09 Warn user in case of low resolution 2013-12-30 23:55:26 +01:00
fritz-hh
95fe7cd3bc Oversampling + more than 1 img
- Oversampling resolution can now be set from the cmd line (-o option)
- If a page contains more than one image, warn the user but process the
page anyway with a default resolution
2013-12-30 23:44:38 +01:00
fritz-hh
79ec1d994e Automatic oversampling
- If resolution is too low (<250dpi) perform automatic oversampling of
the image
- comments improved
- log messages improved
2013-12-30 22:27:10 +01:00
fritz-hh
045362425f minor change 2013-12-30 19:16:29 +01:00
fritz-hh
2b2637fbc3 minor change 2013-12-30 18:21:03 +01:00
fritz-hh
bfc4f7a28d better resolution handling (fixes #38)
- dpi computation moved to a dedicated function
- do not exit in case of resolution mismatch (fixes #38)
- comments improved
2013-11-29 10:37:09 +01:00
fritz-hh
407670e1f3 Minor change 2013-11-29 10:34:05 +01:00
fritz-hh
d0671d81b5 New log level added (LOG_WARN) 2013-11-29 10:33:46 +01:00
fritz-hh
7a74ebbcc3 comments and log messages improved 2013-11-29 00:39:44 +01:00
fritz-hh
9e69800332 typo 2013-11-28 00:37:57 +01:00
fritz-hh
7542188592 Removed bashism
== does not exist in bourne shell
2013-11-27 23:44:45 +01:00
fritz-hh
b4a23c005d fixes #34
tell GNU parallel to protect against evaluation by the sub shell (-q
flag).
This is required in case the file name passed as argument contains
special characters like "#"
2013-11-27 23:15:54 +01:00
fritz-hh
5e0f8be4b1 Various improvements
-Constants moved to config.sh
- Use "python2" cmd instead of "python"
- few other minor changes
2013-11-27 22:34:21 +01:00
fritz-hh
50dee55606 File Test_Issue_#28 renamed 2013-11-27 22:30:43 +01:00
fritz-hh
da5cd01fe4 copyright line added 2013-05-06 23:13:29 +03:00
fritz-hh
d3fb317d41 readme updated
new feature: Process several pages in parallel if more than one CPU core
is available
2013-05-06 21:54:41 +02:00
fritz-hh
88ddeb1fb6 OCRmyPDF.sh: added dependency to GNU parallel 2013-05-06 21:54:05 +02:00
fritz-hh
f9e2e74bf3 Merge remote-tracking branch 'origin/v1.x' into v2.x 2013-05-06 21:35:34 +02:00
fritz-hh
87e01aff60 readme updated for v1.0-stable v1.0-stable 2013-05-06 21:29:15 +02:00
fritz-hh
7e8481186a OCRmyPDF.sh: metadata not added anymore
Removed feature to add metadata in final pdf file (because it led to a
final PDF file that does not comply with the PDF/A-1 format)
2013-05-06 21:26:33 +02:00
fritz-hh
2b0103a4e6 basic implementation of parallel page processing
- basic implementation of parallel page processing using GNU parallel
- processing around 40% faster on dual core processor
2013-05-05 22:33:54 +02:00
fritz-hh
064d4be83c Merge remote-tracking branch 'origin/v1.x' into v2.x
Conflicts:
	OCRmyPDF.sh

Fixes #31
2013-05-05 21:01:17 +02:00
fritz-hh
ab536d5678 OCRmyPDF.sh: fixes issue for files having spaces
fixes #31
2013-05-05 20:56:45 +02:00
fritz-hh
9db805c4ad new file to OCR one page
Required to perform OCR of several pages in parallel (using GNU
parallel)
2013-05-05 20:45:27 +02:00
fritz-hh
f7923a9761 OCRmyPDF.sh: few variables renamed for clarity 2013-05-05 20:44:03 +02:00
fritz-hh
fd52650255 .gitattribute: handle *.jar and *.pdf as binary 2013-05-05 16:54:41 +02:00
fritz-hh
f0fe295175 jhove config: fixes #29 2013-05-05 16:36:54 +02:00
fritz-hh
2f89aa3935 .gitignore corrected + jhove jar files added
.gitignore file corrected, because it prevented some required jhove
binary files from being checked in (jar files)
2013-05-04 22:01:03 +02:00
fritz-hh
5aa27343e0 delete test file 2013-05-04 21:58:01 +02:00
fritz-hh
5ce2841389 JHove: deleted doc + source
Deleted a number of jhove files that are not required
(documentation and java source code mainly)
Goal: reduce size of the package
2013-05-04 21:55:39 +02:00
fritz-hh
e4ffb58269 OCRmyPDF.sh: provision for parallel pages processing 2013-05-02 22:06:16 +02:00
fritz-hh
2ce3d9e19d added file to reproduce #28 2013-05-02 17:21:17 +02:00