2676 Commits

Author SHA1 Message Date
fritz-hh
9271fe73a8 OCRmyPDF.sh: fixes #27
The fix should now be compatible with most implementations of grep
2013-05-02 16:51:46 +02:00
fritz-hh
edaa70b97f OCRmyPDF.sh: fixes #25 and fixes #26
- In debug mode: compute and echo time required for processing
- Resolutions (x/y) that are nearly equal are not supported (because the
test did not take into account imprecision due to truncation)
2013-05-01 15:58:55 +02:00
fritz-hh
ab07f4deea OCRmyPDF.sh: handling of path with spaces
- corrected function absolutePath() to handle paths with spaces correctly
- pdf title metadata: split on case change file name
- change of owner/group/permission removed from code
- improved logging
2013-05-01 13:44:20 +02:00
fritz-hh
beb1d7ab54 release notes: updated for v1.0-rc2 v1.0-rc2 2013-04-29 12:27:43 +03:00
fritz-hh
5ce3e9bfec OCRmyPDF.sh: Version number updated 2013-04-29 12:19:19 +03:00
fritz-hh
2bed210a30 OCRmyPDF.sh: added metadata in final pdf file
- added metadata in final pdf file: fixes #4
- improved logging of PDF/A validation results
2013-04-28 22:18:34 +02:00
fritz-hh
2441551156 OCRmyPDF.sh: final pdf same owner & permissions
fixes #9
2013-04-28 15:54:31 +02:00
fritz-hh
15baca5e08 HocrTransform.py: exit if page size is not found
fixes #21
2013-04-28 14:56:14 +02:00
fritz-hh
062ef0ca3a OCRmyPDF.sh: keep tmp files in debug mode
fixes #22
2013-04-28 14:43:21 +02:00
fritz-hh
24b4686944 release notes: unpaper version added 2013-04-27 14:03:17 +03:00
fritz-hh
d3d1c20ca2 Correct version number
fixes #19
2013-04-27 14:00:59 +03:00
fritz-hh
7f7b81154f Merge branch 'master' of https://github.com/fritz-hh/OCRmyPDF 2013-04-26 19:37:26 +02:00
fritz-hh
6372cec6b8 OCRmyPDF.sh: Fixed major problem with deskew
After deskew, the image was cropped to the wrong size
2013-04-26 19:37:02 +02:00
fritz-hh
5ec875325e Update README.md 2013-04-26 18:46:18 +03:00
fritz-hh
4d80709cfd Update README.md 2013-04-26 18:43:15 +03:00
fritz-hh
2642c1b3d3 Update README.md 2013-04-26 18:00:58 +03:00
fritz-hh
c4cd7e1982 Merge branch 'master' of https://github.com/fritz-hh/OCRmyPDF v1.0-rc1 2013-04-26 16:53:19 +02:00
fritz-hh
b993c158d0 OCRmyPDF: log msg corrected 2013-04-26 16:52:58 +02:00
fritz-hh
1b727042fe release notes updated for v1.0-rc1 2013-04-26 17:50:54 +03:00
fritz-hh
ec26736577 folder structure cleaned
- put all src files (except OCRmyPDF.sh) to src
- rename tesseract_cfg to tess-cfg
2013-04-26 16:34:49 +02:00
fritz-hh
a766c5f2b7 typo 2013-04-26 16:19:18 +02:00
fritz-hh
3b2c804f23 Update README.md 2013-04-26 17:15:55 +03:00
fritz-hh
486ed6f217 Readme updated 2013-04-26 16:12:13 +02:00
fritz-hh
ae716a91cb jhove paths corrected 2013-04-26 16:11:59 +02:00
fritz-hh
d4195b4362 jhove package added 2013-04-26 14:46:47 +02:00
fritz-hh
6ae0452d87 added readme 2013-04-26 14:28:04 +02:00
fritz-hh
815117f653 add test script
aimed at checking if the quality of the images drops quickly or not
2013-04-26 14:26:03 +02:00
fritz-hh
1c0eb03b3b OCRmyPDF.sh: minor improvements
- additional data logged
- width/height were inverted: corrected
- few other minor changes
2013-04-26 14:20:45 +02:00
fritz-hh
3249fba4a2 OCRmyPDF.sh: log to stderr + check PDF/A profile
- fixes #10
- check not only if the final PDF is well formed and valid, but also if
it conforms to the PDF/A profile
2013-04-26 12:23:29 +02:00
fritz-hh
7c173dcc67 Merge branch 'master' of https://github.com/fritz-hh/OCRmyPDF 2013-04-26 11:50:52 +02:00
fritz-hh
357f449e07 OCRmyPDF.sh: check if python is installed
- fixes #14
- minor other changes
2013-04-26 11:50:39 +02:00
fritz-hh
83560cbd1d hocrTransform: font changed to Helvetica
- Font changed to Helvetica (instead of Courier)
- License text deleted (license file already available)
2013-04-26 11:49:21 +02:00
fritz-hh
1860f80cae Update COPYRIGHT.md 2013-04-26 12:19:28 +03:00
fritz-hh
4ea97c4fe4 Update README.md 2013-04-25 12:20:26 +03:00
fritz-hh
ee738be681 Fixed: issue with deskewing: size sometimes wrong
fixes #13
2013-04-25 11:13:30 +02:00
fritz-hh
e21b3155e5 OCRmyPDF.sh: corrected dpi computation
fixes #12
2013-04-24 21:12:35 +02:00
fritz-hh
c293ffd621 OCRmyPDF.sh: minor change in code documentation 2013-04-23 22:57:41 +02:00
fritz-hh
2fdaa7595c OCRmyPDF.sh: better handling of path and tmp folder
- user can now define the name/location of the output file
- check if the folders in which input/output files should be located exist
- tmp folder now built using timestamp and input file name
2013-04-23 22:54:58 +02:00
fritz-hh
968a66f66b Merge branch 'master' of https://github.com/fritz-hh/OCRmyPDF 2013-04-23 21:43:34 +02:00
fritz-hh
5992afb707 Support for additional tesseract config files
This corresponds to the -C option
2013-04-23 21:36:34 +02:00
fritz-hh
9aa83215c4 OCRmyPDF.sh: typo in usage 2013-04-23 00:35:42 +03:00
fritz-hh
939a148812 Update README.md 2013-04-23 00:33:48 +03:00
fritz-hh
4ce249e6ed OCRmyPDF.sh: new debug option (-g) added 2013-04-22 22:50:34 +02:00
fritz-hh
422aaa80f3 hocrTransform.py: various changes
-a option removed
bounding boxes for paragraphs added
color and style of bounding boxes improved
2013-04-22 22:48:41 +02:00
fritz-hh
b9a346ce7d OCRmyPDF.sh: log levels implemented
fixes #5
2013-04-22 20:56:45 +02:00
fritz-hh
64b92ed180 Usage described
fixes #6
2013-04-22 20:35:02 +02:00
fritz-hh
90fc5c9de4 Update README.md 2013-04-21 23:13:04 +03:00
fritz-hh
7118c2f04b Update README.md 2013-04-21 23:09:32 +03:00
fritz-hh
d66712ab42 Update README.md 2013-04-21 23:09:00 +03:00
fritz-hh
c5f2158b85 OCRmyPDF.sh: various changes
fixes #3
fixes #2
2013-04-21 21:59:42 +02:00