2895 Commits

Author SHA1 Message Date
James R. Barlow
b01d9e07e8
Deal with missing pthread_sigmask on Cygwin
Closes #701
2020-12-27 02:24:00 -08:00
James R. Barlow
91db94cf2e
watcher: fix OCR_LOGLEVEL env var not processed
Closes #702
2020-12-27 02:02:44 -08:00
James R. Barlow
416df803d4
pdfinfo: stricter typing 2020-12-24 22:39:00 -08:00
James R. Barlow
037b96ca16
pdfinfo: refactor to eliminate RawPageInfo 2020-12-24 02:57:44 -08:00
James R. Barlow
bb258fc99c
pdfinfo: Refactor pageinfo dictionary into a class 2020-12-24 01:47:53 -08:00
James R. Barlow
4b8ccbe8cb v11.4.1 release notes (tag: v11.4.1) 2020-12-22 01:41:15 -08:00
James R. Barlow
ab1ff3331b
misc: synology fix
Accept user-contributed fix. Not testable.

Close #690.
2020-12-22 01:38:41 -08:00
James R. Barlow
3675ae918c
Fix certain invalid page ranges causing exception
Closes #686
2020-12-22 01:22:14 -08:00
James R. Barlow
0ba32b96b7 Revert "v11.4.0 release notes - remove change not actually implemented"
This reverts commit ad202693b3dcf905e180a665a54f349d00d8dfba.
Temporary folder prefix was actually changed in commit f11bb53e.
2020-12-22 00:47:25 -08:00
James R. Barlow
add64e4fa2 docs: com.github.ocrmypdf -> ocrmypdf.io 2020-12-22 00:46:42 -08:00
James R. Barlow
7fe2954ede
Change wheel tag to py36, update package_data to include py.typed 2020-12-12 16:49:04 -08:00
James R. Barlow
ad202693b3
v11.4.0 release notes - remove change not actually implemented
Remove a change that was pushed back to a future release.
2020-12-12 16:27:38 -08:00
James R. Barlow
594ef83551 v11.4.0 release notes (tag: v11.4.0) 2020-12-11 15:09:49 -08:00
James R. Barlow
78b71618c1 Fix BufferedReader TypeError 2020-12-11 14:19:20 -08:00
James R. Barlow
b8aa89e1ec Fix log message queue flooding on certain files
Fixes #692
2020-12-11 14:14:21 -08:00
James R. Barlow
b4c1f66bc1 typing: tidy up 2020-12-11 14:14:21 -08:00
James R. Barlow
5172dbde8d subprocess: use more mypy-friendly syntax 2020-12-11 14:14:21 -08:00
James R. Barlow
d2908640c6 pdfa: help mypy figure out a type 2020-12-11 14:14:21 -08:00
James R. Barlow
997bf7578d hocrtransform: fix exception if no div ocr_page object 2020-12-11 14:14:21 -08:00
James R. Barlow
043258242c hocrtransform: trivial typing 2020-12-11 14:14:21 -08:00
James R. Barlow
156d5d9a9c cli: typing 2020-12-11 14:14:21 -08:00
James R. Barlow
0b7e52fb5e api: parse cmdline in more type friendly way 2020-12-11 14:14:21 -08:00
James R. Barlow
a5feef07d0 Declare ocrmypdf as typed 2020-12-11 14:14:21 -08:00
James R. Barlow
f11bb53e61 Change prefix of temporary folders
Shouldn't really use a name that suggests a connection to GitHub.
2020-12-07 21:51:46 -08:00
James R. Barlow
68a57a7839
Add feature to generate hocr-pdf with visible debug text 2020-12-04 17:38:48 -08:00
James R. Barlow
4194430dc1
Begin next release notes 2020-12-04 13:28:04 -08:00
James R. Barlow
a707c56fae docs: improve windows instructions 2020-12-04 13:21:54 -08:00
James R. Barlow
3cba50bfbd windows: look in registry for Tesseract and Ghostscript 2020-12-04 13:21:54 -08:00
James R. Barlow
ed5e17d0a4 completions: consider *.PDF and some images too 2020-12-04 13:20:35 -08:00
James R. Barlow
ce0e0ecd4d Decouple tqdm from progressbar setup 2020-12-04 13:20:28 -08:00
James R. Barlow
7e1223c12c
ghostscript: add output tracing 2020-11-29 14:53:35 -08:00
James R. Barlow
b83d7f6d1a
subprocess: refactor and add run_polling_stderr 2020-11-29 14:36:03 -08:00
James R. Barlow
80e957908a
tesseract: fix run call with logs_errors_to_stdout 2020-11-29 14:25:46 -08:00
James R. Barlow
f0e7bea8ba
docs: remove redundant statement 2020-11-27 13:54:36 -08:00
James R. Barlow
0cdb9bd04a
docs: remove description of how OMP_THREAD_LIMIT is managed 2020-11-23 12:36:04 -08:00
James R. Barlow
8224d89bc6
v11.3.4 release notes (tag: v11.3.4) 2020-11-18 11:57:28 -08:00
James R. Barlow
a2bbbe2a26
v11.3.4 release notes 2020-11-18 11:56:29 -08:00
James R. Barlow
43f41863fa
check_pdf: document how we handle linearization 2020-11-18 11:54:07 -08:00
James R. Barlow
d71e50e83d
Fix "readLinearizationData for file that is not linearized"
pikepdf 2.1.0 throws wrong type of exception in this case, so special-case it.

Closes #680
Closes #681
2020-11-18 11:52:17 -08:00
James R. Barlow
1f598da3c1
ghostscript: better docs and comments 2020-11-18 11:34:17 -08:00
James R. Barlow
d0cdbd5e1c
watcher: include uppercase .PDF too 2020-11-12 02:29:47 -08:00
James R. Barlow
5c56f61209
unpaper: type hints 2020-11-11 02:59:37 -08:00
James R. Barlow
9bec85470a Merge branch 'master' of github.com:jbarlow83/OCRmyPDF 2020-11-10 04:08:05 -08:00
James R. Barlow
a03863a17d
docs: fix link to docker image 2020-11-10 04:08:01 -08:00
James R. Barlow
22cd9b2364
docs: fix csv-table errors 2020-11-10 04:07:49 -08:00
pretentious7
4fc7d6d93e
fix typo "charcter" -> "character" (#673) 2020-11-09 16:53:02 -08:00
James R. Barlow
71f0e7f545
v11.3.3 release notes (tag: v11.3.3) 2020-11-07 00:53:33 -08:00
James R. Barlow
895fddd85e
Replace most uses of universal_newlines with text
The parameters are equivalent but the latter is better named. Since
Python 3.6 doesn't support text= we use our wrapper to add it in that
place.

This is for subprocess.run.
2020-11-07 00:48:08 -08:00
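The commit above describes a wrapper that lets callers write text= even on Python 3.6, where subprocess.run only understands universal_newlines=. A minimal sketch of that idea follows. The wrapper name `run` and its exact shape are assumptions for illustration, not the project's actual code:

```python
import subprocess
import sys


def run(args, *, text=None, **kwargs):
    """Hypothetical wrapper: accept text= and forward it as universal_newlines=.

    subprocess.run gained the text= alias in Python 3.7; universal_newlines=
    is the older spelling and is accepted by every supported version.
    """
    if text is not None:
        kwargs["universal_newlines"] = text
    return subprocess.run(args, **kwargs)


# Usage: capture the child's stdout as str rather than bytes.
result = run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    text=True,
)
print(repr(result.stdout))
```

Translating the keyword in one place keeps call sites on the clearer name while still working on interpreters that predate it.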
James R. Barlow
5a59e4d543
unpaper: don't use universal_newlines=True
There's no specific reason to do this. We can log binary output equally
well.
2020-11-07 00:18:27 -08:00
James R. Barlow
b51abf2249
azure: Fix indentation mistake 2020-11-04 12:19:35 -08:00