James R. Barlow
6d3f9ff15a
api: rework ocr() slightly to simplify variable handling
2020-11-03 17:10:52 -08:00
James R. Barlow
5d1d1a712b
docs: more details about macOS API changes
...
Due to the multiprocessing start method changing from fork to spawn.
2020-11-03 17:09:58 -08:00
James R. Barlow
6d5f8133e0
docs: show ifmain guard in example
2020-11-03 15:28:33 -08:00
James R. Barlow
13018d3d5c
ci: Extend test matrix to Python 3.9
2020-11-03 04:15:14 -08:00
James R. Barlow
14a85f9473
Fix pinned dependencies
v11.3.2
2020-11-03 04:12:47 -08:00
James R. Barlow
d22a1b3367
v11.3.2 release notes (2)
...
Since we never tagged it, fix other things.
2020-11-03 02:03:25 -08:00
James R. Barlow
b913e5dfef
ghostscript: don't repeat log in debug
...
Subprocess already does this for us.
2020-11-03 01:45:06 -08:00
James R. Barlow
dd8a5a4c72
Fix log domain names
...
ocrmypdf.subprocess.subprocess.ghostscript -> ocrmypdf.subprocess.ghostscript
2020-11-03 01:44:35 -08:00
James R. Barlow
36e9a54f02
Remove extraneous page rotation
...
This was added in commit b5ccbfd but seems to have been ill-advised.
2020-11-03 01:34:28 -08:00
James R. Barlow
3707af3b74
Change pdf.root to pdf.Root
2020-11-03 01:30:31 -08:00
James R. Barlow
ced7ad9164
unpaper: round off DPI
2020-11-03 01:14:57 -08:00
James R. Barlow
54bbbfdeb3
Fix UnboundLocalError when considering ImageMasks for optimization
...
Uncovered by test file in issue 667, although unrelated to that issue.
2020-11-03 01:08:14 -08:00
James R. Barlow
7f73a6ed1e
Some Python 3.9 fixes
2020-11-03 00:45:47 -08:00
James R. Barlow
dce206d3dc
Fix pre-commit for Py3.9
2020-11-03 00:20:25 -08:00
James R. Barlow
9304c856cf
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
2020-11-02 02:47:36 -08:00
James R. Barlow
e5df98cbdf
v11.3.2 release notes
2020-11-02 02:43:32 -08:00
James R. Barlow
19bf3aeb00
api: improve typing
2020-11-02 02:33:34 -08:00
James R. Barlow
e86be0031c
unpaper: fix process output handling
...
With the ocrmypdf.subprocess wrapper, logging the output here
is redundant and loses the page number context.
2020-11-02 01:07:41 -08:00
James R. Barlow
6425977998
unpaper: use pnm instead of png
...
Some users reported problems with PNG recently; try PNM.
Fixes #665
Fixes #667
2020-11-02 01:05:56 -08:00
James R. Barlow
d57df2d980
subprocess: support programs that write their messages to stdout
2020-11-02 01:00:59 -08:00
James R. Barlow
664d0c7969
Document configure_debug_logging
2020-11-02 00:59:00 -08:00
James R. Barlow
a354663ee1
Fix typo in API documentation
2020-11-02 00:58:28 -08:00
Graham Miln
b21b048ec4
Add macOS brew language support (#615)
...
Note `brew` command for installing additional languages on macOS.
2020-10-30 01:09:06 -07:00
James R. Barlow
709c65b41a
v11.3.1 release notes
v11.3.1
2020-10-27 23:11:11 -07:00
James R. Barlow
67f99c5bb7
Endorse pdfminer.six 20201018
2020-10-27 23:09:45 -07:00
James R. Barlow
d55e673d9c
Fix warning about --pdfa-image-compression argument at wrong times
...
Closes #663
2020-10-27 23:09:45 -07:00
James R. Barlow
21b90d2d14
Endorse pikepdf 2.x
2020-10-27 23:09:45 -07:00
Edward Betts
2def7e3392
Use % for percentage in string format (#643)
2020-10-27 23:09:14 -07:00
James R. Barlow
b0dcaa7512
v11.3.0 release notes
v11.3.0
2020-10-24 03:19:32 -07:00
James R. Barlow
e8285b1d10
Add test to confirm rasterize_pdf_page rotates correctly
2020-10-24 03:10:59 -07:00
James R. Barlow
5ba56adb53
Fix page rotation issue (again)
...
Commit 1327ab3 introduced a fix for a regression, which was reported
in #581 and #634. It appears that the actual cause of this issue was
default parameters to rasterize_pdf_page in pluggy not working as
expected, causing a default rotation=0 even when a rotation was needed.
As such, the OCR image was generated with the wrong orientation,
causing the initial regression and the fix in commit 1327ab3.
Now that the real problem is identified, it's apparent that the logic
prior to 1327ab3 was sound, and we can revert to it since it fixes
all known cases, including #658.
This reverts 1327ab3 except for retaining improvements to rotation output.
2020-10-24 02:45:21 -07:00
James R. Barlow
ca735278e0
setup: Version pluggy better
2020-10-24 02:35:41 -07:00
James R. Barlow
b5ccbfdf25
Fix hookspec of rasterize_pdf_page to remove default parameters
2020-10-24 02:35:18 -07:00
James R. Barlow
8c35d6e6e4
Fix debug log messages being suppressed from child processes
2020-10-22 02:20:06 -07:00
James R. Barlow
d1e0c81eda
Ensure worker_pdf is closed after gathering info in a thread
...
This is hacky, uses global state, but it does improve the situation for now.
2020-10-22 00:38:24 -07:00
James R. Barlow
10c8e4f8b4
Only create debug.log when running from command line
...
When used as a library ocrmypdf shouldn't make policy decisions, like where to
put a log file. Unsurprisingly, creating it causes problems for library users
because we deleted the temporary folder which held the log file and made no
effort to move it to a new location.
Also update the documentation to better describe how an application should
handle this.
Closes #657
2020-10-20 01:29:36 -07:00
James R. Barlow
6be2242c21
Describe "OCR" step as "Image processing" when --tesseract-timeout=0
...
Fixes #647
2020-10-08 01:03:42 -07:00
James R. Barlow
204c9d6ae1
Fix inverted colors during JBIG2 optimization on paletted images
...
Fixes #640
v11.2.1
2020-10-07 04:08:50 -07:00
James R. Barlow
6eb393590b
v11.2.0 release notes
...
Change v11.1.3 to v11.2.0 since it contains functional changes.
v11.2.0
2020-10-06 03:24:31 -07:00
James R. Barlow
07c6654057
v11.1.3 release notes
2020-10-06 03:22:48 -07:00
James R. Barlow
4e15eb8d14
Fix image optimization discarding image masks and soft masks associated with PNGs
...
Fixes #648
2020-10-06 03:20:54 -07:00
James R. Barlow
8b01ab8ad2
Better type checking on ocrmypdf.ocr(plugins=...)
2020-10-05 15:02:34 -07:00
James R. Barlow
e0a522ad50
Document the example plugin
2020-10-05 15:01:44 -07:00
James R. Barlow
a1a8788c5a
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
v11.1.2
2020-09-29 02:46:27 -07:00
James R. Barlow
cccdc178c3
v11.1.2 release notes
2020-09-29 02:46:18 -07:00
James R. Barlow
4eacb3454f
hOCR: write text in correct order
...
Fixes #642
2020-09-29 02:45:11 -07:00
Jimit Dholakia
82b8b41e80
docs: Add 'unpaper' optional dependency for Ubuntu 18.04 (#639)
2020-09-25 11:54:31 -07:00
James R. Barlow
581c5020ab
v11.1.1 release notes
v11.1.1
2020-09-25 00:28:38 -07:00
James R. Barlow
3ef8872a1e
pngquant driver: refactor, use streams instead of temporary files
2020-09-25 00:18:02 -07:00
James R. Barlow
28eec73eed
Tighten unpaper-args validation to exclude . and ..
...
Just in case
2020-09-25 00:18:02 -07:00