102 Commits

Author SHA1 Message Date
PunkPangolin
ee3da07710
Add appstream metainfo file + screenshot (#1462)
* Add io.ocrmypdf.ocrmypdf.metainfo.xml

* Create sample_screenshot.png

* Better screenshot

* Add screenshot to metainfo

* Move into /misc/flatpak

* Add screenshot URL

* Add icon and categories to metainfo

* Use installed icon instead of remote

* Add keywords to metainfo, change summary to be closer to Flathub Guidelines
2025-05-27 00:42:47 -07:00
James R. Barlow
6f16d0130a
Clarify that ocrmypdf-compare is a testing tool 2025-04-15 00:03:14 -07:00
James R. Barlow
d84c47816c
webservice: promote pages to primary option 2025-04-06 01:07:47 -07:00
James R. Barlow
6de6749062
webservice: fix download button downloads wrong file 2025-02-26 18:42:50 -08:00
James R. Barlow
b5bc1d209c
Remove ttyd 2025-02-26 14:53:13 -08:00
James R. Barlow
f02353686d
s/input/output 2025-01-04 12:18:07 -08:00
James R. Barlow
073a434ab3
Fix webservice interactions with Docker 2025-01-04 12:09:32 -08:00
James R. Barlow
55e7177dbe
Present similar interface in webservice.py 2025-01-04 01:04:58 -08:00
James R. Barlow
36c82e0659
Add debugging helper scripts 2025-01-01 18:03:15 -08:00
James R. Barlow
dd6ed4c5f8
Switch to streamlit based web app 2025-01-01 17:26:22 -08:00
James R. Barlow
a1b8113d56
Add bisect script 2024-11-08 11:09:13 -08:00
James R. Barlow
d5ff7f7db9
batch: fix issues flagged by ruff 2024-05-21 01:52:57 -07:00
James R. Barlow
579cef3649
watcher: Ensure output files are .pdf 2024-05-21 01:51:30 -07:00
James R. Barlow
065bddbc6c
Reformat with ruff format 2024-04-07 00:25:32 -07:00
NilsRo
feeb9f213f
batch example: added archive, small corrections and optimizations (#1277)
* Added archive, small corrections

Added a function to archive originals and avoid calling ocrmypdf if they are already PDF/A.

* Added Copyright
2024-03-18 13:22:24 -07:00
James R. Barlow
8d30cff4ef
Undo future annotations from watcher.py till Typer fixes its issue
Fixes #1258
2024-02-20 19:14:39 -08:00
James R. Barlow
3a3635f7f9
Python 3.10 cleanup, manual fixes 2024-02-14 12:48:17 -08:00
James R. Barlow
f69267bb67
watcher: restore ability to read json from file or command line string 2023-11-07 18:05:29 -08:00
James R. Barlow
55566d9830
Fix watcher.py kwarg error 2023-11-05 13:58:24 -08:00
James R. Barlow
52d99732b1
Fix mistakes with watcher loglevel handling 2023-10-28 00:47:40 -07:00
James R. Barlow
c6be3ba076
watcher: Improve parameter validation 2023-10-20 20:11:00 -07:00
James R. Barlow
0565cb0b10
misc/watcher.py: use Typer and dotenv to improve ease of use 2023-10-20 19:56:39 -07:00
James R. Barlow
dc49906704
Improve wait_for_file_ready loop 2023-10-20 19:55:50 -07:00
James R. Barlow
0388c23ae7
Merge branch 'feature/jbig2thresh' into v15 2023-09-21 00:07:05 -07:00
James R. Barlow
be12f7a728
Make fish completion a bit smarter 2023-09-20 14:45:22 -07:00
James R. Barlow
e3c813fc67
Added support for changing color conversion strategy 2023-09-20 01:08:15 -07:00
James R. Barlow
330352aeed
Update completions for jbig2 threshold 2023-09-17 14:47:46 -07:00
Srikar Sundaram
4bee7355e9
Change skip-ocr to skip-text (#1146) 2023-09-14 17:22:34 -07:00
James R. Barlow
a6ce35b13a
Add argument to override digital signatures 2023-08-12 01:31:36 -07:00
James R. Barlow
e44a57aec0
Try a screencast/terminal demo 2023-06-20 00:48:42 -07:00
James R. Barlow
33b70be7d5
ruff: more fixes, mainly missing docstrings 2023-04-14 02:16:38 -07:00
James R. Barlow
4924b11b6b
Additional ruff fixes 2023-04-14 01:25:16 -07:00
James R. Barlow
9b8d14d16e
Accept most of ruff's delinting 2023-04-14 00:45:34 -07:00
comzine
2685f910b1
watcher: added setting RETRIES_LOADING_FILE to avoid giving up too early (#1063) 2023-01-25 17:36:54 -08:00
Doug Rinckes
d09f61d4fe
log completion message (#1044)
This logs the "done" message if neither delete nor archive options are set.
2022-12-14 17:24:41 -08:00
James R. Barlow
7da4e6ca7f
Address some linter warnings 2022-09-21 00:05:12 -07:00
James R. Barlow
4b9ea40a0c
spdx: move identifiers to files that support them
If the apparent license changed, take this commit as correct.
2022-08-04 03:26:54 -07:00
James R. Barlow
80ed2117cc
Change to SPDX license tracking 2022-07-28 01:10:07 -07:00
James R. Barlow
dc6f1a266a
Modernize type annotations 2022-07-23 00:39:24 -07:00
Julius Bullinger
7cabbb125f
watcher: Add an option to archive processed originals (#951)
* watcher: Add an option to archive processed originals

This adds a feature from existing OCRmyPDF watchdog Docker containers like meyay/ocrmypdf-batch and unze/ocrmypdf-watchdog. With this option, the input directory can be kept clean from already processed files, without losing the originals.

* docs: Improve watcher.py's Docker parameters documentation
2022-06-17 15:17:03 -07:00
James Barlow
776ada6713
Upgrade pre-commit and associated tools; various lints 2022-04-03 20:53:01 -07:00
James R. Barlow
0323738ada
ocrmypdf.fish: fix indents
[ci skip]
2021-12-06 15:38:27 -08:00
FPille
aae5591f7e
Update ocrmypdf.bash completion
Squashed commit of the following:

commit 974de2e8ccad7fd34694f2c3a7a17c64bb52cdab
Merge: a8d7f969 ee04aa72
Author: James R. Barlow <james@purplerock.ca>
Date:   Sat Dec 4 20:22:50 2021 -0800

    Merge branch 'update_bash-completion' of git://github.com/FPille/OCRmyPDF into FPille-update_bash-completion

commit ee04aa722504272891d8c74171f1de9bc954ca09
Author: FPille <f.pille@gmail.com>
Date:   Thu Oct 14 11:09:23 2021 +0200

    update

commit 76f64537aa5549278483ce338fe03764d0ce8065
Author: FPille <f.pille@gmail.com>
Date:   Thu Oct 14 11:04:10 2021 +0200

    updated; descriptions for arguments and choices added
    deprecated arguments removed
    bug fix: typo "_init_completion" instead of "_init_completions"

commit de9b93e852b3a6aca29b77ff7bdf433a07b42794
Merge: c23374de 42713b77
Author: Frank <50119297+FPille@users.noreply.github.com>
Date:   Thu Oct 14 08:08:11 2021 +0200

    Merge branch 'jbarlow83:master' into master

commit c23374de818edddb789073251386e5ee1cfaef84
Merge: 40b2ebcb c409fa58
Author: Frank <50119297+FPille@users.noreply.github.com>
Date:   Wed May 26 20:31:00 2021 +0200

    Merge branch 'jbarlow83:master' into master

commit 40b2ebcb37b6a21845e2733d4ad8078c09d08d0a
Merge: 79c84eef 7e388f59
Author: Frank <50119297+FPille@users.noreply.github.com>
Date:   Sat Jun 1 11:09:07 2019 +0200

    Merge pull request #1 from jbarlow83/master

    update master
2021-12-06 15:38:26 -08:00
James R. Barlow
f91faf9795
Add new argument --tesseract-thresholding to control tesseract thresholding where available
Also add missing test for --tesseract-oem
2021-12-06 15:38:14 -08:00
James R. Barlow
59642a98b2
Disable --remove-background so we can remove leptonica 2021-11-12 23:56:52 -08:00
James R. Barlow
30440104ba
Remove --threshold argument
Tesseract now includes better thresholding (binarization) in v5. Users that have
thresholding issues should try that first. If we find further problems
this can be brought back as a plugin.
2021-11-12 20:09:55 -08:00
James R. Barlow
77f7621bbc
batch.py: tidy 2021-10-15 15:03:40 -07:00
James R. Barlow
790d3022f6
Implement --output-type=none to skip producing the PDF and use only the sidecar
Closes #787
2021-09-26 01:07:34 -07:00
James R. Barlow
c725bf79da
flake8 delinting 2021-09-21 16:37:03 -07:00
James R. Barlow
4eca0a165b
pre-commit: pyupgrade modernizing 2021-08-26 18:04:38 -07:00