James R. Barlow
378f543619
TextPositionTracker: set boxes_flow=None
...
We don't care about the order of lines in our analysis, and this is an
expensive calculation in pdfminer.
2020-06-30 04:20:58 -07:00
James R. Barlow
62924ee280
Improve API documentation
2020-06-30 04:20:14 -07:00
James R. Barlow
86a73191b0
Plugin manager: accept Path(plugin)
2020-06-30 04:17:30 -07:00
James R. Barlow
86875997b8
Fix more mypy errors
2020-06-29 02:17:14 -07:00
James R. Barlow
b939584c7a
quality: fixing typing issues
2020-06-29 01:45:45 -07:00
James R. Barlow
bbd174071d
readme: markdown cleanup
2020-06-29 01:45:27 -07:00
James R. Barlow
e5b6fe1317
pyproject.toml: weird line wrapping?
2020-06-29 01:45:12 -07:00
James R. Barlow
f15d9049eb
install: add Mageia
...
Closes #586. Thanks to @yannick56
2020-06-26 23:28:26 -07:00
James R. Barlow
7630c93e5b
install: drop Ubuntu 14.04 steps
...
Bit rot must have set in.
2020-06-26 23:27:42 -07:00
James R. Barlow
638d68aa8a
docs: move Windows ahead of FreeBSD
2020-06-26 22:49:34 -07:00
James R. Barlow
a92dde058a
docs: promote one liner installs, reorg Windows
2020-06-26 22:47:44 -07:00
James R. Barlow
580f2ebb4b
Python 3.9beta is now known to work (Fedora)
2020-06-26 00:06:58 -07:00
James R. Barlow
01cae7a584
docs: Update Fedora versions
2020-06-23 02:08:24 -07:00
James R. Barlow
66337813e6
Spell runslow correctly
v10.2.0
2020-06-22 23:32:09 -07:00
James R. Barlow
eb5a211e72
New hocrtransform test isn't platform stable - mark runslow
2020-06-22 16:59:59 -07:00
James R. Barlow
5142933120
v10.2.0 release notes
...
Closes #582, #584, #545
2020-06-22 16:37:51 -07:00
James R. Barlow
06ab114aa8
Update test cache
2020-06-22 16:31:34 -07:00
James R. Barlow
1257419465
test_hocrtransform: this test is worth not caching
2020-06-22 16:31:06 -07:00
James R. Barlow
30404f53f0
Add test to sanity check our pdf renderers
2020-06-22 16:18:38 -07:00
James R. Barlow
1ce8edbdfe
hocrtransform: some text not included in output after Tesseract changes
2020-06-22 15:48:23 -07:00
James R. Barlow
d4b704a0ae
hocrtransform: refactor colors
2020-06-22 15:22:48 -07:00
James R. Barlow
2d64e1536d
hocrtransform: refactor xpath manipulations
2020-06-22 14:44:34 -07:00
James R. Barlow
c8b581ac31
hocrtransform: remove deprecated element.getchildren()
...
Breaks Python 3.9.
2020-06-22 14:28:18 -07:00
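For context on the getchildren() removal above: a minimal sketch of the usual replacement, assuming the standard-library xml.etree.ElementTree API. The helper name child_elements and the sample markup are hypothetical and not taken from hocrtransform itself. Element.getchildren() was removed from ElementTree in Python 3.9; iterating the element, or calling list() on it, yields the same direct children.

```python
import xml.etree.ElementTree as ET


def child_elements(element):
    """Return the direct children of an element as a list.

    Hypothetical helper: replaces element.getchildren(), which no
    longer exists on ElementTree elements in Python 3.9.
    """
    return list(element)


# Illustrative markup only, not real hOCR output.
root = ET.fromstring("<p><span>a</span><span>b</span></p>")
tags = [child.tag for child in child_elements(root)]
```

Iterating the element directly (for child in element) also works and avoids building a list when only one pass is needed.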
James R. Barlow
ad8dead7df
Document that API accepts streams now
2020-06-22 14:27:27 -07:00
James R. Barlow
c9bd87254e
A few minor typing issues
2020-06-22 02:31:53 -07:00
James R. Barlow
f4cb424451
Support input/output streams at API level
2020-06-22 02:02:18 -07:00
James R. Barlow
fef14778d5
Fix missing f-string in log message
2020-06-22 01:17:16 -07:00
James R. Barlow
86ec63f215
Decouple plugin manager forking from PdfContext/PageContext
2020-06-22 01:16:59 -07:00
James R. Barlow
5b10ec9d39
jobcontext.PdfContext: remove dead code, add annotations
2020-06-22 00:34:58 -07:00
James R. Barlow
800c75c4e5
Bump requirements (mainly for Docker's benefit)
2020-06-21 01:58:53 -07:00
James R. Barlow
24d64b04c3
Update Docker to Ubuntu 20.04 and jbig2-latest
2020-06-21 01:48:31 -07:00
James R. Barlow
48e2750551
Fix some tests that were failing in Docker
2020-06-21 01:48:13 -07:00
James R. Barlow
e182c5f63e
Update and sync .dockerignore, .gitignore
...
Also blacklist .* and whitelist the ones we want.
2020-06-21 01:25:59 -07:00
James R. Barlow
06d52326db
Fix deleted path in .coveragerc
2020-06-21 01:24:23 -07:00
James R. Barlow
ebfe4f0d29
Fix issue #582 - PDF/A acquires title "Untitled" after conversion
2020-06-20 02:01:16 -07:00
James R. Barlow
ad22977c84
v10.1.1 release notes
2020-06-17 14:45:32 -07:00
James R. Barlow
6ac50646f0
Fix OMP_THREAD_LIMIT rounded down to 0 in some cases
2020-06-17 14:43:19 -07:00
James R. Barlow
24b6a4ad50
v10.1.0 notes
v10.1.0
2020-06-16 00:55:28 -07:00
James R. Barlow
e802896d4d
unpaper: use PNG input where possible
...
Unpaper accepts PNG as input now, so avoid generating a huge
temporary PPM file if we can. If we must create a PNG, compress it
lightly to keep our temp usage down.
2020-06-16 00:50:18 -07:00
James R. Barlow
0b5a20e593
coverage: ignore type checking
2020-06-15 15:55:39 -07:00
James R. Barlow
642998ead6
sync: refactor preprocess image filtering
2020-06-15 15:26:41 -07:00
James R. Barlow
698aab4f75
Add a lot of type annotations
2020-06-15 15:20:50 -07:00
James R. Barlow
34231ac667
sync: refactor intermediate image production
2020-06-15 15:02:28 -07:00
James R. Barlow
ddedf7cd2e
For --clean-final, use same image as --clean if possible
2020-06-15 13:48:49 -07:00
James R. Barlow
9d127d354c
docs: improve description of plugins
2020-06-15 12:51:49 -07:00
James R. Barlow
2d2a4894ab
Some corrections to release notes
2020-06-15 12:51:28 -07:00
James R. Barlow
862861e3ca
Fix error message in logging from repeated filtering
...
If logging somehow triggered PageNumberFilter multiple times, it would fail on the second occurrence.
2020-06-13 14:50:58 -07:00
James R. Barlow
892db88f0e
test_two_languages: use narrower test
v10.0.1
2020-06-12 14:33:02 -07:00
James R. Barlow
eeb44f78cc
Fix tests that failed on other platforms from previous fix
2020-06-12 12:59:46 -07:00
James R. Barlow
863835f660
v10.0.1 release notes
2020-06-12 12:11:21 -07:00