James R. Barlow
7cf83c77ca
Merge branch 'feature/pdfa3'
2018-05-03 16:45:57 -07:00
James R. Barlow
8a9f174f63
Fix XMP validation issue with /CreationDate
...
Related to previous validation issue. If the /CreationDate had no
timezone, Ghostscript also creates invalid metadata. Work around this.
Also fix up PDF date decoding, and transcode dates to standardize them.
2018-05-03 16:30:20 -07:00
James R. Barlow
76276f61e5
Split out rotation related tests
2018-05-01 23:51:35 -07:00
James R. Barlow
bfd26e6ec6
Tests: confirm OCR layer copied
2018-05-01 23:16:41 -07:00
James R. Barlow
b5d7e9cbb0
Fix all issues with rotations
...
All tests now pass
2018-05-01 22:50:20 -07:00
James R. Barlow
a9abe13185
Remove the old tesseract pdf_renderer
2018-05-01 17:31:34 -07:00
James R. Barlow
6b315e8315
Add ability to disable cache
2018-05-01 15:52:00 -07:00
James R. Barlow
2131ad4670
Fix --remove-background error on PDFs with colormapped images
...
It's unclear how exactly a colormapped image gets to this spot given the
tendency of other image processing tools to flatten such images, but someone
made it happen, so now we make sure the image is okay.
Closes #262
2018-04-27 17:21:01 -07:00
James R. Barlow
219fe2155b
test_pageinfo: remove duplicate import
2018-04-27 17:16:42 -07:00
James R. Barlow
0934905493
Don't suppress error message from config_notfound
...
Since it showed up in s390x bionic
2018-04-25 21:58:18 -07:00
James R. Barlow
df87e21c85
Add support for PDF/A-3
...
No ability to attach files however
2018-04-20 00:06:55 -07:00
Hugo
d761d80750
Use more standard __version__ rather than PILLOW_VERSION (#257)
2018-04-19 23:35:32 -07:00
James R. Barlow
0b10db91be
Fix regression: Disable Ghostscript JPEG passthrough entirely
2018-04-17 17:00:24 -07:00
James R. Barlow
1a516b2af9
Fix regression: time stamp test suite failures
2018-04-17 16:59:21 -07:00
James R. Barlow
7368399f8b
Clarify license of two test files - https://github.com/jbarlow83/OCRmyPDF/issues/254
2018-04-17 11:56:36 -07:00
James R. Barlow
34c78a892a
Fix list table for tests/resources
...
[ci skip]
2018-04-15 23:52:19 -07:00
James R. Barlow
10aa59f674
v6.1.4 fix test suite regression with Ghostscript 9.23
2018-04-12 15:16:54 -07:00
James R. Barlow
ba0535e3fb
Update test cache to account for unpaper --layout none change
2018-04-12 00:48:21 -07:00
James R. Barlow
49fa7f6b5c
tesseract_cache: don't reveal host system file paths in manifest file
2018-04-12 00:47:28 -07:00
James R. Barlow
7a1cd39b21
Fix creation date metadata lost from input
...
Closes #247
2018-04-02 17:53:39 -07:00
James R. Barlow
4f6bffb477
Update copyrights
2018-03-31 11:54:38 -07:00
James R. Barlow
8d9be43c60
test_bookmarks_preserved won't raise ImportError any more
...
Due to trapping this in ocrmypdf.lib
2018-03-28 23:22:55 -07:00
James R. Barlow
40ef4f0bbe
Add new argument --skip-repair to skip the repair step
2018-03-28 00:54:58 -07:00
James R. Barlow
5becfcf8ea
Refactor fitz ImportError trap
2018-03-27 21:38:02 -07:00
James R. Barlow
a9bd494cc0
Merge branch 'optional-fitz'
2018-03-27 13:36:33 -07:00
James R. Barlow
6a4df78bc0
Add _naive_find_text to search for text when fitz is not available
2018-03-27 13:36:17 -07:00
James R. Barlow
530eae3898
Fix test_main missing file_claims_pdfa
2018-03-26 15:33:53 -07:00
James R. Barlow
3e444f6a90
Make fitz optional
2018-03-26 13:22:09 -07:00
James R. Barlow
45dbff6401
Fix table of contents not preserved in PDF/A
2018-03-26 02:23:19 -07:00
James R. Barlow
bc56b8e058
Move metadata tests to new test_metadata
2018-03-26 01:49:25 -07:00
James R. Barlow
746969207a
Remove deprecated --pdf-renderer tess4, which was renamed to sandwich
...
Should have been cut in v6.0.0
2018-03-26 01:17:22 -07:00
James R. Barlow
230d301268
conftest: py3.5 path issue
2018-03-25 00:52:45 -07:00
James R. Barlow
a2d00f5f1d
tess cache: fix tess3 error for -psm instead of --psm
2018-03-25 00:43:02 -07:00
James R. Barlow
8c1c61f207
test cache: fix Path + str error
2018-03-25 00:02:03 -07:00
James R. Barlow
77476965ae
test cache: use .bin extension, fix .gitignore .gitattributes
2018-03-24 23:54:16 -07:00
James R. Barlow
ca51514046
Add test cache
2018-03-24 23:50:41 -07:00
James R. Barlow
8975b72a01
Fix test_testonly_pdf generating an output file in pwd
2018-03-24 22:34:35 -07:00
James R. Barlow
874ec6a87f
Add missing fixture to test_unpaper
2018-03-24 22:24:14 -07:00
James R. Barlow
909eaeeead
spoof: Allow tesseract cache to share cache
...
Previous incarnation was only suitable for generating a local cache
where the suite was executed repeatedly. Now the cache ignores
differences, so it can be checked into GitHub and shared.
2018-03-24 22:17:36 -07:00
James R. Barlow
c138161fae
Tests: more cleanup
2018-03-24 15:35:57 -07:00
James R. Barlow
e48590d66c
Refactor out unpaper-specific tests
2018-03-24 15:21:44 -07:00
James R. Barlow
5b1c8541fc
Review some skipped tests to make sure reasons still valid
2018-03-24 15:13:23 -07:00
James R. Barlow
e5e011021b
Remove the OCRMYPDF_program environment variables
...
Really, this was just replicating the functionality of the PATH
environment variable, and users probably do that anyway.
2018-03-24 15:09:08 -07:00
James R. Barlow
11d74dea09
Remove the OCRMYPDF_program environment variables
...
Really, this was just replicating the functionality of the PATH
environment variable, and users probably do that anyway.
2018-03-24 15:07:02 -07:00
James R. Barlow
6756016572
Add license notice to all files
...
Source files to GPL3
Exceptions:
-tests/spoof/* to MIT
-hocrtransform.py
-_unicodefun.py
Test resources to CC BY-SA 4.0 except when otherwise noted.
Add GPL license.
2018-03-24 02:33:24 -07:00
James R. Barlow
d700154e0e
Fix regressions after --skip-text improvements
2018-03-24 02:24:45 -07:00
James R. Barlow
8159cc6b88
Skip one test that fails for qpdf 8.0.[0,1], due to qpdf regression
2018-03-09 07:57:22 -08:00
James R. Barlow
4046766ca5
Fix Python 3.5 test suite failure on symlinks
...
Did not account for API difference in pathlib
2018-03-02 16:57:46 -08:00
James R. Barlow
74ca736333
Issue #223: improve text of encrypted PDF error message
2018-02-27 15:08:22 -08:00
James R. Barlow
8ab8132411
lint: unused variables, wildcard imports
2018-02-24 12:48:52 -08:00