James R. Barlow
6a4df78bc0
Add _naive_find_text to search for text when fitz is not available
2018-03-27 13:36:17 -07:00
James R. Barlow
6756016572
Add license notice to all files
...
Source files to GPL3
Exceptions:
-tests/spoof/* to MIT
-hocrtransform.py
-_unicodefun.py
Test resources to CC BY-SA 4.0 except when otherwise noted.
Add GPL license.
2018-03-24 02:33:24 -07:00
James R. Barlow
45c7bd9a60
lint: Remove shebangs from non-executable files
2018-02-24 12:38:58 -08:00
James R. Barlow
6ff6c8614f
--output-type=pdf now outputs /UserUnit PDFs at the correct size
...
This currently distorts the output size because Tesseract assumes it
knows the DPI better than we do.
Does not work for Ghostscript, because it emerges that Ghostscript
honors /UserUnit for rasterizing but not in pdfwrite (resolve/wontfix).
https://bugs.ghostscript.com/show_bug.cgi?id=690781
Ghostscript’s output would need to be patched in a PDF/A safe way for
this to work. A temporary route may be to block Ghostscript if
/UserUnit is present.
2017-05-24 23:26:07 -07:00
James R. Barlow
d9005a1074
pdfinfo: replace most remaining dict-style access
2017-05-19 16:17:36 -07:00
James R. Barlow
08e47117a3
Rename pageinfo to pdfinfo
2017-05-19 15:48:23 -07:00
James R. Barlow
8694f8d2eb
Replace magic strings colorspace and encoding with Enums
2017-05-18 22:32:27 -07:00
James R. Barlow
56d2aae963
Refactor from ImageInfo index to attribute accessing
2017-05-18 18:39:14 -07:00
James R. Barlow
caee5b1428
Access PageInfo instance variables instead of dictionary
2017-05-18 17:12:04 -07:00
James R. Barlow
cd04ae6949
Refactor PdfInfo(str(filename)) -> PdfInfo(filename)
2017-05-18 16:43:50 -07:00
James R. Barlow
6a0b68298f
Refactor pdf_get_all_pageinfo to PdfInfo
2017-05-18 16:31:18 -07:00
James R. Barlow
96045e98f4
Update develop with master changes
...
We’re well out of the “trivial updates” zone
2017-05-11 22:54:27 -07:00
James R. Barlow
aa859a4139
Fix #156 - NoneType has no ‘getObject’ for pages with no /Contents
2017-05-01 15:46:15 -07:00
James R. Barlow
89599b4812
Drop Python 3.4 compatibility
2017-03-29 15:46:53 -07:00
James R. Barlow
d1a0065ef8
Create test case for Form XObjects
2017-02-14 12:51:15 -08:00
James R. Barlow
b889a89c36
Fix remaining 3.4/3.5 regressions
2017-01-26 17:53:27 -08:00
James R. Barlow
02fba02d31
Refactor test suite to use fixtures to manage paths
2017-01-26 16:38:59 -08:00
James R. Barlow
fb9e7c82f6
Move duplicate test code into common namespace
2017-01-26 13:36:52 -08:00
James R. Barlow
1c8b763d53
test_pageinfo: Remove bits per component test
...
The behavior of this test will ultimately depend on what version of
img2pdf is installed, since after my patch it will be able to produce
1bpp images.
2016-11-07 14:35:54 -08:00
James R. Barlow
570bbe9a05
Add comments and remove debugging, improve inline handling
...
Squashed commits:
[bfff3c9] pageinfo, have a main()
2016-02-27 00:18:36 -08:00
James R. Barlow
5cc3adb39a
Add support for inline images
2016-02-27 00:18:36 -08:00
James R. Barlow
3957a0606c
Compute image pixel density without performing rectangle intersection (+5 squashed commits)
...
Squashed commits:
[0e27904] Partially implement DPI calculation with rotation of the image
Fixes test suite
[a64f662] pageinfo: all tests pass
[c5b811a] Fix typos
[cdd2286] Can now find inline images more efficiently
[60dde8d] First cut at implementing intelligent DPI detection based on content stream
Broke many of the test cases
2016-02-27 00:18:36 -08:00
James R. Barlow
0dc96442d8
Fix img2pdf usage in test case (to make Travis CI happy again)
2016-02-06 23:41:32 -08:00
James R. Barlow
354e61946e
Use os.makedirs for test output directories
...
Broke Travis
2016-01-16 02:47:56 -08:00
James R. Barlow
7c558b3713
Move pageinfo test into tests folder
2016-01-11 17:40:44 -08:00