Madeesh Kannan
|
8faa3fa465
|
Revert "fix: make PyPDF backward compatible (#7996)" (#8014)
This reverts commit 58b48e36eb56a896365133ab4a9d8e327989948c.
|
2024-07-11 13:06:08 +00:00 |
|
Tobias Wochinger
|
58b48e36eb
|
fix: make PyPDF backward compatible (#7996)
* fix: make PyPDF backward compatible
* Add release note
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
|
2024-07-09 10:08:37 +02:00 |
|
Stefano Fiorucci
|
c51f8ffb86
|
PyPDFToDocument: remove deprecated converter_name and CONVERTERS_REGISTRY (#7910)
|
2024-06-21 16:52:03 +02:00 |
|
Massimiliano Pippi
|
10c675d534
|
chore: add license header to all modules (#7675)
* add license header to modules
* check license header at linting time
|
2024-05-09 13:40:36 +00:00 |
|
Stefano Fiorucci
|
6925e3a2e1
|
refactor!: Improve PyPDFToDocument (#7362)
* first draft
* rm kwargs from protocol
* Simplify
* no breaking changes
* reno
* one more test of the deprecated registry
|
2024-03-26 10:09:29 +01:00 |
|
Sebastian Husch Lee
|
c0b67432e4
|
feat: Add page breaks to default PDF to Document converter (#6755)
* Speedup tests for PyPDFToDocument
* Added unit test and removed skipping of empty pages
* add release note
* Add back some integration marks
|
2024-01-18 08:54:59 +01:00 |
|
ZanSara
|
c0f1dab454
|
feat: support single metadata dictionary in PyPDFToDocument (#6615)
* support single metadata dict in pypdf2document
* improve tests
* tests
* remove line
|
2023-12-22 14:13:11 +01:00 |
|
sahusiddharth
|
3d17e6ff76
|
changed metadata to meta (#6605)
|
2023-12-21 12:39:58 +01:00 |
|
Stefano Fiorucci
|
2f034d3c97
|
refactor!: Converters - standardize inputs (#6540)
* standardize converters inputs: first draft
* fix precommit
* fix precommit 2
* fix precommit 3
* add default for optional param
* rm leftover
* install boilerpy in linting workflow
* add boilerpy3 to the core dependencies
* add reno
* remove boilerpy3 installation from test workflow
* fix pylint: import order and unused import
* fix import order
* add release note
* better Tika docstring
* rm boilerpy from linting
* leftover
* md link brackets
* feat: Converters - allow passing `meta` in the `run` method (#6554)
* first impl for html
* progressing on other components
* fix test
* add tests - run with meta
* release note
* reintroduce patches wrongly deleted
* add patch in test
* fix tika test
* Update haystack/components/converters/azure.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update releasenotes/notes/converters-standardize-inputs-ed2ba9c97b762974.yaml
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* simplify test
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
|
2023-12-15 16:41:35 +01:00 |
|
Massimiliano Pippi
|
7c05f37a53
|
remove unit marker (#6450)
|
2023-11-29 19:24:25 +01:00 |
|
Silvano Cerza
|
e6637f5ec2
|
Fix all tests
|
2023-11-24 14:48:43 +01:00 |
|
Massimiliano Pippi
|
8adb8bbab8
|
Remove preview folder in test/
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
|
2023-11-24 11:52:55 +01:00 |
|