David S. Batista
da60156174
chore: removing unused imports from tests (#9446)
2025-05-26 16:22:51 +00:00
Stefano Fiorucci
9ae7da8df3
test: workflow for slow/unstable integration tests (#9267)
* workflow for slow integration tests
* try changing skipper
* Trigger Build
* better names
* fix
* mv tika to slow
* try skipping slow workflow
* retry paths-ignore
* remove skipper
* Revert "remove skipper"
This reverts commit 302ed2f07f36b33fa61fde0843b5590d79b98d74.
* better skipper
* retry
* Revert "retry"
This reverts commit fe5dff68f496645cc45292d74fcd8d043e868392.
* try using one workflow
* trigger
* try to see if it fails
* cosmetic changes
* improvements
* try matrix
* retry
* fix
* clean up
* simplify datadog monitoring and trigger
* send event to datadog for nightly failures
* tests should run if: manual trigger, scheduled, PR has label, release branch, or relevant files changed
* clarify slow marker
* improve comments
* labels
2025-04-23 10:36:44 +02:00
Amna Mubashar
9302d3d9f0
feat: Add store_full_path to converters (2/3) (#8573)
2024-11-25 15:22:19 +05:00
Corentin Meyer
1c53aae8f0
fix: Tika converter not yielding page break tags (\f) (#8082)
* Fix TikaConverter not having \f page tag by using HTML mode of parsing and then parsing the HTML to text using the old Haystack 1.X integration as template.
* Add Reno
* Fix test by making Mock Tika return XML (before parsing)
* refinements and test
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-07-26 20:13:47 +02:00
Massimiliano Pippi
10c675d534
chore: add license header to all modules (#7675)
* add license header to modules
* check license header at linting time
2024-05-09 13:40:36 +00:00
ZanSara
974d65f30a
feat: support single metadata dictionary in TikaDocumentConverter (#6698)
* reno
* converter
* test
* comment
2024-01-09 09:49:47 +01:00
sahusiddharth
3d17e6ff76
changed metadata to meta (#6605)
2023-12-21 12:39:58 +01:00
Stefano Fiorucci
2f034d3c97
refactor!: Converters - standardize inputs (#6540)
* standardize converters inputs: first draft
* fix precommit
* fix precommit 2
* fix precommit 3
* add default for optional param
* rm leftover
* install boilerpy in linting workflow
* add boilerpy3 to the core dependencies
* add reno
* remove boilerpy3 installation from test workflow
* fix pylint: import order and unused import
* fix import order
* add release note
* better Tika docstring
* rm boilerpy from linting
* leftover
* md link brackets
* feat: Converters - allow passing `meta` in the `run` method (#6554)
* first impl for html
* progressing on other components
* fix test
* add tests - run with meta
* release note
* reintroduce patches wrongly deleted
* add patch in test
* fix tika test
* Update haystack/components/converters/azure.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update releasenotes/notes/converters-standardize-inputs-ed2ba9c97b762974.yaml
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* simplify test
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-12-15 16:41:35 +01:00
Massimiliano Pippi
7c05f37a53
remove unit marker (#6450)
2023-11-29 19:24:25 +01:00
Silvano Cerza
e6637f5ec2
Fix all tests
2023-11-24 14:48:43 +01:00
Massimiliano Pippi
8adb8bbab8
Remove preview folder in test/
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-11-24 11:52:55 +01:00