26 Commits

Author SHA1 Message Date
ZanSara
ff55985e2d
feat: support single metadata dictionary in HTMLToDocument (#6613)
* support single metadata in HTMLToDocument

* reno

* docstring
2023-12-21 16:45:31 +01:00
Vladimir Blagojevic
4d08be0c2a
feat: Update OpenAI Python Client in Haystack 2.x (#6584)
* Update openai python client

* Add release note

* Consolidate multiple mock_chat_completion into one

* Ensure all components have api_base_url, organization params

* Update tests

* Enable function calling

* Oversight

* Minor fixes, add streaming test mocks

* Apply suggestions from code review

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* metadata -> meta

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-12-21 16:21:24 +01:00
ZanSara
cf79aa1485
feat: add support for single meta dict in TextFileToDocument (#6606)
* add support for single meta dict

* reno

* reno

* mypy

* extract to function

* docstring

* mypy
2023-12-21 14:21:17 +01:00
Stefano Fiorucci
7cc6080dfa
chore: replace metadata w meta in tests/examples (#6612)
* replace metadata w meta in tests/examples

* do not touch already broken e2e tests

* Revert "do not touch already broken e2e tests"

This reverts commit 1f911920d98954b57daacfe8d8ed02fd77d136db.
2023-12-21 14:09:31 +01:00
sahusiddharth
3d17e6ff76
changed metadata to meta (#6605)
2023-12-21 12:39:58 +01:00
Ashwin Mathur
fc88ef7076
feat: Add HuggingFace TEI Embedders - HuggingFaceTEITextEmbedder and HuggingFaceTEIDocumentEmbedder (#6602)
* Add TEI Embedders

* Add release notes

* Update release notes with usage examples
2023-12-21 12:16:36 +01:00
ZanSara
5a0f0ce22f
feat: Multiplexer (#6592)
* move functions

* tests

* reno

* add component

* reno

* add tests

* mypy

* pylint

* logger

* module name
2023-12-20 11:03:22 +01:00
Silvano Cerza
f224f991be
Change DocumentWriter default policy from DuplicatePolicy.FAIL to DuplicatePolicy.NONE (#6596)
2023-12-19 17:46:16 +01:00
ZanSara
f877704839
chore: extract type serialization (#6586)
* move functions

* tests

* reno
2023-12-19 14:16:20 +01:00
Vladimir Blagojevic
2dd5a94b04
feat: Add RAG based OpenAPI service integration (#6555)
* Add OpenAPIServiceConnector and OpenAPIServiceToFunctions

* Add release note

* Add test deps

* Better docs on OpenAPI spec reqs, improve tests

* Silvano PR feedback
2023-12-19 13:27:41 +01:00
Stefano Fiorucci
94cfe5d9ae
feat!: HTMLToDocument - allow choosing the boilerpy3 extractor (#6582)
* allow extractor customizability

* release note

* typo
2023-12-19 10:52:12 +01:00
Sebastian Husch Lee
dcf37c5173
feat: Extractive QA answer deduplication (#6459)
* Add answer deduplication

* Fix test

* Handle None case

* Release notes

* Handle cases where documents or answer spans could be None

* Adding checks for Nones and satisfying mypy

* Add option to turn off deduplication

* Adding unit tests

* Refactored tests to use fixtures

* Added overlap_threshold to run

* Update test

* Fixes related to the merge

* Remove casting, use direct variable names

* Move out if statement and add new test for it

* Update if statement to match comment

* Update how if statements work
2023-12-18 19:27:04 +01:00
Sebastian Husch Lee
c294b8ac8c
feat: Add auto device checks and model_kwargs to TransformersSimilarityRanker (#6561)
* Add device checking and model_kwargs like we do in ExtractiveReader

* Add release notes

* Make a utility function for the device checking

* Better warning message and updated ExtractiveReader to use the util function

* Add unit tests for get_device

* Fix pylint
2023-12-18 15:13:42 +01:00
Sebastian Husch Lee
3e0e81b1e0
feat: Add meta_fields_to_embed to TransformersSimilarityRanker (#6564)
* Add initial implementation following SentenceTransformersDocumentEmbedder

* Add test for embedding metadata

* Add release notes

* Update name

* Fix tests and to dict

* Fix release notes
2023-12-18 11:28:16 +01:00
Massimiliano Pippi
0ac1bdc6a0
refactor!: uniform run api for LocalWhisperTranscriber (#6542)
* uniform run api for LocalWhisperTranscriber

* add relnote

* fix linter
2023-12-18 10:47:46 +01:00
Stefano Fiorucci
2f034d3c97
refactor!: Converters - standardize inputs (#6540)
* standardize converters inputs: first draft

* fix precommit

* fix precommit 2

* fix precommit 3

* add default for optional param

* rm leftover

* install boilerpy in linting workflow

* add boilerpy3 to the core dependencies

* add reno

* remove boilerpy3 installation from test workflow

* fix pylint: import order and unused import

* fix import order

* add release note

* better Tika docstring

* rm boilerpy from linting

* leftover

* md link brackets

* feat: Converters - allow passing `meta` in the `run` method (#6554)

* first impl for html

* progressing on other components

* fix test

* add tests - run with meta

* release note

* reintroduce patches wrongly deleted

* add patch in test

* fix tika test

* Update haystack/components/converters/azure.py

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

* Update releasenotes/notes/converters-standardize-inputs-ed2ba9c97b762974.yaml

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* simplify test

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-12-15 16:41:35 +01:00
Vladimir Blagojevic
c642695ec0
feat: Add FileTypeRouter markdown support (#6551)
* Add FileTypeRouter markdown support

* Add release note
2023-12-14 16:30:57 +01:00
Julian Risch
25a6eaae05
feat!: Rename ExtractiveReader's confidence_threshold to score_threshold (#6532)
* rename to score_threshold

* Update haystack/components/readers/extractive.py

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-12-12 15:12:28 +01:00
Silvano Cerza
18dbce25fc
refactor: Refactor answer dataclasses (#6523)
* Refactor answer dataclasses

* Add release notes

* Fix tests

* Fix end to end tests

* Enhance ExtractiveReader
2023-12-11 18:50:49 +01:00
bogdankostic
728383a149
fix: Make TransformersSimilarityRanker run with single document list (#6503)
* Make `TransformersSimilarityRanker` run with single document list

* Add release note

* Remove unused import in test
2023-12-08 16:18:46 +01:00
Bijay Gurung
c5342d1110
fix: Prevent invalid answer from being selected in ExtractiveReader (#6460)
* Fix invalid answer being selected issue on ExtractiveReader

* Rename variables to not shadow arguments
2023-12-06 09:49:02 +01:00
Stefano Fiorucci
4912f7cb58
refactor!: improve the deserialization logic for components that use a Document Store (#6466)
* improve deserialization

* rm ds decorator

* improve tests

* fix pylint

* rm decorator from module init

* rm decorator

* rm decorator from factory

* fix tests

* release note

* rm print
2023-12-04 15:17:28 +01:00
Massimiliano Pippi
a86807b834
move Cohere generator into dedicated integration (#6475)
2023-12-04 11:16:12 +01:00
Massimiliano Pippi
7c05f37a53
remove unit marker (#6450)
2023-11-29 19:24:25 +01:00
Silvano Cerza
e6637f5ec2
Fix all tests
2023-11-24 14:48:43 +01:00
Massimiliano Pippi
8adb8bbab8
Remove preview folder in test/
---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-11-24 11:52:55 +01:00