138 Commits

Author SHA1 Message Date
Ashwin Mathur
1c7d1618d8
Add truncate and normalize parameters to TEI Embedders (#7460) 2024-04-03 16:41:30 +02:00
Vladimir Blagojevic
d83af92270
feat: Update searchapi format, default to Google, allow search engine selection (#7453)
* Update searchapi payload

* Add release note

* PR feedback - Stefano

* Adjust unit test for mandatory engine search_param field
2024-04-03 10:48:50 +02:00
Nicola Procopio
42c5b7af32
feat: added dimensions parameters to Azure OpenAI Embedders (#7449)
* added dimensions parameter to AzureOpenAIEmbedders

* created releasenote

* update release note

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-04-02 14:04:16 +02:00
Vladimir Blagojevic
ce8e114769
feat: DynamicChatPromptBuilder add templating to all user/system messages (#7423) 2024-03-27 15:34:50 +01:00
Silvano Cerza
685343d13f
feat: Add DocumentRecallEvaluator (#7399)
* Add DocumentRecallEvaluator

* Fix mypy error

* Simplify recall logic and change output for single hit mode

* Remove unused import

* Add comment for RecallMode fields

* Reword RecallMode comments

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-03-26 16:15:03 +01:00
Stefano Fiorucci
e26ee0f1db
refactor!: make TGI generators compatible with huggingface_hub>=0.22.0 (#7425)
* progress

* progress

* better lazy imports

* fixes

* reno
2024-03-26 16:10:06 +01:00
David S. Batista
fcd48d662c
test: HuggingFaceLocalGenerator test stopwords (#7416)
* initial import

* Update test/components/generators/test_hugging_face_local_generator.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* attending PR comments

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-03-26 12:39:02 +01:00
Silvano Cerza
f398b29e7f
feat: Change outputs of AnswerExactMatchEvaluator (#7390)
* Change outputs of AnswerExactMatchEvaluator

* Changes scores to return the number of matches per question

* Revert "Changes scores to return the number of matches per question"

This reverts commit e4358720793d4584b0b961402d4557c50c4c2381.

* Change output names
2024-03-26 10:57:59 +01:00
Stefano Fiorucci
6925e3a2e1
refactor!: Improve PyPDFToDocument (#7362)
* first draft

* rm kwargs from protocol

* Simplify

* no breaking changes

* reno

* one more test of the deprecated registry
2024-03-26 10:09:29 +01:00
Julian Risch
bfd0d3eacd
feat: Add new LLMEvaluator component (#7401)
* draft llm evaluator

* docstrings

* flexible inputs; validate inputs and outputs

* add tests

* add release note

* remove example

* docstrings

* make outputs parameter optional. default:

* validate init parameters

* linting

* remove mention of binary scores from template

* make examples and outputs params non-optional

* removed leftover from optional outputs param

* simplify building examples section for template

* validate inputs and outputs in examples are dict with str as key

* fix pylint too-many-boolean-expressions

* increase test coverage
2024-03-25 07:05:27 +01:00
Vladimir Blagojevic
e779d43384
feat: Add streaming to HuggingFaceLocalGenerator (#7377)
* Initial streaming impl

* Add unit tests

* Add release note
2024-03-21 15:49:18 +01:00
Silvano Cerza
610ad6f6b2
Add AnswerExactMatchEvaluator (#7381)
* Add AnswerExactMatchEvaluator

* Add release notes

* Fix linting

* Update docstrings

* Update docstrings

* Remove to_dict and from_dict

* Fix linting
2024-03-19 16:58:01 +01:00
Christopher Keibel
f69c3e5cd2
refactor: default for max_new_tokens to 512 in Hugging Face generators (#7370)
* set default for max_new_tokens to 512 in Hugging Face generators

* add release notes

* fix tests

* remove issues from release note

---------

Co-authored-by: christopherkeibel <christopher.keibel@karakun.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-03-19 08:47:53 +01:00
Mohit Lal
280719339c
bug: run parameter "ranking_mode" does not override init param in meta field ranker (#7375)
* bug: run parameter ranking_mode does not override init param in metafield ranker

* Added a release note

* Used pytest.approx for comparing floating point numbers in unit test
2024-03-19 07:53:26 +01:00
Sebastian Husch Lee
85c1e39fab
feat: Add Zero Shot Transformers Text Router (#7018)
* Starting to add TransformersTextRouter

* First pass at a TextRouter based off of the zero shot classification model on HuggingFace

* Fix pylint

* Remove unneeded imports

* Update documentation example

* Update error message strings

* Starting to add unit tests

* Release notes

* Fix pylint

* Add tests for to dict and from dict

* Update patches in tests to be correct with respect to changes

* Doc strings and fixes

* Adding more tests

* Change name

* Adding to init

* Use Haystack logger

* Beef up docstrings

* Make example runnable

* Rename to huggingface_pipeline_kwargs

* Fix example
2024-03-15 13:56:07 +01:00
Stefano Fiorucci
abda78c122
unpin OpenAI and fix problem with mock (#7364) 2024-03-15 08:32:28 +01:00
Vladimir Blagojevic
2aae8472e7
feat: Add trust_remote_code init param to SentenceTransformer embedders (#7356)
* Add trust_remote_code init param to SentenceTransformer embedders

* Add release note

* Go with no kwargs solution

* Update haystack/components/embedders/sentence_transformers_document_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Pydoc fix

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-03-14 11:14:04 +01:00
Yudhajit Sinha
41dbbdb3fc
feat: Add support for matching mime types using regex (#7303)
* feat: Add support for matching mime types using regex
---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-03-11 14:58:08 +01:00
Ashwin Mathur
38b3472bb2
feat: Add SentenceTransformersDiversityRanker (#7095)
* Add Diversity Ranker

* Update tests

* Add separate suffix, prefix params for query and documents; allow empty query

* Update docstrings

* Make changes based on review

* Add additional tests

* Add test for warm up

* Update release notes

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2024-03-11 13:14:59 +01:00
Ashwin Mathur
8d7a58347d
fix: HuggingFaceTEITextEmbedder returning embedding of incorrect shape when used with Docker endpoint (#7319)
* Fix HuggingFaceTEITextEmbedder

* Update haystack/components/embedders/hugging_face_tei_text_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Improve imports; Add additional tests

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-03-07 16:23:57 +01:00
Tobias Wochinger
23c65c250f
chore: migrate ExtractiveReader to use secret management (#7309)
* chore: migrate `ExtractiveReader` to use secret management

* docs: add release notes
2024-03-05 13:04:53 +01:00
Stefano Fiorucci
38a80b0235
fix: MetaFieldRanker - use weight if passed in the run method (#7305)
* fix: `MetaFieldRanker` - use `weight` if passed in the `run` method

* reno
2024-03-05 12:13:56 +01:00
Julian Risch
c1c0cbfde4
docs: Update docs of MetaFieldRanker, TransformersSimilarityRanker (#7301)
* docs: Update docstrings of MetaFieldRanker and TransformersSimilarityRanker

* add warm_up() call to usage example

* Apply suggestions from code review

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* show result of usage example

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-03-05 10:20:18 +01:00
Julian Risch
9a0e2e58fd
docs: Added LostInTheMiddleRanker usage example and updated docstrings (#7294)
* docs: Added LostInTheMiddleRanker usage example

* remove to_dict test

* explain LITM in more detail
2024-03-04 15:42:51 +01:00
Vladimir Blagojevic
0e7c41be5e
feat: Improve OpenAPIServiceToFunctions signature (#7257)
* Convert OpenAPIServiceToFunctions run interface
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-03-04 14:38:58 +01:00
Massimiliano Pippi
25a1a97be0
restore to_dict method (#7261) 2024-02-29 14:30:06 +01:00
Massimiliano Pippi
cf1e28431a
fix docstrings for the builder package (#7248)
* fix docstrings for the builder package

* remove dead test

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* review feedback

* pylint

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-02-28 18:22:29 +01:00
Tobias Wochinger
e5f0e248b6
docs: review docstrings in haystack.components.validators (#7238)
* chore: make private

* docs: review and normalize docstrings

* docs: fix format and unused import
2024-02-28 17:46:30 +01:00
Tobias Wochinger
f22d49944d
docs: review and normalize haystack.components.websearch (#7236)
* docs: review and normalize `haystack.components.websearch`

* fix: use correct type annotations

* refactor: use type from protocol

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Revert "refactor: use type from protocol"

This reverts commit 23d6f45cd763c39b98be1bff03639a90f2a01fac.

* docs: refactor according to comments

* build: correctly pin to 4.7

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-02-28 16:43:08 +01:00
Stefano Fiorucci
7b9704a93a
docs: review Routers docstrings (#7234)
* wip

* review routers

* small fixes

* Update haystack/components/routers/conditional_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/conditional_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/file_type_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/file_type_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/file_type_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/file_type_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/metadata_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/metadata_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/text_language_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/text_language_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/routers/text_language_router.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-02-28 11:26:22 +01:00
Tobias Wochinger
ac4f458e2b
docs: review and normalize haystack.components.fetchers (#7232)
* docs: review and normalize `haystack.components.fetchers`

* docs: drop defaults
2024-02-28 11:24:12 +01:00
Tobias Wochinger
419009b495
fix: move sensitive log to debug mode (#7230) 2024-02-28 09:45:50 +01:00
Vladimir Blagojevic
d871bbbfbd
feat: Add complex types in OpenAPI support (#7065)
* Add complex types OpenAPI support

* Add release note
---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-02-27 18:11:06 +01:00
Stefano Fiorucci
e194c08316
docs: review DocumentLanguageClassifier docstrings (#7210)
* review DocumentLanguageClassifier docstrings

* fix

* improve pydoc config
2024-02-27 16:02:53 +01:00
Silvano Cerza
0a7dfc1b32
Revert "Add AnswerExactMatchEvaluator (#7050)" (#7075)
This reverts commit b4011af8e9bc4ae2f72e51db254bfda69e20b651.
2024-02-23 14:05:57 +01:00
Silvano Cerza
b4011af8e9
Add AnswerExactMatchEvaluator (#7050)
* Add AnswerExactMatchEvaluator

* Add release notes

* Fix linting

* Update docstrings
2024-02-23 10:37:18 +01:00
Vladimir Blagojevic
49cad21a2e
chore: Adjust json_schema.py slightly (#7055)
* Slightly adjust json_schema.py

* Adjust test structures
2024-02-22 14:33:07 +01:00
Vladimir Blagojevic
cb6389d7a2
feat: Improve OpenAPI integration (#7034)
* Simplify and improve OpenAPIServiceConnector and OpenAPIServiceToFunctions, add unit tests

* Add reno note

* Add flask test dependency

* Initial PR feedback - Julian

* Remove indirection - Silvano

* Remove flask end-to-end tests

* Remove unused import

* Add mixed body unit test

* Update unit test, mock properly
2024-02-22 14:03:50 +01:00
Silvano Cerza
8ca4bf405b
Remove all evaluator components (#7053) 2024-02-21 18:24:14 +01:00
Varun Mathur
b335b5d723
feat: Add Lost In The Middle Ranker (#6995)
* add lost in the middle ranker

* update

* add release notes

* update release notes

* fix mypy

* Update

* fix mypy

* fix mypy [union-attr] for content.split

* remove e2e tests and negative topk param

* remove query param, validate params

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-02-20 19:55:41 +01:00
Ashwin Mathur
327c2d260d
feat: Add Mean Reciprocal Rank (MRR) metric to StatisticalEvaluator (#7042)
* Add MRR Metric

* Add release notes

* Update logic
2024-02-20 13:58:48 +01:00
Silvano Cerza
05af9c3439
test: Simplify OpenAPIServiceConnector run test (#7043)
* Simplify OpenAPIServiceConnector run test

* Fix linting
2024-02-20 11:54:51 +01:00
Silvano Cerza
9215882779
Add Recall Multi Hit and Single Hit metric (#7038) 2024-02-19 18:00:39 +01:00
Stefano Fiorucci
d00f171f8b
refactor!: Sentence Transformers Embedders - new devices mgmt (#7033)
* new device mgmt for Sentence Transformers embedders

* reno
2024-02-19 14:52:44 +01:00
Stefano Fiorucci
44b5ae291c
specify CPU device in warm_up test (#7014) 2024-02-16 13:01:57 +01:00
Stefano Fiorucci
0aa788facc
refactor!: LocalWhisperTranscriber - new devices mgmt (#7008)
* wip

* whisper local transcriber: use new device mgmt

* better from_dict + test

* reno
2024-02-16 11:25:53 +01:00
Silvano Cerza
a7209f6413
Mark OpenAPIServiceConnector integration test as flaky (#7007) 2024-02-15 19:33:34 +01:00
Tuana Çelik
e2cee468fc
fix: Adding api_base_url to OpenAITextEmbedder self assignments (#7004)
* assigning api_base_url

This fix resolves issues with the MistralTextEmbedder integration

* adding base url to `to_dict` and the tests

* adding release note

* Update fix-openai-base-url-assignment-0570a494d88fe365.yaml

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-02-15 17:35:28 +01:00
Silvano Cerza
6fe1d3b595
refactor: Clean eval components (#7005)
* Remove preprocess.py

* Rename eval components to evaluators
2024-02-15 17:17:59 +01:00
Silvano Cerza
2b8a606cb8
refactor: Refactor StatisticalEvaluator (#6999)
* Refactor StatisticalEvaluator

* Update StatisticalEvaluator

* Rename StatisticalMetric.from_string to from_str and change internal logic

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Fix tests

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-02-15 16:47:35 +01:00