623 Commits

Author SHA1 Message Date
Madeesh Kannan
b1760add56
feat: Add support for pipeline deserialization callbacks (#7518)
* feat: Add support for deserialization callbacks

* Lint

* Fix type hint for older Python versions

* Apply suggestions from code review

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Lint

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-04-10 17:47:14 +02:00
Madeesh Kannan
fd84cd5f9a
feat: Add support for returning intermediate outputs of pipeline components (#7504)
* feat: Add support for returning intermediate outputs of pipeline components

The `pipeline.run` method has been extended to accept a set of component
names whose inputs are returned in addition to the outputs of leaf components.

* Add reno

* Lint

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-04-10 17:16:00 +02:00
David S. Batista
9a9c8aa1c8
feat: implementing evalualtion results API (#7520)
* initial import

* adding tests

* attending PR comments

* fixing tests

* updating tests

* updating tests and code

* renaming

* fixing linting issues

* adding release notes

* adding docstrings

* latest fixes
2024-04-10 13:34:03 +00:00
Stefano Fiorucci
843376bb1b
forward declaration of AnalyzeResult (#7523) 2024-04-10 09:02:08 +02:00
Vladimir Blagojevic
988c360b6d
feat: Azure converter updates (#7409)
* Initial commit

* Remove old mock tests

* Fix current_last_page_number calculation

* Carry over unit tests from the other side

* Update pydocs, skip failing tests

* Fix pylint and mypy

* Minor adjustments

* Add release note

* Minor touch ups

* Resolve Document unique id issue by using custom id calculation

* Better hashing, add unit tests

* Small fixes
2024-04-09 09:45:06 +02:00
Stefano Fiorucci
eff53a9131
feat: HuggingFaceAPIDocumentEmbedder (#7485)
* add HuggingFaceAPITextEmbedder

* add HuggingFaceAPITextEmbedder

* rm unneeded else

* wip

* small fixes

* deprecation; reno

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* make params mandatory

* changes requested

* fix test

* fix test

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-08 15:06:26 +02:00
Stefano Fiorucci
c91bd49cae
feat: HuggingFaceAPITextEmbedder (#7484)
* add HuggingFaceAPITextEmbedder

* add HuggingFaceAPITextEmbedder

* rm unneeded else

* small fixes

* changes requested

* fix test
2024-04-08 14:22:54 +02:00
Stefano Fiorucci
0dbb98c0a0
feat: HuggingFaceAPIChatGenerator (#7480)
* draft

* docstrings and more tests

* deprecation; reno

* pydoc config

* better error messages

* wip

* add test

* better docstrings

* deprecation; reno

* pylint

* typo

* rm unneeded else

* rm unneeded else

* fixes from feedback

* docstring showing the enum

* improve docstring

* make params mandatory

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* document enum

* Update haystack/utils/hf.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* mandatory params

* fix test

* fix test

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-05 18:48:34 +02:00
Stefano Fiorucci
1d083861ff
feat: HuggingFaceAPIGenerator (#7464)
* draft

* docstrings and more tests

* deprecation; reno

* pydoc config

* better error messages

* rm unneeded else

* make params mandatory

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* document enum

* Update haystack/utils/hf.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* fix test

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-05 18:48:13 +02:00
Vladimir Blagojevic
c3b96392fd
feat: Use all HTMLToDocument extractors until content is extracted (#7452)
* Use all HTMLToDocument extractors until content is extracted

* Add release note

* Minor doc update

* Improvements, unit test fixes

* Add try_others init param, update tests

* Update haystack/components/converters/html.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* PR feedback - Stefano

* Improve reno release note, add  reference

* little fixes

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-04-05 16:02:34 +02:00
Julian Risch
9d02dc607a
feat: Add FaithfulnessEvaluator component (#7424)
* draft FaithfulnessEvaluator

* reno

* calculate score per statement and aggregate

* Update release note

* update default values in tests and fix import path

* remove instructions, inputs, outputs params

* remove unused imports

* add expected format example to docstring

* remove name 'llm' from tests and docstring
2024-04-04 16:33:59 +00:00
Silvano Cerza
8b8a93bc0d
refactor: Rename DocumentMeanAveragePrecision and DocumentMeanReciprocalRank (#7470)
* Rename DocumentMeanAveragePrecision and DocumentMeanReciprocalRank

* Update releasenotes

* Simplify names
2024-04-04 17:04:59 +02:00
Silvano Cerza
bdc25ca2a0
feat: Add DocumentMeanReciprocalRank (#7468)
* Add DocumentMeanReciprocalRank

* Fix float precision error
2024-04-04 14:55:37 +02:00
Silvano Cerza
7799909069
feat: Add DocumentMeanAveragePrecision (#7461)
* Add DocumentMeanAveragePrecision

* Remove questions input

* Update docstrings

* Update haystack/components/evaluators/document_map.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-04 14:15:45 +02:00
Silvano Cerza
12acb3f12e
feat: Add SASEvaluator (#7428)
* Add SASEvaluator

* Add release notes

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Simplify similarity calculation with bi-encoders models

* Fix linting

* Update docstrings

* Move tensor to CPU after calculating cosine similarity

* Fix CI failing

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-04 10:10:41 +02:00
Ashwin Mathur
1c7d1618d8
Add truncate and normalize parameters to TEI Embedders (#7460) 2024-04-03 16:41:30 +02:00
Vladimir Blagojevic
d83af92270
feat: Update searchapi format, default to Google, allow search engine selection (#7453)
* Update searchapi payload

* Add release note

* PR feedback - Stefano

* Adjust unit test for mandatory engine search_param field
2024-04-03 10:48:50 +02:00
Nicola Procopio
42c5b7af32
feat: added dimensions parameters to Azure OpenAI Embedders (#7449)
* added dimensions parameter to AzureOpenAIEmbedders

* created releasenote

* update release note

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-04-02 14:04:16 +02:00
Massimiliano Pippi
47f38c340e
Ensure test_comparison_in checks an actual subset of documents (#7427)
* fix test_comparison_in

* relnotes
2024-03-27 17:31:05 +01:00
Vladimir Blagojevic
ce8e114769
feat: DynamicChatPromptBuilder add templating to all user/system messages (#7423) 2024-03-27 15:34:50 +01:00
Silvano Cerza
58d91b64dc
Fix: Fix Pipeline.run() running components with only defaults in the wrong order (#7426)
* Fix Pipeline.run() running components with only defaults in the wrong order

* Add release notes
2024-03-26 16:55:31 +01:00
Silvano Cerza
685343d13f
feat: Add DocumentRecallEvaluator (#7399)
* Add DocumentRecallEvaluator

* Fix mypy error

* Simplify recall logic and change output for single hit mode

* Remove unused import

* Add comment for RecallMode fields

* Reword RecallMode comments

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-03-26 16:15:03 +01:00
Stefano Fiorucci
e26ee0f1db
refactor!: make TGI generators compatible with huggingface_hub>=0.22.0 (#7425)
* progress

* progress

* better lazy imports

* fixes

* reno
2024-03-26 16:10:06 +01:00
Stefano Fiorucci
6925e3a2e1
refactor!: Improve PyPDFToDocument (#7362)
* first draft

* rm kwargs from protocol

* Simplify

* no breaking changes

* reno

* one more test of the deprecated registry
2024-03-26 10:09:29 +01:00
Julian Risch
bfd0d3eacd
feat: Add new LLMEvaluator component (#7401)
* draft llm evaluator

* docstrings

* flexible inputs; validate inputs and outputs

* add tests

* add release note

* remove example

* docstrings

* make outputs parameter optional. default:

* validate init parameters

* linting

* remove mention of binary scores from template

* make examples and outputs params non-optional

* removed leftover from optional outputs param

* simplify building examples section for template

* validate inputs and outputs in examples are dict with str as key

* fix pylint too-many-boolean-expressions

* increase test coverage
2024-03-25 07:05:27 +01:00
Vladimir Blagojevic
e779d43384
feat: Add streaming to HuggingFaceLocalGenerator (#7377)
* Inital streaming impl

* Add unit tests

* Add release note
2024-03-21 15:49:18 +01:00
Stefano Fiorucci
6e69d4f188
fix: Pipeline - disable autoshow on Jupyter (#7397)
* try

* fix docstring

* simplify tests

* add release note
2024-03-21 12:55:06 +01:00
Stefano Fiorucci
b0a9508116
fix: add the @component decorator to HuggingFaceTGIChatGenerator (#7396)
* add component decorator

* reno
2024-03-21 09:28:21 +01:00
Stefano Fiorucci
dbfd351da7
feat: introduce SparseEmbedding (#7382)
* introduce SparseEmbedding

* reno

* add to pydoc config
2024-03-19 18:04:16 +01:00
Silvano Cerza
610ad6f6b2
Add AnswerExactMatchEvaluator (#7381)
* Add AnswerExactMatchEvaluator

* Add release notes

* Fix linting

* Update docstrings

* Update docstrings

* Remove to_dict and from_dict

* Fix linting
2024-03-19 16:58:01 +01:00
Christopher Keibel
f69c3e5cd2
refactor: default for max_new_tokens to 512 in Hugging Face generators (#7370)
* set default for max_new_tokens to 512 in Hugging Face generators

* add release notes

* fix tests

* remove issues from release note

---------

Co-authored-by: christopherkeibel <christopher.keibel@karakun.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-03-19 08:47:53 +01:00
Mohit Lal
280719339c
bug: run parameter "ranking_mode" does not override init param in meta field ranker (#7375)
* bug: run parameter ranking_mode does not override init param in metafield ranker

* Added a release note

* Used pytest.approx for comparing floating point numbers in unit test
2024-03-19 07:53:26 +01:00
Sebastian Husch Lee
85c1e39fab
feat: Add Zero Shot Transformers Text Router (#7018)
* Starting to add TransformersTextRouter

* First pass at a TextRouter based off of the zero shot classification model on HuggingFace

* Fix pylint

* Remove unneeded imports

* Update documentation example

* Update error message strings

* Starting to add unit tests

* Release notes

* Fix pylint

* Add tests for to dict and from dict

* Update patches in tests to be correct with respect to changes

* Doc strings and fixes

* Adding more tests

* Change name

* Adding to init

* Use Haystack logger

* Beef up docstrings

* Make example runnable

* Rename to huggingface_pipeline_kwargs

* Fix example
2024-03-15 13:56:07 +01:00
Vladimir Blagojevic
2aae8472e7
feat: Add trust_remote_code init param to SentenceTransformer embedders (#7356)
* Add trust_remote_code init param to SentenceTransformer embedders

* Add release note

* Go with no kwargs solution

* Update haystack/components/embedders/sentence_transformers_document_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Pydoc fix

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-03-14 11:14:04 +01:00
Yudhajit Sinha
41dbbdb3fc
feat: Add support for matching mime types using regex (#7303)
* feat: Add support for matching mime types using regex
---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-03-11 14:58:08 +01:00
Ashwin Mathur
38b3472bb2
feat: Add SentenceTransformersDiversityRanker (#7095)
* Add Diversity Ranker

* Update tests

* Add separate suffix, prefix params for query and documents; allow empty query

* Update docstrings

* Make changes based on review

* Add additional tests

* Add test for warm up

* Update release notes

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2024-03-11 13:14:59 +01:00
Stefano Fiorucci
f8b9f71b7a
make weight defined in run to be used even if 0 (#7343) 2024-03-11 12:52:55 +01:00
Ashwin Mathur
8d7a58347d
fix: HuggingFaceTEITextEmbedder returning embedding of incorrect shape when used with Docker endpoint (#7319)
* Fix HuggingFaceTEITextEmbedder

* Update haystack/components/embedders/hugging_face_tei_text_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Improve imports; Add additional tests

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-03-07 16:23:57 +01:00
Tobias Wochinger
23c65c250f
chore: migrate ExtractiveReader to use secret management (#7309)
* chore: migrate `ExtractiveReader` to use secret management

* docs: add release notes
2024-03-05 13:04:53 +01:00
Julian Risch
50ad1fa2c4
fix: Remove pipeline serialization from telemetry code (#7289)
* remove pipeline serialization from telemetry

* simplify getting component instance from pipeline

* reno

* add unit test with non-serializable component

* generate qualified class names

* added pipeline.walk()

* fix imports

* sort Iterator import

* remove bfs

* add test for pipeline.walk() with cycles

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* raise TypeError if telemetry_data is no dict

* Update haystack/telemetry/_telemetry.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-03-05 12:45:53 +01:00
Stefano Fiorucci
38a80b0235
fix: MetaFieldRanker - use weight if passed in the run method (#7305)
* fix:  - use  if passed in the  method

* reno
2024-03-05 12:13:56 +01:00
Silvano Cerza
72d776c390
fix: Fix run order of variadic greedy components in Pipeline.run() (#7258)
* Fix run order of variadic greedy components in Pipeline.run()

* Add release notes
2024-03-01 17:39:13 +01:00
Massimiliano Pippi
b011bfcc9c
chore: update the INDEXING and RAG pipeline templates (#7272)
* update the index pipeline

* amend relnote

* update the RAG template
2024-03-01 11:55:44 +01:00
Silvano Cerza
d6597952a2
fix: Update Component protocol to fix some type checking issues (#7270)
* Update Component protocol to fix some type checking issues

* Add release notes

* Fix logline in test

* Fix run type definition
2024-03-01 10:56:47 +01:00
Massimiliano Pippi
232544472f
feat: Add new predefined template: chat with website (#7259)
* add new predefined template: chat with website

* fix raw prompts

* Fix raw prompt

* fix typos in tempalte
2024-02-29 15:55:53 +01:00
Tobias Wochinger
fe0ac5c4a2
chore: enforce kwarg logging (#7207)
* chore: add logger which eases logging of extras

* chore: start migrating to key value

* fix: import fixes

* tests: temporarily comment out breaking test

* refactor: move to kwarg based logging

* style: fix import order

* chore: implement self-review comments

* test: drop failing test

* chore: fix more import orders

* docs: add changelog

* tests: fix broken tests

* chore: fix getting the frames

* chore: add comment

* chore: cleanup

* chore: adapt remaining `%s` usages
2024-02-29 14:31:20 +01:00
Massimiliano Pippi
e7809b6fea
feat: Add from_template class method to Pipeline (#7240)
* move templating code under the core package

* make from_predefined part of the Pipeline API

* add tests

* amend release notes

* import under haystack package

* Apply suggestions from code review

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* from_predefined -> from_template

* remove template inheritance for more readability

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2024-02-29 12:23:32 +01:00
Tobias Wochinger
f22d49944d
docs: review and normalize haystack.components.websearch (#7236)
* docs: review and normalize `haystack.components.websearch`

* fix: use correct type annotations

* refactor: use type from protocol

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Revert "refactor: use type from protocol"

This reverts commit 23d6f45cd763c39b98be1bff03639a90f2a01fac.

* docs: refactor according to comments

* build: correctly pin to 4.7

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-02-28 16:43:08 +01:00
Tobias Wochinger
c2a9528595
build: pin typing-extensions (#7245)
https://community.openai.com/t/error-while-importing-openai-from-open-import-openai/578166/26
2024-02-28 14:34:41 +01:00
Massimiliano Pippi
f812048713
remove the override feature (#7227) 2024-02-28 11:33:40 +01:00