3582 Commits

Author SHA1 Message Date
Daria Fokina
741dd07227
clean up docstrings: TextCleaner (#8202)
* update textcleaner strings

* Update haystack/components/preprocessors/text_cleaner.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

---------

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2024-08-13 12:02:58 +00:00
Amna Mubashar
373de97426
Deprecate SentenceWindowRetrieval (#8206) 2024-08-13 13:49:41 +02:00
Haystack Bot
565d802db9
Update unstable version to 2.5.0-rc0 (#8195)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-08-12 13:45:59 +02:00
Vladimir Blagojevic
21c507331c
feat: Implement apply_filter_policy and FilterPolicy.MERGE for the new filters (#8042) v2.5.0-rc0 2024-08-09 12:04:24 +02:00
Nicola Procopio
4c798470b2
added precision parameter to sentence transformers embeddings (#8179)
* added `precision` parameter to sentence transformers embeddings

* fixed test

* Update haystack/components/embedders/sentence_transformers_document_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update test/components/embedders/test_sentence_transformers_text_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update test/components/embedders/test_sentence_transformers_text_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* fix format

* Update sentence_transformers_text_embedder.py

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-08-09 11:38:47 +02:00
Marie-Luise Klaus
ec02817f14
fix: OutputAdapter from_dict with custom_filters None (#8173)
Co-authored-by: Marie-Luise Klaus <marieluise.klaus@deepset.ai>
2024-08-08 14:02:40 +02:00
Stefano Fiorucci
a4eb88e7ea
rm serialize callback handler (#8172) 2024-08-08 11:54:31 +02:00
Corentin Meyer
58517014ec
fix: DocumentCleaner: keep the \f in text (#8078)
* Keep the \f in Document Cleaner

* Add Reno

* Add Test

* Simplified _remove_empty_lines() code
2024-08-07 14:50:14 +02:00
Marie-Luise Klaus
031b0bfbd8
fix: ChatPromptBuilder from_dict if template is None (#8165)
* fix ChatPromptBuilder from dict if template=None

* fix ChatPromptBuilder from dict if template=None

* leave template None

---------

Co-authored-by: Marie-Luise Klaus <marieluise.klaus@deepset.ai>
2024-08-06 14:48:04 +02:00
Tim Wellbrock
2e2f5f17bb
feat: add unicode normalization & ascii_only mode for DocumentCleaner (#8103)
* feat: add unicode normalization & ascii_only mode for DocumentCleaner.

* feat: add unicode_normalization parameter validation to DocumentCleaner.

* test: fix the unit test to work after code linting.
2024-08-05 13:00:39 +02:00
Stefano Fiorucci
e17d0c4192
chore: deprecate to_openai_format and create similar utility functions (#8146)
* deprecate and add new specific functions

* reno
2024-08-02 16:47:17 +02:00
Agnieszka Marzec
c50a60561b
Docs: Update HFAPIGenerator docstrings (#8154)
* update docstrings

* Update haystack/components/generators/hugging_face_api.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-08-02 12:51:07 +00:00
Tobias Wochinger
5a3ea75196
docs: document Python 3.11 and 3.12 support (#8159)
* docs: add Python 3.11 and 3.12 to supported versions

* docs: add release notes
2024-08-02 14:46:20 +02:00
Agnieszka Marzec
153afe77c5
update docstrings (#8156) 2024-08-02 14:30:47 +02:00
Agnieszka Marzec
a75dc89690
update docstrings (#8158) 2024-08-02 14:28:59 +02:00
Agnieszka Marzec
0cda96eedf
Docs: Update AzureOpenAIGenerator docstrings (#8149)
* update docstrings

* Update haystack/components/generators/azure.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/azure.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/azure.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/azure.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-08-02 14:26:37 +02:00
Agnieszka Marzec
ba5d105a78
update docstrings (#8150) 2024-08-02 14:25:56 +02:00
Agnieszka Marzec
d441c2faab
Docs: Update HuggingFaceAPIChatGenerator docstrings (#8152)
* update docstrings

* Update haystack/components/generators/chat/hugging_face_api.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/chat/hugging_face_api.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-08-02 09:31:59 +00:00
Sebastian Husch Lee
c90495c2e8
feat: Add model and tokenizer kwargs to TransformersSimilarityRanker, SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder (#8145)
* Start adding model and tokenizer kwargs support

* Add model and tokenizer kwargs to doc embedder

* Some updates and fixes in tests

* Fix more tests

* Fix tests

* Add release note

* Fix test

* Add from_dict tests
2024-08-02 10:37:10 +02:00
Agnieszka Marzec
d9a7a7a4db
Docs: Update ConditionalRouter docstrings (#8140)
* update docstrings

* Update haystack/components/routers/conditional_router.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* add reviewer's comments

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-08-02 10:34:01 +02:00
Agnieszka Marzec
c670f0fbee
Docs: update SentenceWindowRetriever docstrings (#8138)
* update docstrings

* Update haystack/components/retrievers/sentence_window_retriever.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/retrievers/sentence_window_retriever.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-08-01 18:05:31 +02:00
Agnieszka Marzec
ffbaed85de
update docstrings (#8142) 2024-08-01 16:27:30 +02:00
Agnieszka Marzec
bec822c361
Docs: Update FilterRetriever docstrings (#8133)
* update docstrings

* Update haystack/components/retrievers/filter_retriever.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-08-01 13:16:44 +02:00
Vladimir Blagojevic
25d3520f5a
feat: Add AnswerJoiner new component (#8122)
* Initial AnswerJoiner

* Initial tests

* Add release note

* Resolve mypy warning

* Add custom join function

* Serialize custom join function

* Handle all Answer types, add integration test, improve pydoc

* Make fixes

* Add to API docs

* Add more tests

* Update haystack/components/joiners/answer_joiner.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update docstrings and release notes

* update docstrings

---------

Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Darja Fokina <daria.fokina@deepset.ai>
2024-08-01 12:51:17 +02:00
Stefano Fiorucci
3d1ad10385
fix html test (#8127) 2024-07-31 10:59:53 +02:00
Daria Fokina
bc153c233c
docs: clean up docstrings of TransformersSimilarityRanker (#8124)
* update transformerssimilarityranker docstrings

* Apply suggestions from code review

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* upd device param

---------

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2024-07-31 09:54:32 +02:00
Daria Fokina
ac51885fe8
docs: clean up docstrings of OpenAITextEmbedder (#8120)
* update docstrings

* update capitalization

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

---------

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2024-07-31 09:53:25 +02:00
Daria Fokina
28141ec6b9
docs: clean up docstrings of OpenAIChatGenerator (#8125)
* openaichatgen-docstrings

* link update
2024-07-31 09:45:14 +02:00
Silvano Cerza
c7e29a83c1
fix: Fix infinite loop when running Pipeline (#8123)
* Fix infinite loop when running Pipeline

* Simplify if
2024-07-30 15:00:12 +02:00
Agnieszka Marzec
1d4883f178
update docstrings (#8117) 2024-07-30 11:10:36 +02:00
Agnieszka Marzec
42f59fc022
update docstrings (#8115) 2024-07-30 11:08:45 +02:00
Daria Fokina
21de1f87d4
docs: clean up docstrings of AnswerBuilder (#8094)
* answerbuilder docstrings

* update the `replies`

* Apply suggestions from code review

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Update answer_builder.py

---------

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2024-07-30 11:06:39 +02:00
Agnieszka Marzec
e8598befb6
Docs: Update OpenAIGen docstrings and add missing headers (#8105)
* update docstrings

* Update haystack/components/generators/openai.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-07-30 11:06:17 +02:00
Daria Fokina
92e2377eff
docs: clean up docstrings of FileTypeRouter (#8098)
* upd filetyperouter docstrings

* Suggestions from code review

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* aga's suggestions

---------

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2024-07-30 08:39:08 +02:00
Agnieszka Marzec
8ce7bedf25
Docs: Update DocSplitter docstrings (#8081)
* update docstrings

* Update haystack/components/preprocessors/document_splitter.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_splitter.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* fix article

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-07-29 15:11:12 +02:00
Agnieszka Marzec
abb24c61c2
Docs: Update DocumentEmbedder docstrings (#8112)
* update docstrings

* Update haystack/components/embedders/sentence_transformers_document_embedder.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/embedders/sentence_transformers_document_embedder.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/embedders/sentence_transformers_document_embedder.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* fix casing

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-07-29 15:10:49 +02:00
Agnieszka Marzec
950c632009
Docs: Update DocumentCleaner docstrings (#8106)
* update docstrings

* Update haystack/components/preprocessors/document_cleaner.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* fix article

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-07-29 14:45:15 +02:00
Agnieszka Marzec
da81d10060
Docs: Update DocumentJoiner docstrings (#8109)
* update docstrings

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* fix typo

* fix typo

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-07-29 14:39:44 +02:00
Corentin Meyer
1c53aae8f0
fix: Tika converter not yielding page break tags (\f) (#8082)
* Fix TikaConverter not emitting the \f page tag by parsing in HTML mode and then converting the HTML to text, using the old Haystack 1.X integration as a template.

* Add Reno

* Fix test by making Mock Tika return XML (before parsing)

* refinements and test

---------

Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-07-26 20:13:47 +02:00
Amna Mubashar
e0de423ee0
Rename SentenceWindowRetrieval to SentenceWindowRetriever 2024-07-26 17:46:44 +02:00
Silvano Cerza
3fed1366c4
fix: Fix issue that could lead to RCE if using insecure Jinja templates (#8095)
* Fix issue that could lead to RCE if using insecure Jinja templates

* Add comment explaining exception suppression

* Update release note

* Update release note
2024-07-26 14:02:09 +00:00
Nicola Procopio
47f4db8698
added truncate_dim to sentence transformers embedder (#8077)
* added truncate_dim to sentence transformers embedder

* Update haystack/components/embedders/sentence_transformers_document_embedder.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update releasenotes/notes/release-note-2b603a123cd36214.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* fixed parameter description

* added test for truncation to text embedder

* fix format

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-07-26 10:39:48 +02:00
Madeesh Kannan
b2aef217da
chore: Remove deprecated DynamicPromptBuilder and DynamicChatPromptBuilder components (#8085) 2024-07-26 10:00:59 +02:00
Daria Fokina
f372ca443c
bm25 retriever docstrings (#8087) 2024-07-25 17:28:21 +02:00
Agnieszka Marzec
1f58ec20a8
Docs: Standardize and improve SentenceTransformersTextEmbedder docstrings (#8060)
* Update docstrings

* format

* add Daria's comments

* Update haystack/components/embedders/sentence_transformers_text_embedder.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/embedders/sentence_transformers_text_embedder.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-07-25 13:56:51 +02:00
Agnieszka Marzec
de728b4877
Docs: Simplify lg + standardize docstrings (#8057)
* Simplify lg + standardize

* Format

* Update formatting

* Fix formatting again

* Fix empty line

* Change formatting

* Format with black

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2024-07-25 13:24:42 +02:00
Agnieszka Marzec
855f8e61f3
Docs: Update InMemoryEmbeddingRetriever docstrings (#8068)
* update docstrings

* Update documents to lowercase
2024-07-25 13:24:00 +02:00
Madeesh Kannan
f9e4d5dc58
chore: Deprecate the debug parameter in Pipeline.run (#8075) 2024-07-25 09:58:57 +00:00
Tobias Wochinger
4dde6fbaec
build: unpin structlog (#8071) 2024-07-24 20:58:34 +02:00
Amna Mubashar
b374c528b2
Assign streaming_callback to OpenAIGenerator and OpenAIChatGenerator in run() method (#8054)
* Add optional parameter for streaming_callback in run() method
2024-07-24 15:49:19 +02:00