haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-09 14:23:43 +00:00

Author	SHA1	Message	Date
Sebastian Husch Lee	14895f6573	chore: Use token instead of use_auth_token because of deprecation warning (#8552 ) * Use token instead of use_auth_token because of deprecation warning * Fix test * pylint * fix linting --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-11-18 11:58:22 +00:00
Ivo Bellin Salarin	c78545dfc0	feat(openai): be tolerant to exceptions (#8526 ) * feat: be tolerant to exceptions if ever an error is raised by the OpenAI API, don't fail the entire processing * fix: missing import, string separator * Enhance error handling * Use batched from more_itertools for compatibility with older Python versions * Fix batching and add test --------- Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>	2024-11-15 10:52:44 +01:00
Ajit Singh	6cf13e8b98	enhancement: reduced usage of numpy and substituted built-in libraries (#8418 ) * reduced usage of numpy and substituted built-in libraries * added release note * edited expit function to support both float as well as list (this case was giving an error in CI) * revert code, numpy can't be removed here * more cleaning * fix relnote --------- Co-authored-by: anakin87 <stefanofiorucci@gmail.com>	2024-10-18 15:42:19 +02:00
Alper	b40f0c8b5d	feat: SentenceTransformersTextEmbedder supports `config_kwargs` (#8432 ) * add config_kwargs * disable PLR0913 for a specific function * add a release note * refer to AutoConfig in config_kwargs docstring --------- Co-authored-by: David S. Batista <dsbatista@gmail.com> Co-authored-by: Julian Risch <julianrisch@gmx.de>	2024-10-14 16:08:53 +00:00
David S. Batista	b81abc0c85	feat: SentenceTransformersDocumentEmbedder supports `config_kwargs` (#8433 ) * initial import * adding release notes	2024-10-14 17:43:04 +02:00
David S. Batista	97126eb544	fix: changing default model to `gpt-4o-mini` on OpenAI API calls (#8360 ) * changing default model to gpt-4o-mini * adding release notes * fixing some missed tests * fixing some more missed tests * fixing one last missed test * fixing linting issues * making pylint happy about an end2end test * changing if test to walrus operator * fixing azure embedder from ada to text-embedding-ada-002	2024-09-17 10:36:42 +02:00
Sebastian Husch Lee	06dd5c2f37	feat (v2): Update so `model_max_length` updates `max_seq_length` for Sentence Transformers (#8334 ) * Update so model_max_length does what is expected * Add release notes * Some fixes * Another test	2024-09-06 11:37:56 +02:00
Nicola Procopio	4c798470b2	added `precision` parameter to sentence transformers embeddings (#8179 ) * added `precision` parameter to sentence transformers embeddings * fixed test * Update haystack/components/embedders/sentence_transformers_document_embedder.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update test/components/embedders/test_sentence_transformers_text_embedder.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update test/components/embedders/test_sentence_transformers_text_embedder.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * fix format * Update sentence_transformers_text_embedder.py --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-08-09 11:38:47 +02:00
Sebastian Husch Lee	c90495c2e8	feat: Add model and tokenizer kwargs to `TransformersSimilarityRanker`, `SentenceTransformersDocumentEmbedder`, `SentenceTransformersTextEmbedder` (#8145 ) * Start adding model and tokenizer kwargs support * Add model and tokenizer kwargs to doc embedder * Some updates and fixes in tests * Fix more tests * Fix tests * Add release note * Fix test * Add from_dict tests	2024-08-02 10:37:10 +02:00
Nicola Procopio	47f4db8698	added truncate_dim to sentence transformers embedder (#8077 ) * added truncate_dim to sentence transformers embedder * Update haystack/components/embedders/sentence_transformers_document_embedder.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update releasenotes/notes/release-note-2b603a123cd36214.yaml Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * fixed parameter description * added test for truncation to text embedder * fix format --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-07-26 10:39:48 +02:00
Sebastian Husch Lee	c121c86c4c	fix: Fix from_dict methods of components using HF models to work with default values (#8003 ) * Fix from_dict to work if device isn't provided in init params * Minor refactoring of from_dict for components that load HF models * Add tests * Update tests to test loading with all default parameters * Add more tests * Add release notes * Add unit test for whisper local * Update reno * Add fix for ExtractiveReader * Fix NamedEntityExtractor	2024-07-10 12:18:05 +02:00
Nitanshu Vashistha	cd8a5b98fe	feat: Configure max_retries & timeout for AzureOpenAITextEmbedder (#7993 ) max_retries: if not set is read from the OPENAI_MAX_RETRIES env variable or set to 5. timeout: if not set is read from the OPENAI_TIMEOUT env variable or set to 30. Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>	2024-07-09 09:56:46 +02:00
Nitanshu Vashistha	f9d53c5ca8	feat: Configure max_retries and timeout for AzureOpenAIDocumentEmbedder (#7994 ) * feat: Configure max_retries & timeout for AzureOpenAIDocumentEmbedder max_retries: if not set is read from the OPENAI_MAX_RETRIES env variable or set to 5. timeout: if not set is read from the OPENAI_TIMEOUT env variable or set to 30. Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com> * Update retries-and-timeout-for-AzureOpenAIDocumentEmbedder-006fd84204942e43.yaml * Update haystack/components/embedders/azure_document_embedder.py * Update haystack/components/embedders/azure_document_embedder.py --------- Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com> Co-authored-by: David S. Batista <dsbatista@gmail.com>	2024-07-08 22:35:25 +02:00
Vladimir Blagojevic	535a281eec	feat: Add option to use `HF_TOKEN` as env var for authentication across all HF components (#7942 ) * Read both HF_API_TOKEN and HF_TOKEN env vars in all HF related components * Add reno note * Test fixes * More test updates * More test updates	2024-06-27 10:31:58 +02:00
Stefano Fiorucci	75ad76a7ce	chore: remove deprecated TEI embedders (#7907 ) * remove deprecated TEI embedders * rm from the embedders init * rm related tests	2024-06-21 10:36:12 +02:00
Carlos Fernández	686a4999cf	feat: widen support of env vars in OpenAI components (#7653 ) * add environment variables to the _enviroment.py file * add support for two of the three variables * Add support for 'OPENAI_TIMEOUT' and 'OPENAI_MAX_RETRIES' on OpenAIDocumentEmbedder. * Replicate support for env vars in OpenAITextEmbedder. * Add support for env vars in OpenAIGenerator. * Add support for env vars in OpenAIChatGenerator. * add docstrings and reno * add params to __init__ in OpenAIDocumentEmbedder * add params to __init__ in OpenAITextEmbedder * make fully functional implementation of env vars and unit tests * update reno * Update haystack/components/embedders/openai_text_embedder.py * reverse changes to telemetry/_enviroment.py * Update haystack/components/embedders/openai_text_embedder.py --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2024-05-15 21:58:41 +00:00
Sebastian Husch Lee	a2be90b95a	fix: Update device deserialization for components that use local models (#7686 ) * fix: Update device deserialization for SentenceTransformersTextEmbedder * Add unit test * Fix unit test * Make same change to doc embedder * Add release notes * Add same change to Diversity Ranker and Named Entity Extractor * Add unit test * Add the same for whisper local * Update release notes	2024-05-14 08:36:14 +02:00
Massimiliano Pippi	10c675d534	chore: add license header to all modules (#7675 ) * add license header to modules * check license header at linting time	2024-05-09 13:40:36 +00:00
Stefano Fiorucci	7c9532b200	fix broken serialization of HFAPI components (#7661 )	2024-05-08 17:14:37 +02:00
Stefano Fiorucci	39be515ba6	skip HF integrations tests if running from fork (#7517 )	2024-04-09 17:47:13 +02:00
Stefano Fiorucci	eff53a9131	feat: `HuggingFaceAPIDocumentEmbedder` (#7485 ) * add HuggingFaceAPITextEmbedder * add HuggingFaceAPITextEmbedder * rm unneeded else * wip * small fixes * deprecation; reno * Apply suggestions from code review Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * make params mandatory * changes requested * fix test * fix test --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-04-08 15:06:26 +02:00
Stefano Fiorucci	c91bd49cae	feat: `HuggingFaceAPITextEmbedder` (#7484 ) * add HuggingFaceAPITextEmbedder * add HuggingFaceAPITextEmbedder * rm unneeded else * small fixes * changes requested * fix test	2024-04-08 14:22:54 +02:00
Ashwin Mathur	1c7d1618d8	Add truncate and normalize parameters to TEI Embedders (#7460 )	2024-04-03 16:41:30 +02:00
Nicola Procopio	42c5b7af32	feat: added dimensions parameters to Azure OpenAI Embedders (#7449 ) * added dimensions parameter to AzureOpenAIEmbedders * created releasenote * update release note --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2024-04-02 14:04:16 +02:00
Vladimir Blagojevic	2aae8472e7	feat: Add trust_remote_code init param to SentenceTransformer embedders (#7356 ) * Add trust_remote_code init param to SentenceTransformer embedders * Add release note * Go with no kwargs solution * Update haystack/components/embedders/sentence_transformers_document_embedder.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Pydoc fix --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-03-14 11:14:04 +01:00
Ashwin Mathur	8d7a58347d	fix: `HuggingFaceTEITextEmbedder` returning embedding of incorrect shape when used with Docker endpoint (#7319 ) * Fix HuggingFaceTEITextEmbedder * Update haystack/components/embedders/hugging_face_tei_text_embedder.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Improve imports; Add additional tests --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-03-07 16:23:57 +01:00
Stefano Fiorucci	d00f171f8b	refactor!: Sentence Transformers Embedders - new devices mgmt (#7033 ) * new device mgmt for Sentence Transformers embedders * reno	2024-02-19 14:52:44 +01:00
Tuana Çelik	e2cee468fc	fix: Adding `api_base_url` to `OpenAITextEmbedder` self assignments (#7004 ) * assigning api_base_url This fix resolves issues with the MistralTextEmbedder integration * adding base url to `to_dict` and the tests * adding release note * Update fix-openai-base-url-assignment-0570a494d88fe365.yaml --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-02-15 17:35:28 +01:00
sahusiddharth	3bd6ba93ca	feat: Add dimensions parameter to OpenAI Embedders to fully support th… (#6841 ) * feat: Add dimensions parameter to OpenAI Embedders to fully support the new models * fixed linting * changed != None to is not None	2024-02-05 16:20:46 +01:00
Madeesh Kannan	27d1af3068	feat!: Use `Secret` for passing authentication secrets to components (#6887 ) * feat!: Use `Secret` for passing authentication secrets to components * Add comment to clarify type ignore	2024-02-05 13:17:01 +01:00
Vladimir Blagojevic	6e86f4e26a	Update embedding integration tests (#6823 )	2024-01-24 15:22:47 +01:00
ZanSara	288ed150c9	feat!: Rename `model_name` or `model_name_or_path` to `model` in all Embedder classes (#6733 ) * rename model parameter in the openai doc embedder * fix tests for openai doc embedder * rename model parameter in the openai text embedder * fix tests for openai text embedder * rename model parameter in the st doc embedder * fix tests for st doc embedder * rename model parameter in the st backend * fix tests for st backend * rename model parameter in the st text embedder * fix tests for st text embedder * fix docstring * fix pipeline utils * fix e2e * reno * fix the indexing pipeline _create_embedder function * fix e2e eval rag pipeline * pytest	2024-01-12 15:30:17 +01:00
Massimiliano Pippi	93b2aaee09	chore: move `DocumentJoiner` to new `joiners` package (#6692 ) * move DocumentJoiner to new joiners package * relnote * leftovers * fix docstrings generation * fix unrelated pydoc misconfiguration * more unrelated work, yay! * fix assertions	2024-01-08 22:06:27 +01:00
Silvano Cerza	9445b2d466	Fix skipif with empty env var (#6704 )	2024-01-08 19:19:14 +01:00
Silvano Cerza	607e7d1488	Skip integration tests if env var is missing (#6703 )	2024-01-08 17:15:10 +01:00
Vladimir Blagojevic	552f0e394b	feat: Add Azure embedders support (#6676 ) * Add Azure embedders --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-01-05 15:49:25 +01:00
Stefano Fiorucci	c773c30c66	refactor!: rename all remaining `metadata` to `meta` (#6650 ) * change metadata to meta * release note	2023-12-28 12:18:15 +01:00
Vladimir Blagojevic	4d08be0c2a	feat: Update OpenAI Python Client in Haystack 2.x (#6584 ) * Update openai python client * Add release note * Consolidate multiple mock_chat_completion into one * Ensure all components have api_base_url, organization params * Update tests * Enable function calling * Oversight * Minor fixes, add streaming test mocks * Apply suggestions from code review Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * metadata -> meta --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2023-12-21 16:21:24 +01:00
Ashwin Mathur	fc88ef7076	feat: Add HuggingFace TEI Embedders - `HuggingFaceTEITextEmbedder` and `HuggingFaceTEIDocumentEmbedder` (#6602 ) * Add TEI Embedders * Add release notes * Update release notes with usage examples	2023-12-21 12:16:36 +01:00
Massimiliano Pippi	7c05f37a53	remove unit marker (#6450 )	2023-11-29 19:24:25 +01:00
Silvano Cerza	e6637f5ec2	Fix all tests	2023-11-24 14:48:43 +01:00
Massimiliano Pippi	8adb8bbab8	Remove preview folder in test/ --------- Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>	2023-11-24 11:52:55 +01:00

42 Commits
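
Commits cd8a5b98fe and f9d53c5ca8 describe a fallback order for the Azure OpenAI embedders: `max_retries` is read from the `OPENAI_MAX_RETRIES` env variable or set to 5 if not given, and `timeout` is read from `OPENAI_TIMEOUT` or set to 30. The sketch below illustrates that resolution order only. It is not the code from those commits; the helper names `resolve_timeout` and `resolve_max_retries` are hypothetical, and treating the timeout as a float is an assumption.

```python
import os
from typing import Optional


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Hypothetical helper: explicit argument, then OPENAI_TIMEOUT, then 30."""
    if timeout is not None:
        return timeout
    # Fall back to the env variable; the default of 30 comes from the commit message.
    return float(os.environ.get("OPENAI_TIMEOUT", "30.0"))


def resolve_max_retries(max_retries: Optional[int] = None) -> int:
    """Hypothetical helper: explicit argument, then OPENAI_MAX_RETRIES, then 5."""
    if max_retries is not None:
        return max_retries
    # Fall back to the env variable; the default of 5 comes from the commit message.
    return int(os.environ.get("OPENAI_MAX_RETRIES", "5"))
```

Checking `is not None` rather than truthiness means an explicit `max_retries=0` is honoured instead of being silently replaced by the environment value or the default.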