* AzureOCR: convert integration test to unit test and simplify
* clean up HuggingFaceAPITextEmbedder
* clean up LinkContentFetcher
* simplify HuggingFaceLocalGenerator
* clean up OpenAIGenerator
* OpenAIChatGenerator
* SentenceTransformersDiversityRanker
* TransformersSimilarityRanker
* ChatMessage: rm outdated tests
* fail fast false
* typo
* added async support for HuggingFaceAPIDocumentEmbedder
* added type annotations, removed unused import
* Trigger mark test completed
* Apply suggestions from code review
* utility function
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* feat: make AzureOpenAIDocumentEmbedder inherit from OpenAIDocumentEmbedder - async support
* fix type
* rm unused import
* do not replace newlines
* fix test
* fix: manage max_retries=0 in AzureOpenAIGenerator and AzureOpenAIChatGenerator
* fix: manage max_retries=0 in AzureOpenAITextEmbedder and AzureOpenAIDocumentEmbedder
* Start adding support for passing callable to Azure components
* Add to chat version
* Fix test
* Add reno
* Add support to azure doc and text embedder
* Rename
* update llm metadata extractor
* Add tests for text embedder
* Update tests
* Remove unused fixture and import
* Update reno
* Revert "test: skip HF API live integration tests (#8889)"
This reverts commit 56a3a9bd61b7391ae91e3d8179b3b33918ef4932.
* Replace zephyr-7b-beta model with SmolLM2-1.7B-Instruct
* Use zephyr-7b-beta model but extend instructions
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* initial rough draft
* expose backend instead of extracting from model_kwargs
* explicitly set backend model path
* add reno
* expose backend for ST diversity backend
* add dtype tests and expose kwargs to ST ranker for backend parameters
* skip dtype tests as torch isn't compiled with CUDA
* add new openvino dependency release, unskip tests
* resolve suggestion
* mock calls, turn integrations into unit tests
* remove unnecessary test dependencies
* feat: SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder can accept and pass any arguments to SentenceTransformer.encode
* refactor: encode_kwargs parameter of SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder made to be the last positional parameter for backward compatibility reasons
* docs: added explanation for encode_kwargs in SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder
* test: added tests for encode_kwargs in SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder
* doc: removed empty lines from docstrings of SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder
* refactor: encode_kwargs parameter of SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder made to be the last positional parameter for backward compatibility (part II.)
* HF API Embedders: refactoring
* rename variables
* rm leftovers
* rm pin
* rm unused import
* relnote
* warning with truncate/normalize and serverless inference API
* test that warnings are raised
* Use token instead of use_auth_token because of deprecation warning
* Fix test
* pylint
* fix linting
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* feat: be tolerant to exceptions
if ever an error is raised by the OpenAI API, don't fail the entire processing
* fix: missing import, string separator
* Enhance error handling
* Use batched from more_itertools for compatibility with older Python versions
* Fix batching and add test
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
* reduced usage of numpy and substituted built-in libraries
* added release note
* edited expit function to support both float and list inputs (the list case was causing a CI error)
* revert code, numpy can't be removed here
* more cleaning
* fix relnote
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* add config_kwargs
* disable PLR0913 for a specific function
* add a release note
* refer to AutoConfig in config_kwargs docstring
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: Julian Risch <julianrisch@gmx.de>
* changing default model to gpt-4o-mini
* adding release notes
* fixing some missed tests
* fixing some more missed tests
* fixing one last missed test
* fixing linting issues
* making pylint happy about an end2end test
* changing if test to walrus operator
* fixing azure embedder from ada to text-embedding-ada-002
* Start adding model and tokenizer kwargs support
* Add model and tokenizer kwargs to doc embedder
* Some updates and fixes in tests
* Fix more tests
* Fix tests
* Add release note
* Fix test
* Add from_dict tests
* Fix from_dict to work if device isn't provided in init params
* Minor refactoring of from_dict for components that load HF models
* Add tests
* Update tests to test loading with all default parameters
* Add more tests
* Add release notes
* Add unit test for whisper local
* Update reno
* Add fix for ExtractiveReader
* Fix NamedEntityExtractor
max_retries: if not set, it is read from the OPENAI_MAX_RETRIES
env variable or set to 5.
timeout: if not set, it is read from the OPENAI_TIMEOUT
env variable or set to 30.
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
* feat: Configure max_retries & timeout for AzureOpenAIDocumentEmbedder
max_retries: if not set, it is read from the OPENAI_MAX_RETRIES
env variable or set to 5.
timeout: if not set, it is read from the OPENAI_TIMEOUT
env variable or set to 30.
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
* Update retries-and-timeout-for-AzureOpenAIDocumentEmbedder-006fd84204942e43.yaml
* Update haystack/components/embedders/azure_document_embedder.py
* Update haystack/components/embedders/azure_document_embedder.py
---------
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
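
The max_retries/timeout fallback described in the commit body above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `resolve_client_settings` is hypothetical, and only the environment variable names and the defaults (5 and 30) come from the message itself.

```python
import os
from typing import Optional, Tuple


def resolve_client_settings(
    max_retries: Optional[int] = None,
    timeout: Optional[float] = None,
) -> Tuple[int, float]:
    """Hypothetical helper: explicit argument, then env variable, then default."""
    # Compare against None rather than using `or`, so that an explicit
    # max_retries=0 is kept instead of being replaced by the fallback.
    if max_retries is None:
        max_retries = int(os.environ.get("OPENAI_MAX_RETRIES", "5"))
    if timeout is None:
        timeout = float(os.environ.get("OPENAI_TIMEOUT", "30"))
    return max_retries, timeout


if __name__ == "__main__":
    # Values depend on the current environment, so no output is asserted here.
    print(resolve_client_settings())
    print(resolve_client_settings(max_retries=0, timeout=10.0))
```

The `is None` check is a design choice for this sketch: a truthiness test would treat `max_retries=0` as unset, which is the kind of case the earlier "manage max_retries=0" fixes address.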
* add environment variables to the _enviroment.py file
* add support for two of the three variables
* Add support for 'OPENAI_TIMEOUT' and 'OPENAI_MAX_RETRIES' on OpenAIDocumentEmbedder.
* Replicate support for env vars in OpenAITextEmbedder.
* Add support for env vars in OpenAIGenerator.
* Add support for env vars in OpenAIChatGenerator.
* add docstrings and reno
* add params to __init__ in OpenAIDocumentEmbedder
* add params to __init__ in OpenAITextEmbedder
* make fully functional implementation of env vars and unit tests
* update reno
* Update haystack/components/embedders/openai_text_embedder.py
* reverse changes to telemetry/_enviroment.py
* Update haystack/components/embedders/openai_text_embedder.py
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* fix: Update device deserialization for SentenceTransformersTextEmbedder
* Add unit test
* Fix unit test
* Make same change to doc embedder
* Add release notes
* Add same change to Diversity Ranker and Named Entity Extractor
* Add unit test
* Add the same for whisper local
* Update release notes