* fix ChatPromptBuilder from dict if template=None
* fix ChatPromptBuilder from dict if template=None
* leave template None
---------
Co-authored-by: Marie-Luise Klaus <marieluise.klaus@deepset.ai>
* feat: add unicode normalization & ascii_only mode for DocumentCleaner.
* feat: add unicode_normalization parameter validation to DocumentCleaner.
* test: fix the unit test to work after code linting.
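
The commits above do not show how DocumentCleaner implements these options, so the following is only a minimal sketch of what Unicode normalization with validation and an ASCII-only mode could look like. The function name `normalize_text` and its parameters `form` and `ascii_only` are hypothetical, chosen for illustration; only the standard-library `unicodedata` module is used.

```python
import unicodedata
from typing import Optional

# Normalization forms accepted by unicodedata.normalize.
VALID_FORMS = {"NFC", "NFKC", "NFD", "NFKD"}


def normalize_text(text: str, form: Optional[str] = None, ascii_only: bool = False) -> str:
    """Hypothetical helper: normalize text and optionally reduce it to ASCII."""
    # Validate the requested form up front so a typo fails loudly.
    if form is not None and form not in VALID_FORMS:
        raise ValueError(f"Unknown normalization form: {form!r}")

    if form is not None:
        text = unicodedata.normalize(form, text)

    if ascii_only:
        # Decompose accented characters into base letter + combining mark,
        # then drop every code point that has no ASCII encoding.
        decomposed = unicodedata.normalize("NFKD", text)
        text = decomposed.encode("ascii", "ignore").decode("ascii")

    return text
```

Note that the ASCII-only step in this sketch is lossy: characters without an ASCII base form are dropped entirely rather than transliterated.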
* Start adding model and tokenizer kwargs support
* Add model and tokenizer kwargs to doc embedder
* Some updates and fixes in tests
* Fix more tests
* Fix tests
* Add release note
* Fix test
* Add from_dict tests
* Fix TikaConverter not emitting the \f page separator: parse in HTML mode, then convert the HTML to text, using the old Haystack 1.x integration as a template.
* Add Reno
* Fix test by making Mock Tika return XML (before parsing)
* refinements and test
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* Fix issue that could lead to RCE if using insecure Jinja templates
* Add comment explaining exception suppression
* Update release note
* Update release note
* Fix bug in DocumentSplitter and expand tests to catch said bug
* Fix split overlap information calc and actually test it
* Add release notes
* Remove comments
* Same fix in SentenceWindowRetrieval
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* Pin structlog to 24.2.0 due to unit test failures
* Remove object init parameter in huggingface_hub unit tests
* Use less restrictive structlog pin
* Add release note
* initial support for api_params
* add tests and reno
* resolve suggestions and add integration test
* fix mypy
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* initial import
* adding tests
* adding license and release notes
* adding missing release notes
* working with any type of doc store
* nit
* adding get_class_object to serialization package
* nit
* refactoring get_class_object()
* refactoring get_class_object()
* changing type and var names
* more refactoring
* Update haystack/core/serialization.py
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* Update haystack/core/serialization.py
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* Update test/core/test_serialization.py
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* more refactoring
* more refactoring
* Pydoc syntax
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
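
The commits above mention a `get_class_object` helper in the serialization package, but its signature and behavior are not shown here. As a rough sketch of the general technique (resolving a class from a dotted import path during deserialization), something like the following could work. The name `import_class_by_path` is hypothetical and is not claimed to match the Haystack function.

```python
import importlib


def import_class_by_path(path: str) -> type:
    """Hypothetical helper: resolve 'package.module.ClassName' to the class object."""
    # Split "a.b.C" into module path "a.b" and attribute name "C".
    module_path, _, class_name = path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected a dotted path like 'module.Class', got {path!r}")

    module = importlib.import_module(module_path)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        # Surface a missing class as an import problem, with the full path.
        raise ImportError(f"Could not find {class_name!r} in module {module_path!r}") from exc
```

A caller would pass the type string stored in the serialized data and then hand the remaining dictionary to the resolved class's `from_dict`, if it defines one.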
* Fix from_dict to work if device isn't provided in init params
* Minor refactoring of from_dict for components that load HF models
* Add tests
* Update tests to test loading with all default parameters
* Add more tests
* Add release notes
* Add unit test for whisper local
* Update reno
* Add fix for ExtractiveReader
* Fix NamedEntityExtractor
* Fix default value for huggingface_pipeline_kwargs
* Add reno note
* Update HuggingFaceLocalGenerator.from_dict to use the same logic as HuggingFaceLocalChatGenerator.from_dict
* Update tests slightly
* Update release note
max_retries: if not set, it is read from the OPENAI_MAX_RETRIES
env variable, falling back to 5.
timeout: if not set, it is read from the OPENAI_TIMEOUT
env variable, falling back to 30.
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
* feat: Configure max_retries & timeout for AzureOpenAIDocumentEmbedder
max_retries: if not set, it is read from the OPENAI_MAX_RETRIES
env variable, falling back to 5.
timeout: if not set, it is read from the OPENAI_TIMEOUT
env variable, falling back to 30.
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
* Update retries-and-timeout-for-AzureOpenAIDocumentEmbedder-006fd84204942e43.yaml
* Update haystack/components/embedders/azure_document_embedder.py
* Update haystack/components/embedders/azure_document_embedder.py
---------
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* feat: Configure max_retries & timeout for AzureOpenAIChatGenerator
max_retries: if not set, it is read from the OPENAI_MAX_RETRIES
env variable, falling back to 5.
timeout: if not set, it is read from the OPENAI_TIMEOUT
env variable, falling back to 30.
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
* Update haystack/components/generators/chat/azure.py
* Update haystack/components/generators/chat/azure.py
* Update max_retries-for-AzureOpenAIChatGenerator-9e49b4c7bec5c72b.yaml
---------
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
max_retries: if not set, it is read from the OPENAI_MAX_RETRIES
env variable, falling back to 5.
timeout: if not set, it is read from the OPENAI_TIMEOUT
env variable, falling back to 30.
Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
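
The resolution order described in these commit bodies (explicit argument, then environment variable, then default) can be sketched as below. The environment variable names and the defaults of 5 and 30 come from the commit messages. The helper names `resolve_max_retries` and `resolve_timeout` are hypothetical, and the actual component code may differ.

```python
import os
from typing import Optional


def resolve_max_retries(max_retries: Optional[int] = None) -> int:
    """Hypothetical helper: explicit value, else OPENAI_MAX_RETRIES, else 5."""
    if max_retries is not None:
        return max_retries
    return int(os.environ.get("OPENAI_MAX_RETRIES", "5"))


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Hypothetical helper: explicit value, else OPENAI_TIMEOUT, else 30 seconds."""
    if timeout is not None:
        return timeout
    return float(os.environ.get("OPENAI_TIMEOUT", "30.0"))
```

Comparing against `None` rather than relying on truthiness keeps an explicit `0` (for example, no retries) from being silently replaced by the default.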
* initial implementation
* add support for meta and add ChatMessage tests
* explicitly cast types for mypy and update reno
* leave inputs unchanged avoiding side effects
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* add custom jinja filter handling to ConditionalRouter
* add release notes for custom filters
* align sede to existing patterns and update docstring example
* update sede unit test route condition to be more explicit
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* Update LocalWhisperTranscriber, add tests
* Final touches
* Update haystack/components/audio/whisper_local.py
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* Fix prev commit
* Relax test for tiny model to work
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>