* assigning api_base_url
This fix resolves issues with the MistralTextEmbedder integration
* adding base url to `to_dict` and the tests
* adding release note
* Update fix-openai-base-url-assignment-0570a494d88fe365.yaml
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
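
A minimal sketch of what "adding base url to `to_dict`" amounts to, assuming a component that stores its `api_base_url` and must round-trip it through serialization. `SketchEmbedder`, its default model name, and the example URL are hypothetical stand-ins, not the actual Haystack classes or values.

```python
from typing import Any, Dict, Optional


class SketchEmbedder:
    """Hypothetical stand-in for an embedder component (not the real Haystack class)."""

    def __init__(self, model: str = "example-model", api_base_url: Optional[str] = None):
        self.model = model
        # Keep the base URL on the instance so it can be serialized later.
        self.api_base_url = api_base_url

    def to_dict(self) -> Dict[str, Any]:
        # Include api_base_url so a deserialized copy talks to the same endpoint.
        return {
            "type": "SketchEmbedder",
            "init_parameters": {"model": self.model, "api_base_url": self.api_base_url},
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SketchEmbedder":
        return cls(**data["init_parameters"])
```

If `api_base_url` were left out of `to_dict`, a component restored with `from_dict` would silently fall back to the default endpoint.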
* Update MetaFieldRanker to parse string meta values based on meta_value_type
* Add some unit tests
* Add another unit test
* Add release notes
* Fix mypy
* Fix pylint
* Add more unit tests
* Update release notes
* Update docs
* Further improve doc strings
* Add FilterRetriever draft
* Implement FilterRetriever and add tests
* Update comparison to compare whole docs instead of just contents
* Expose FilterRetriever at the retrievers level
* Update docstring (add example usage)
* Add filter_retriever in the API reference docs config
Update retriever search path to start one dir level higher
* simplify _documents_equal
* improve usage example
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* Getting device_map working to support 8bit loading and multi device inference
* Update to take into account the device specified by the user
* add release notes
* Add device_map support for ExtractiveReader
* Update test
* Update to model that doesn't have issues
* Update test
* Update pytest approx
* Update release notes
* Start supporting device map
* Update ExtractiveReader to use new ComponentDevice
* Update similarity ranker to follow extractive reader implementation
* Fixing pylint
* Make mypy mostly happy
* Add new unit test to test device_map
* Adding unit tests
* Some refactoring
* Add more tests
* Add more tests
* Add another unit test
* Update first_device property to return a ComponentDevice to be able to use the to methods
* Updating tests for test_device
* Update tests and now explicitly modify device_map in model_kwargs
* Update haystack/utils/hf.py
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* Make mypy happy
* mypy
* Remove unneeded optional flag
* Update ExtractiveReader with new logic
* Update ranker to follow new logic
* Removing unneeded code
* Make mypy happy
* fix pylint
* Fix test
* Adding unit tests for device_map="auto"
* Add unit tests for ranker
* PR comments
* Make util method
* Adding unit tests
* Fix type annotation
* Fix pylint
* Fix test
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
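
A rough sketch of the `device_map` handling described in the commits above, assuming that an explicit `device_map` in `model_kwargs` takes precedence over the user-specified device. The helper name `resolve_device_map`, the precedence rule, and the `"cpu"` fallback are assumptions for illustration, not the actual code in `haystack/utils/hf.py`.

```python
from typing import Any, Dict, Optional


def resolve_device_map(device: Optional[str], model_kwargs: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of model_kwargs with device_map filled in."""
    kwargs = dict(model_kwargs or {})  # copy so the caller's dict is not mutated
    if "device_map" in kwargs:
        # Assumption: an explicit device_map (e.g. "auto") wins over `device`.
        return kwargs
    # Otherwise derive device_map from the requested device, defaulting to CPU.
    kwargs["device_map"] = device if device is not None else "cpu"
    return kwargs
```

For example, `resolve_device_map("cuda:0", {"device_map": "auto"})` keeps `"auto"`, while `resolve_device_map("cuda:0", None)` yields `{"device_map": "cuda:0"}`.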
* feat: add split by page to DocumentSplitter
* added test case and the suggested changes
* Update document_splitter.py
* Update haystack/components/preprocessors/document_splitter.py
* Update test_document_splitter.py
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* Add weight and ranking_mode as params to run for easier experimentation
* renaming of metadata to meta
* Use logger.warning instead of warnings
* Add another unit test
* Add support for sort_order and fix formatting of error messages
* Make MetaFieldRanker more robust. Doesn't crash pipeline if some Documents are missing keys.
* Don't print same warning message twice
* Add another test
* Making MetaFieldRanker more robust
* Move up if return statement to earlier in the function
* Setting up infer_type
* Remove infer_type for now
* Release notes
* Add init file
* Update releasenotes/notes/metafieldranker_sort-order_refactor-2000d89dc40dc15a.yaml
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
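
A minimal sketch of the robustness and `sort_order` behavior described above, using plain dicts in place of `Document` objects. The function name `rank_by_meta`, the dict layout, and the choice to append documents with a missing key at the end are assumptions for illustration, not the actual `MetaFieldRanker` implementation.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def rank_by_meta(docs: List[Dict[str, Any]], meta_field: str, sort_order: str = "descending") -> List[Dict[str, Any]]:
    """Hypothetical ranker: sort docs by a meta field without crashing on missing keys."""
    if sort_order not in ("ascending", "descending"):
        raise ValueError(f"sort_order must be 'ascending' or 'descending', got '{sort_order}'")

    with_field = [d for d in docs if meta_field in d.get("meta", {})]
    without_field = [d for d in docs if meta_field not in d.get("meta", {})]

    if without_field:
        # Warn once for the whole batch instead of raising and stopping the pipeline.
        logger.warning("%d document(s) are missing meta field '%s'", len(without_field), meta_field)

    ranked = sorted(with_field, key=lambda d: d["meta"][meta_field], reverse=(sort_order == "descending"))
    # Assumption: documents without the field keep their relative order at the end.
    return ranked + without_field
```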
* rename model parameter and internal model attribute in ExtractiveReader
* fix tests for ExtractiveReader
* fix e2e
* reno
* another fix
* review feedback
* Update releasenotes/notes/rename-model-param-reader-b8cbb0d638e3b8c2.yaml
* fix!: `InMemoryBM25Retriever` no longer returns documents that have a score of 0.0
Also update tests to accommodate the new behavior.
* Remove superfluous code
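
A small sketch of the new behavior, assuming scored results are available as `(document, score)` pairs. The helper name `drop_zero_scores` and the pair layout are hypothetical and not taken from the `InMemoryBM25Retriever` code.

```python
from typing import Any, List, Tuple


def drop_zero_scores(scored_docs: List[Tuple[Any, float]]) -> List[Tuple[Any, float]]:
    """Hypothetical helper: keep only results whose BM25 score is not exactly 0.0."""
    # A score of 0.0 means no query term matched, so the document is not a real hit.
    return [(doc, score) for doc, score in scored_docs if score != 0.0]
```

With this filter, a query that matches nothing returns an empty list rather than `top_k` unrelated documents, which is why the tests had to be updated.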
* rename model parameter in the openai doc embedder
* fix tests for openai doc embedder
* rename model parameter in the openai text embedder
* fix tests for openai text embedder
* rename model parameter in the st doc embedder
* fix tests for st doc embedder
* rename model parameter in the st backend
* fix tests for st backend
* rename model parameter in the st text embedder
* fix tests for st text embedder
* fix docstring
* fix pipeline utils
* fix e2e
* reno
* fix the indexing pipeline _create_embedder function
* fix e2e eval rag pipeline
* pytest
* rename model parameter in local transcriber
* fix tests for local transcriber
* rename model parameter in remote transcriber
* fix tests for remote transcriber
* reno
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* first draft for ranker
* same for the reader
* consider also bnb_4bit_compute_dtype
* dtype serialization in hugging_face_local_generator
* add release note
* address dtype defined in huggingface_pipeline_kwargs
* test quantization options in reader
* fix
* serialize quantization_config
* test quantization_config serialization
* address feedback
* fix typo