* Start adding model and tokenizer kwargs support
* Add model and tokenizer kwargs to doc embedder
* Some updates and fixes in tests
* Fix more tests
* Fix tests
* Add release note
* Fix test
* Add from_dict tests
* Fix from_dict to work if device isn't provided in init params
* Minor refactoring of from_dict for components that load HF models
* Add tests
* Update tests to test loading with all default parameters
* Add more tests
* Add release notes
* Add unit test for whisper local
* Update reno
* Add fix for ExtractiveReader
* Fix NamedEntityExtractor
* Add `missing_meta` param to `MetaFieldRanker`, plus checks for validation.
* Implement `missing_meta` functionality in `run()`.
* Finish first draft of revised `MetaFieldRanker` functionality.
* Add tests for `MetaFieldRanker` `missing_meta` functionality.
* Add release notes for new `missing_meta` param of `MetaFieldRanker`
* Move part of docs_missing_meta_field warning string outside of `if...elif...else`.
* fix: Update device deserialization for SentenceTransformersTextEmbedder
* Add unit test
* Fix unit test
* Make same change to doc embedder
* Add release notes
* Add same change to Diversity Ranker and Named Entity Extractor
* Add unit test
* Add the same for whisper local
* Update release notes
* bug: run parameter ranking_mode does not override init param in metafield ranker
* Added a release note
* Used pytest.approx for comparing floating point numbers in unit test
* Add Diversity Ranker
* Update tests
* Add separate suffix, prefix params for query and documents; allow empty query
* Update docstrings
* Make changes based on review
* Add additional tests
* Add test for warm up
* Update release notes
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* docs: Update docstrings of MetaFieldRanker and TransformersSimilarityRanker
* add warm_up() call to usage example
* Apply suggestions from code review
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* show result of usage example
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Update MetaFieldRanker to parse string meta values based on meta_value_type
* Add some unit tests
* Add another unit test
* Add release notes
* Fix mypy
* Fix pylint
* Add more unit tests
* Update release notes
* Update docs
* Further improve doc strings
* Getting device_map working to support 8bit loading and multi device inference
* Update to take into account the device specified by the user
* add release notes
* Add device_map support for ExtractiveReader
* Update test
* Update to model that doesn't have issues
* Update test
* Update pytest approx
* Update release notes
* Start supporting device map
* Update ExtractiveReader to use new ComponentDevice
* Update similarity ranker to follow extractive reader implementation
* Fixing pylint
* Make mypy mostly happy
* Add new unit test to test device_map
* Adding unit tests
* Some refactoring
* Add more tests
* Add more tests
* Add another unit test
* Update first_device property to return a ComponentDevice to be able to use the to methods
* Updating tests for test_device
* Update tests and now explicitly modify device_map in model_kwargs
* Update haystack/utils/hf.py
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* Make mypy happy
* mypy
* Remove unneeded optional flag
* Update ExtractiveReader with new logic
* Update ranker to follow new logic
* Removing unneeded code
* Make mypy happy
* Fix pylint
* Fix test
* Adding unit tests for device_map="auto"
* Add unit tests for ranker
* PR comments
* Make util method
* Adding unit tests
* Fix type annotation
* Fix pylint
* Fix test
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* Add weight and ranking_mode as params to run for easier experimentation
* renaming of metadata to meta
* Use logger.warning instead of warnings
* Add another unit test
* Add support for sort_order and fix formatting of error messages
* Make MetaFieldRanker more robust: don't crash the pipeline if some Documents are missing keys.
* Don't print same warning message twice
* Add another test
* Making MetaFieldRanker more robust
* Move the `if` return statement earlier in the function
* Setting up infer_type
* Remove infer_type for now
* Release notes
* Add init file
* Update releasenotes/notes/metafieldranker_sort-order_refactor-2000d89dc40dc15a.yaml
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* first draft for ranker
* same for the reader
* consider also bnb_4bit_compute_dtype
* dtype serialization in hugging_face_local_generator
* add release note
* address dtype defined in huggingface_pipeline_kwargs
* test quantization options in reader
* fix
* serialize quantization_config
* test quantization_config serialization
* address feedback
* fix typo
* Add scale_score functionality to the TransformersSimilarityRanker
* Updated test to check scores
* Use pytest approx when comparing floats
* Updated how scale score works and added calibration factor. Started to add score threshold.
* Add support for score_threshold
* Add some parameters to the run method
* Add release notes
* Fix mypy
* Be more tolerant on the score values
* Adding unit test for scale_score=False
* Add unit test for score threshold
* Update tests
* Rename test
* Fix typo
* PR comments
* Add device checking and model_kwargs like we do in ExtractiveReader
* Add release notes
* Make a utility function for the device checking
* Better warning message and updated ExtractiveReader to use the util function
* Add unit tests for get_device
* Fix pylint
* Add initial implementation following SentenceTransformersDocumentEmbedder
* Add test for embedding metadata
* Add release notes
* Update name
* Fix tests and to dict
* Fix release notes