21 Commits

Author SHA1 Message Date
Stefano Fiorucci
f2b5f123b3
del HF token in tests (#8634)
2024-12-13 09:50:23 +01:00
David S. Batista
248dccbdd3
chore: fixing pylint issues (#8610)
* initial import

* fixing internal methods

* fixing some internal methods

* modify _preprocess

* fixed internal methods

---------

Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-12-09 16:53:37 +00:00
Sebastian Husch Lee
c121c86c4c
fix: Fix from_dict methods of components using HF models to work with default values (#8003)
* Fix from_dict to work if device isn't provided in init params

* Minor refactoring of from_dict for components that load HF models

* Add tests

* Update tests to test loading with all default parameters

* Add more tests

* Add release notes

* Add unit test for whisper local

* Update reno

* Add fix for ExtractiveReader

* Fix NamedEntityExtractor
2024-07-10 12:18:05 +02:00
Vladimir Blagojevic
535a281eec
feat: Add option to use HF_TOKEN as env var for authentication across all HF components (#7942)
* Read both HF_API_TOKEN and HF_TOKEN env vars in all HF related components

* Add reno note

* Test fixes

* More test updates

* More test updates
2024-06-27 10:31:58 +02:00
Massimiliano Pippi
482f60ec99
fix: exit early if the component receives no documents (#7732)
* exit early if the component receives no documents

* relnote
2024-05-23 09:35:10 +02:00
Massimiliano Pippi
10c675d534
chore: add license header to all modules (#7675)
* add license header to modules
* check license header at linting time
2024-05-09 13:40:36 +00:00
Julian Risch
b0284977db
feat: Add document page number of ExtractedAnswer to meta (#7572)
* calculate page number of answer and add to meta

* fix mypy, add reno

* add test

* simplify unit test

* update release note

* undo @patch updates

* extend tests, check page_number type
2024-05-02 14:48:27 +02:00
Silvano Cerza
ff269db12d
Fix unit tests failing if HF_API_TOKEN is set (#7491)
2024-04-05 18:05:43 +02:00
Tobias Wochinger
23c65c250f
chore: migrate ExtractiveReader to use secret management (#7309)
* chore: migrate `ExtractiveReader` to use secret management

* docs: add release notes
2024-03-05 13:04:53 +01:00
ZanSara
9af6c7e442
add some tolerance to Roberta test (#6880)
2024-01-31 17:19:07 +01:00
Sebastian Husch Lee
ceda4cd655
feat: Add support for device_map (#6679)
* Getting device_map working to support 8bit loading and multi device inference

* Update to take into account the device specified by the user

* add release notes

* Add device_map support for ExtractiveReader

* Update test

* Update to model that doesn't have issues

* Update test

* Update pytest approx

* Update release notes

* Start supporting device map

* Update ExtractiveReader to use new ComponentDevice

* Update similarity ranker to follow extractive reader implementation

* Fixing pylint

* Make mypy mostly happy

* Add new unit test to test device_map

* Adding unit tests

* Some refactoring

* Add more tests

* Add more tests

* Add another unit test

* Update first_device property to return a ComponentDevice to be able to use the to methods

* Updating tests for test_device

* Update tests and now explicitly modify device_map in model_kwargs

* Update haystack/utils/hf.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Make mypy happy

* mypy

* Remove unneeded optional flag

* Update ExtractiveReader with new logic

* Update ranker to follow new logic

* Removing unneeded code

* Make mypy happy

* fix pylint

* Fix test

* Adding unit tests for device_map="auto"

* Add unit tests for ranker

* PR comments

* Make util method

* Adding unit tests

* Fix type annotation

* Fix pylint

* Fix test

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-01-30 13:47:57 +01:00
Madeesh Kannan
7376838922
feat!: Framework-agnostic device management (#6748)
* feat: Framework-agnostic device management

* Add release note

* Linting

* Fix test

* Add `first_device` property, expand release notes, validate `ComponentDevice` state
2024-01-17 10:41:34 +01:00
ZanSara
96c0b59aaa
feat!: Rename model_name_or_path to model in ExtractiveReader (#6736)
* rename model parameter and internal model attribute in ExtractiveReader

* fix tests for ExtractiveReader

* fix e2e

* reno

* another fix

* review feedback

* Update releasenotes/notes/rename-model-param-reader-b8cbb0d638e3b8c2.yaml
2024-01-15 14:48:33 +01:00
Stefano Fiorucci
80c3e6825a
fix: serialize/deserialize torch dtype in the components that need it (#6713)
* first draft for ranker

* same for the reader

* consider also bnb_4bit_compute_dtype

* dtype serialization in hugging_face_local_generator

* add release note

* address dtype defined in huggingface_pipeline_kwargs

* test quantization options in reader

* fix

* serialize quantization_config

* test quantization_config serialization

* address feedback

* fix typo
2024-01-12 12:22:45 +01:00
Sebastian Husch Lee
dcf37c5173
feat: Extractive QA answer deduplication (#6459)
* Add answer deduplication

* Fix test

* Handle None case

* Release notes

* Handle cases where documents or answer spans could be None

* Adding checks for Nones and satisfying mypy

* Add option to turn off deduplication

* Adding unit tests

* Refactored tests to use fixtures

* Added overlap_threshold to run

* Update test

* Fixes related to the merge

* Remove casting, use direct variable names

* Move out if statement and add new test for it

* Update if statement to match comment

* Update how if statements work
2023-12-18 19:27:04 +01:00
Julian Risch
25a6eaae05
feat!: Rename ExtractiveReader's confidence_threshold to score_threshold (#6532)
* rename to score_threshold

* Update haystack/components/readers/extractive.py

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-12-12 15:12:28 +01:00
Silvano Cerza
18dbce25fc
refactor: Refactor answer dataclasses (#6523)
* Refactor answer dataclasses

* Add release notes

* Fix tests

* Fix end to end tests

* Enhance ExtractiveReader
2023-12-11 18:50:49 +01:00
Bijay Gurung
c5342d1110
fix: Prevent invalid answer from being selected in ExtractiveReader (#6460)
* Fix invalid answer being selected issue on ExtractiveReader

* Rename variables to not shadow arguments
2023-12-06 09:49:02 +01:00
Massimiliano Pippi
7c05f37a53
remove unit marker (#6450)
2023-11-29 19:24:25 +01:00
Silvano Cerza
e6637f5ec2
Fix all tests
2023-11-24 14:48:43 +01:00
Massimiliano Pippi
8adb8bbab8
Remove preview folder in test/
---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-11-24 11:52:55 +01:00