17 Commits

Author SHA1 Message Date
Stefano Fiorucci
bcc4104729
refactor: utility function for docstore deserialization (#8226)
* refactor docstore deserialization

* more tests

* reno; headers

* expose key
2024-08-14 13:29:27 +02:00
Amna Mubashar
373de97426
Deprecate SentenceWindowRetrieval (#8206) 2024-08-13 13:49:41 +02:00
Amna Mubashar
e0de423ee0
Rename SentenceWindowRetrieval to SentenceWindowRetriever 2024-07-26 17:46:44 +02:00
Sebastian Husch Lee
baed478f23
fix: Fix split_start_idx and _split_overlap information in DocumentSplitter (#8046)
* Fix bug in DocumentSplitter and expand tests to catch said bug

* Fix split overlap information calc and actually test it

* Add release notes

* Remove comments

* Same fix in SentenceWindowRetrieval

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2024-07-24 15:15:36 +02:00
David S. Batista
431aa4a406
updating sentence window retriever tests (#8034)
* updating sentence window retriever tests

* fix
2024-07-16 22:10:55 +02:00
David S. Batista
ebfeb571d7
feat: add sentence window retrieval (#7997)
* initial import

* adding tests

* adding license and release notes

* adding missing release notes

* working with any type of doc store

* nit

* adding get_class_object to serialization package

* nit

* refactoring get_class_object()

* refactoring get_class_object()

* changing type and var names

* more refactoring

* Update haystack/core/serialization.py

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>

* Update haystack/core/serialization.py

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>

* Update test/core/test_serialization.py

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>

* more refactoring

* more refactoring

* Pydoc syntax

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2024-07-10 13:13:46 +00:00
Vladimir Blagojevic
678f193f10
feat: Add filter_policy init parameter to in memory retrievers (#7795)
* Add filter_policy init parameter to in-memory retrievers
2024-06-04 17:51:16 +02:00
Silvano Cerza
854c4173f2
feat: Add memory sharing between different instances of InMemoryDocumentStore (#7781)
* Add memory sharing between different instances of InMemoryDocumentStore

* Fix FilterRetriever tests

* Fix InMemoryBM25Retriever tests
2024-05-31 16:44:14 +02:00
Massimiliano Pippi
10c675d534
chore: add license header to all modules (#7675)
* add license header to modules
* check license header at linting time
2024-05-09 13:40:36 +00:00
Bijay Gurung
74683fe74d
Feat: Add FilterRetriever (#6836)
* Add FilterRetriever draft

* Implement FilterRetriever and add tests

* Update comparison to compare whole docs instead of just contents

* Expose FilterRetriever at the retrievers level

* Update docstring (add example usage)

* Add filter_retriever in the API reference docs config

Update retriever search path to start one dir level higher

* simplify _documents_equal

* improve usage example

---------

Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-02-08 08:48:46 +01:00
ZanSara
1182c08daf
fix: Don't filter negative scores when using BM25Okapi and scale_score=False (#6889)
* don't filter negatives for unscaled Okapi

* change BM25 algorithm default to BM25L

* Update haystack/document_stores/in_memory/document_store.py

* improve comment
2024-02-06 11:07:27 +01:00
Madeesh Kannan
a5189dd035
fix!: InMemoryBM25Retriever no longer returns documents that have a score of 0.0 (#6717)
* fix!: `InMemoryBM25Retriever` no longer returns documents that have a score of 0.0

Also update tests to accommodate the new behavior.

* Remove superfluous code
2024-01-12 17:50:55 +01:00
Massimiliano Pippi
e1ec4e5e4d
refact!: Remove symbols under the haystack.document_stores namespace (#6714)
* remove symbols under the haystack.document_stores namespace

* Update haystack/document_stores/types/protocol.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* fix

* same for retrievers

* leftovers

* more leftovers

* add relnote

* leftovers

* one more

* fix examples

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-01-10 21:20:42 +01:00
Stefano Fiorucci
4912f7cb58
refactor!: improve the deserialization logic for components that use a Document Store (#6466)
* improve deserialization

* rm ds decorator

* improve tests

* fix pylint

* rm decorator from module init

* rm decorator

* rm decorator from factory

* fix tests

* release note

* rm print
2023-12-04 15:17:28 +01:00
Massimiliano Pippi
7c05f37a53
remove unit marker (#6450) 2023-11-29 19:24:25 +01:00
Silvano Cerza
e6637f5ec2 Fix all tests 2023-11-24 14:48:43 +01:00
Massimiliano Pippi
8adb8bbab8
Remove preview folder in test/
---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-11-24 11:52:55 +01:00