Silvano Cerza
|
a7f742fdbd
|
refactor: Rename docstore fixture to document_store (#6360)
* Prevent pytest_generate_tests from polluting preview tests
* Rename docstore fixture to document_store
|
2023-11-20 17:41:48 +01:00 |
|
ZanSara
|
e905066458
|
feat: make InMemoryDocumentStore return the number of docs actually written (#6274)
* make InMemoryDocumentStore return the number of documents actually written
* add fixme
* reno
* add missing continue
|
2023-11-20 10:03:22 +01:00 |
|
ZanSara
|
dfc1d452bb
|
feat: upgrade canals to 0.10.1 (#6309)
* upgrade canals
* reno
* trigger preview e2e
* bump canals
* fix decorator
* fix test
* test factory
* tests inmemory
* tests writer
* test audio
* tests builders
* tests caching
* tests embedders
* tests converters
* tests generators
* tests rankers
* tests retrievers
* fix pipeline and telemetry tests
* remove trigger
|
2023-11-17 14:46:23 +01:00 |
|
Silvano Cerza
|
6dda6e5b2d
|
Change Document.__eq__ to compare all fields (#6323)
|
2023-11-16 17:17:43 +01:00 |
|
Silvano Cerza
|
7287657f0e
|
refactor: Rename Document's text field to content (#6181)
* Rework Document serialisation
Make Document backward compatible
Fix InMemoryDocumentStore filters
Fix InMemoryDocumentStore.bm25_retrieval
Add release notes
Fix pylint failures
Enhance Document kwargs handling and docstrings
Rename Document's text field to content
Fix e2e tests
Fix SimilarityRanker tests
Fix typo in release notes
Rename Document's metadata field to meta (#6183)
* fix bugs
* make linters happy
* fix
* more fix
* match regex
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
|
2023-10-31 12:44:04 +01:00 |
|
Silvano Cerza
|
ae812617fd
|
Remove Document.array field (#6139)
|
2023-10-23 13:01:15 +02:00 |
|
Silvano Cerza
|
c8d162ced9
|
refactor: Change Document.embedding type to list of floats (#6135)
* Change Document.embedding type
* Add release notes
* Fix document_store testing
* Fix pylint
* Fix tests
|
2023-10-23 12:26:05 +02:00 |
|
Silvano Cerza
|
3f98bd9137
|
refactor: Rework Document.id generation (#6122)
* Rework Document id generation
* Fix tests
* Add release notes
* Fix failing integration test
* Remove score from Document id generation
* Enhance tests
* Update release notes
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
|
2023-10-20 10:34:28 +02:00 |
|
Stefano Fiorucci
|
4e4af99a5e
|
refactor!: rename MemoryDocumentStore and related Retrievers (#6076)
* rename doc store and retrievers
* release note
* fix patch
|
2023-10-17 16:15:16 +02:00 |
|
ZanSara
|
6e70d403f8
|
feat: Improve Document for Haystack 2.0 (#5738)
* initial draft
* tests
* add proposal
* proposal number
* reno
* fix tests and usage of content and content_type
* update branch & fix more tests
* mypy
* add docstring
* fix more tests
* review feedback
* improve __str__
* Apply suggestions from code review
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/preview/dataclasses/document.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* improve __str__
* fix tests
* fix more tests
* Update haystack/preview/document_stores/memory/document_store.py
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
|
2023-09-11 17:40:00 +02:00 |
|
Stefano Fiorucci
|
b7bea3ae9c
|
MemoryDocumentStore - Embedding retrieval (2.0) (#5715)
* MemoryDocumentStore - Embedding retrieval draft
* add release notes
* fix mypy
* better comment
* improve return_embeddings handling
* address PR comments
* update docstrings
* incorporated feedback
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
|
2023-09-07 15:44:07 +02:00 |
|
ZanSara
|
b1daa7c647
|
chore: migrate to canals==0.7.0 (#5647)
* add default_to_dict and default_from_dict placeholders to ease migration to canals 0.7.0
* canals==0.7.0
* whisper components
* add to_dict/from_dict stubs
* import serialization methods in init to hide canals imports
* reno
* export deserializationerror too
* Update haystack/preview/__init__.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* serialization methods for LocalWhisperTranscriber (#5648)
* chore: serialization methods for `FileExtensionClassifier` (#5651)
* serialization methods for FileExtensionClassifier
* Update test_file_classifier.py
* chore: serialization methods for `SentenceTransformersDocumentEmbedder` (#5652)
* serialization methods for SentenceTransformersDocumentEmbedder
* fix device management
* serialization methods for SentenceTransformersTextEmbedder (#5653)
* serialization methods for TextFileToDocument (#5654)
* chore: serialization methods for `RemoteWhisperTranscriber` (#5650)
* serialization methods for RemoteWhisperTranscriber
* remove patches
* Add default to_dict and from_dict in document stores built with factory (#5674)
* fix tests (#5671)
* chore: simplify serialization methods for `MemoryDocumentStore` (#5667)
* simplify serialization for MemoryDocumentStore
* remove redundant tests
* pylint
* chore: serialization methods for `MemoryRetriever` (#5663)
* serialization method for MemoryRetriever
* more tests
* remove hash from default_document_store_to_dict
* remove diff in factory.py
* chore: serialization methods for `DocumentWriter` (#5661)
* serialization methods for DocumentWriter
* more tests
* use factory
* black
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
|
2023-08-29 18:15:07 +02:00 |
|
Massimiliano Pippi
|
f9bd64ba9e
|
make code layout consistent (#5561)
|
2023-08-14 16:35:34 +02:00 |
|
Massimiliano Pippi
|
714b944dc2
|
chore: rename store to document_store for clarity (#5547)
* store -> document_store
* fix leftovers
* fix import name
* moar leftovers
* rebase on main, update MemoryDocumentStore to the new protocol
* Update haystack/preview/pipeline.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
|
2023-08-12 08:44:36 +02:00 |
|
Silvano Cerza
|
a7416bcf89
|
Add to_dict and from_dict methods for Stores (#5541)
* Add to_dict and from_dict methods for Stores
* Add release notes
* Add tests with custom init parameters
|
2023-08-11 14:45:56 +02:00 |
|
Massimiliano Pippi
|
c079576a87
|
chore: move base test class into haystack core (#5509)
* move base test class into haystack core
* fix linter
* do not compute coverage of testing code
|
2023-08-04 12:42:13 +02:00 |
|
ZanSara
|
f49bd3a12f
|
feat: introduce Store protocol (v2) (#5259)
* add protocol and adapt pipeline
* review feedback & update tests
* pylint
* Update haystack/preview/document_stores/protocols.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Update haystack/preview/document_stores/memory/document_store.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* docstring of Store
* adapt memorydocumentstore
* fix tests
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
|
2023-07-07 12:10:08 +02:00 |
|
Vladimir Blagojevic
|
bc86f57715
|
feat: BM25 retrieval for MemoryDocumentStore (#5151)
|
2023-06-27 17:42:23 +02:00 |
|
ZanSara
|
3a6db68408
|
feat: allow filtering documents on all fields (v2) (#4773)
* extend tests
* remove stray test
* pylint
* mypy
* review feedback
* fix tests
* fix last tests
* remove comment
* remove print statement
* pylint
* add flatten test
* remove direct access / direct write in docstore tests
* fix tests
|
2023-05-10 16:33:47 +02:00 |
|
ZanSara
|
9cb153d0f4
|
fix: add unit markers to several v2 tests (#4851)
* add markers
* remove stray marker
|
2023-05-10 13:46:13 +02:00 |
|
ZanSara
|
a9ec954c45
|
bug: fix filtering in MemoryDocumentStore (v2) (#4768)
* fix filtering bug
* pylint
* improve asserts
|
2023-05-03 09:33:12 +02:00 |
|
ZanSara
|
f2106ab37b
|
feat: initial implementation of MemoryDocumentStore for new Pipelines (#4447)
* add stub implementation
* reimplementation
* test files
* docstore tests
* tests for document
* better testing
* remove mmh3
* readme
* only store, no retrieval yet
* linting
* review feedback
* initial filters implementation
* working on filters
* linters
* filtering works and is isolated by document store
* simplify filters
* comments
* improve filters matching code
* review feedback
* pylint
* move logic into _create_id
* mypy
|
2023-04-13 09:36:23 +02:00 |
|