ZanSara
|
e888852aec
|
Standardize TextFileToDocument (#6232)
* simplify textfiletodocument
* fix error handling and tests
* stray print
* reno
* streams->sources
* reno
* feedback
* test
* fix tests
|
2023-11-17 15:39:39 +01:00 |
|
Silvano Cerza
|
7287657f0e
|
refactor: Rename Document's text field to content (#6181)
* Rework Document serialisation
Make Document backward compatible
Fix InMemoryDocumentStore filters
Fix InMemoryDocumentStore.bm25_retrieval
Add release notes
Fix pylint failures
Enhance Document kwargs handling and docstrings
Rename Document's text field to content
Fix e2e tests
Fix SimilarityRanker tests
Fix typo in release notes
Rename Document's metadata field to meta (#6183)
* fix bugs
* make linters happy
* fix
* more fix
* match regex
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
|
2023-10-31 12:44:04 +01:00 |
|
Julian Risch
|
9f3b6512be
|
refactor: Remove reimplementations of default from_dict/to_dict and corresponding tests in 2.0 (#6108)
* whisper transcriber
* remove from/to_dict from builders
* remove from/to_dict from embedders
* remove from/to_dict from fetcher, file_converters
* remove from/to_dict from generators, preprocessors
* remove from/to_dict from ranker, reader
* remove from/to_dict from router, sampler, websearch
* pylint
* reno
* refactor import
* remove unused import
|
2023-10-19 11:17:02 +02:00 |
|
Christian Clauss
|
bf6d306d68
|
ci: Simplify Python code with ruff rules SIM (#5833)
* ci: Simplify Python code with ruff rules SIM
* Revert #5828
* ruff --select=I --fix haystack/modeling/infer.py
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
|
2023-09-20 08:32:44 +02:00 |
|
ZanSara
|
6e70d403f8
|
feat: Improve Document for Haystack 2.0 (#5738)
* initial draft
* tests
* add proposal
* proposal number
* reno
* fix tests and usage of content and content_type
* update branch & fix more tests
* mypy
* add docstring
* fix more tests
* review feedback
* improve __str__
* Apply suggestions from code review
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/preview/dataclasses/document.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* improve __str__
* fix tests
* fix more tests
* Update haystack/preview/document_stores/memory/document_store.py
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
|
2023-09-11 17:40:00 +02:00 |
|
ZanSara
|
b1daa7c647
|
chore: migrate to canals==0.7.0 (#5647)
* add default_to_dict and default_from_dict placeholders to ease migration to canals 0.7.0
* canals==0.7.0
* whisper components
* add to_dict/from_dict stubs
* import serialization methods in init to hide canals imports
* reno
* export deserializationerror too
* Update haystack/preview/__init__.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* serialization methods for LocalWhisperTranscriber (#5648)
* chore: serialization methods for `FileExtensionClassifier` (#5651)
* serialization methods for FileExtensionClassifier
* Update test_file_classifier.py
* chore: serialization methods for `SentenceTransformersDocumentEmbedder` (#5652)
* serialization methods for SentenceTransformersDocumentEmbedder
* fix device management
* serialization methods for SentenceTransformersTextEmbedder (#5653)
* serialization methods for TextFileToDocument (#5654)
* chore: serialization methods for `RemoteWhisperTranscriber` (#5650)
* serialization methods for RemoteWhisperTranscriber
* remove patches
* Add default to_dict and from_dict in document stores built with factory (#5674)
* fix tests (#5671)
* chore: simplify serialization methods for `MemoryDocumentStore` (#5667)
* simplify serialization for MemoryDocumentStore
* remove redundant tests
* pylint
* chore: serialization methods for `MemoryRetriever` (#5663)
* serialization method for MemoryRetriever
* more tests
* remove hash from default_document_store_to_dict
* remove diff in factory.py
* chore: serialization methods for `DocumentWriter` (#5661)
* serialization methods for DocumentWriter
* more tests
* use factory
* black
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
|
2023-08-29 18:15:07 +02:00 |
|
Silvano Cerza
|
66f615a3a4
|
Remove BaseTestComponent (#5613)
* Remove BaseTestComponent
* Add release notes
|
2023-08-23 17:03:37 +02:00 |
|
ZanSara
|
5ca4874df9
|
Migrate existing v2 components to Canals 0.4.0 (#5532)
* pin canals==0.4.0
* update audio components
* allow audio components to receive whisper_params in init too
* migrating memoryretriever
* migrate memoryretriever
* migrate TextFileToDocument
* fix TextFileToDocument tests
* fix pipeline tests
* fix defaults management
* reno
* inverted assignments
* Simplify release notes
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
|
2023-08-09 15:51:32 +02:00 |
|
bogdankostic
|
a51ca19fe4
|
feat: Add TextFileToDocument component (v2) (#5467)
* Add TextfileToDocument component
* Add docstrings
* Add unit tests
* Add release note file
* Make use of progress bar
* Add TextfileToDocument to __init__.py
* Use lazy % formatting in logging functions
* Remove f from non-f-string
* Add TextfileToDocument to __init__.py
* Use correct dependency extra
* Compare file path against path object
* PR feedback
* PR feedback
* Update haystack/preview/components/file_converters/txt.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update docstrings
* Add error handling
* Add unit test
* Reintroduce falsely removed caplog
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
|
2023-08-01 11:34:52 +02:00 |
|