Sara Zan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							96c05c34e4 
							
						 
					 
					
						
						
							
							Pipeline node names validation ( #1601 )  
						
						 
						
						
						
						
						* Add node names validation
* Add tests
* Improve test and test that params exist before validating
* Fix the REST API
* Use minilm-uncased-squad2 instead of roberta-base-squad2
* Use roberta model for test_pipeline.yaml
* Turn off TOKENIZERS_PARALLELISM in generator tests (#1605)
* Account for non-targeted parameters
* Restore previous parameters handling in the rest api
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai> 
						
						
					 
					
						2021-10-19 15:22:44 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Sara Zan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							575e64333c 
							
						 
					 
					
						
						
							
							Delete documents by ID in all document stores ( #1606 )  
						
						 
						
						
						
						
						* Modify BaseDocumentStore.delete_documents() signature, implement ElasticSearch, and add tests
* Add implementation for InMemory
* Implement for SQL, FAISS and Milvus too
* Add tests for faiss and milvus
* Fix delete_all_documents
* Implement deletion by ID for weaviate
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: sarthakj2109 <54064348+sarthakj2109@users.noreply.github.com>
Co-authored-by: prafgup <prafulgupta6@gmail.com>
Co-authored-by: ankh6 <andynzemokalumu@live.be> 
						
						
					 
					
						2021-10-19 12:30:15 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Malte Pietsch 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4a6c9302b3 
							
						 
					 
					
						
						
							
							Redesign primitives - Document, Answer, Label  ( #1398 )  
						
						 
						
						
						
						
						* first draft / notes on new primitives
* wip label / feedback refactor
* rename doc.text -> doc.content. add doc.content_type
* add datatype for content
* remove faq_question_field from ES and weaviate. rename text_field -> content_field in docstores. update tutorials for content field
* update converters for . Add warning for empty
* rename label.question -> label.query. Allow sorting of Answers.
* WIP primitives
* update ui/reader for new Answer format
* Improve Label. First refactoring of MultiLabel. Adjust eval code
* fixed workflow conflict with introducing new one (#1472)
* Add latest docstring and tutorial changes
* make add_eval_data() work again
* fix reader formats. WIP fix _extract_docs_and_labels_from_dict
* fix test reader
* Add latest docstring and tutorial changes
* fix another test case for reader
* fix mypy in farm reader.eval()
* fix mypy in farm reader.eval()
* WIP ORM refactor
* Add latest docstring and tutorial changes
* fix mypy weaviate
* make label and multilabel dataclasses
* bump mypy env in CI to python 3.8
* WIP refactor Label ORM
* WIP refactor Label ORM
* simplify tests for individual doc stores
* WIP refactoring markers of tests
* test alternative approach for tests with existing parametrization
* WIP refactor ORMs
* fix skip logic of already parametrized tests
* fix weaviate behaviour in tests - not parametrizing it in our general test cases.
* Add latest docstring and tutorial changes
* fix some tests
* remove sql from document_store_types
* fix markers for generator and pipeline test
* remove inmemory marker
* remove unneeded elasticsearch markers
* add dataclasses-json dependency. adjust ORM to just store JSON repr
* ignore type as dataclasses_json seems to miss functionality here
* update readme and contributing.md
* update contributing
* adjust example
* fix duplicate doc handling for custom index
* Add latest docstring and tutorial changes
* fix some ORM issues. fix get_all_labels_aggregated.
* update drop flags where get_all_labels_aggregated() was used before
* Add latest docstring and tutorial changes
* add to_json(). add + fix tests
* fix no_answer handling in label / multilabel
* fix duplicate docs in memory doc store. change primary key for sql doc table
* fix mypy issues
* fix mypy issues
* haystack/retriever/base.py
* fix test_write_document_meta[elastic]
* fix test_elasticsearch_custom_fields
* fix test_labels[elastic]
* fix crawler
* fix converter
* fix docx converter
* fix preprocessor
* fix test_utils
* fix tfidf retriever. fix selection of docstore in tests with multiple fixtures / parameterizations
* Add latest docstring and tutorial changes
* fix crawler test. fix ocrconverter attribute
* fix test_elasticsearch_custom_query
* fix generator pipeline
* fix ocr converter
* fix ragenerator
* Add latest docstring and tutorial changes
* fix test_load_and_save_yaml for elasticsearch
* fixes for pipeline tests
* fix faq pipeline
* fix pipeline tests
* Add latest docstring and tutorial changes
* fix weaviate
* Add latest docstring and tutorial changes
* trigger CI
* satisfy mypy
* Add latest docstring and tutorial changes
* satisfy mypy
* Add latest docstring and tutorial changes
* trigger CI
* fix question generation test
* fix ray. fix Q-generation
* fix translator test
* satisfy mypy
* wip refactor feedback rest api
* fix rest api feedback endpoint
* fix doc classifier
* remove relation of Labels -> Docs in SQL ORM
* fix faiss/milvus tests
* fix doc classifier test
* fix eval test
* fixing eval issues
* Add latest docstring and tutorial changes
* fix mypy
* WIP replace dataclasses-json with manual serialization
* Add latest docstring and tutorial changes
* revert to dataclasses-json serialization for now. remove debug prints.
* update docstrings
* fix extractor. fix Answer Span init
* fix api test
* keep meta data of answers in reader.run()
* fix meta handling
* address review feedback
* Add latest docstring and tutorial changes
* make document=None for open domain labels
* add import
* fix print utils
* fix rest api
* address review feedback
* Add latest docstring and tutorial changes
* fix mypy
Co-authored-by: Markus Paff <markuspaff.mp@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> 
						
						
					 
					
						2021-10-13 14:23:23 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Sara Zan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a30a826c6c 
							
						 
					 
					
						
						
							
							Standardize delete_documents(filter=...) across all document stores ( #1509 )  
						
						 
						
						
						
						
						* Make InMemoryDocumentStore accept and apply filters in delete_documents()
* Modify test_document_store.py to test the filtered deletion in memory, sql and milvus too
* Make FAISSDocumentStore accept and properly apply filters in delete_documents()
* Add latest docstring and tutorial changes
* Remove accidentally duplicated test
* Remove unnecessary decorators from test/test_document_store.py::test_delete_documents_with_filters
* Add embeddings count test for FAISS and Milvus; Milvus fails it.
* Fix a bug that prevented Milvus from deleting embeddings
* Remove batch size parametrization in tests & update all document stores' docstrings with a filter example
* Add latest docstring and tutorial changes
Co-authored-by: prafgup <prafulgupta6@gmail.com> 
						
						
					 
					
						2021-09-29 09:27:06 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Sara Zan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1cd17022af 
							
						 
					 
					
						
						
							
							Fix bug when loading FAISS from supplied config file path ( #1506 )  
						
						 
						
						
						
						
						* Fix the bug found in issue 135
* Add a test for the custom path 
						
						
					 
					
						2021-09-27 11:25:05 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Sara Zan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							21513532e5 
							
						 
					 
					
						
						
							
							Improve save/load of FAISS document store by saving its configuration alongside the index ( #1459 )  
						
						 
						
						
						
						
						* Saves the FAISSDocumentStore init params to JSON at save() and loads them at load() if they're found. First draft, to be tested.
* Fixing issue with string/Path objects in a few string operations, thanks mypy
* Leverage self.set_config instead of saving the parameters in a separate attribute
* Modify test_faiss_and_milvus:test_faiss_index_save_and_load to test that init params are preserved
* Add assert to verify that the SQL doc count and FAISS vector count are equal. The name of the SQL db must always be specified for this to work
* Simplified the implementation a bit, add better comments
* Forgot a return at the end of the file
* Fixing some of the suggestions from the review
* Add a try-catch in the load method and fix the tests
* Typo 
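
The save/load behaviour described in this commit (init params written to JSON at save(), read back at load() if found) can be illustrated with a minimal sketch. This is not the FAISSDocumentStore code: the helper names `save_config` and `load_config`, the sidecar file naming (same path with a `.json` suffix), and the example parameters are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def config_path_for(index_path: Path) -> Path:
    # Hypothetical convention: the config lives next to the index file,
    # with the suffix swapped for ".json".
    return index_path.with_suffix(".json")


def save_config(index_path: Path, init_params: Dict[str, Any]) -> None:
    """Write the init parameters alongside the index file."""
    with open(config_path_for(index_path), "w") as f:
        json.dump(init_params, f)


def load_config(index_path: Path) -> Dict[str, Any]:
    """Read the init parameters back; fall back to {} if no config exists."""
    try:
        with open(config_path_for(index_path)) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


# Usage: round-trip some illustrative parameters through a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    index_file = Path(tmp) / "my_index.faiss"
    save_config(index_file, {"vector_dim": 768, "similarity": "dot_product"})
    restored = load_config(index_file)
    missing = load_config(Path(tmp) / "other_index.faiss")
```

Falling back to an empty dict keeps indexes that were saved without a config loadable, which is one plausible reason for guarding the load step.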
						
						
					 
					
						2021-09-20 08:32:14 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								mathislucka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9c4e67d9b6 
							
						 
					 
					
						
						
							
							Enable cosine similarity metric in FAISSDocumentStore ( #1352 )  
						
						 
						
						
						
						
						* feat: normalize embeddings for cosine sim
* WIP add test case for faiss cosine
* input to faiss normalize needs to be an array of vectors
* fix: test should compare correct result embedding to original embedding
* add sanity check for cosine sim
* fix typo
* normalize cosine score
* Update docstring
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> 
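
The normalization step in this commit relies on a standard identity: once vectors are scaled to unit L2 length, their inner product equals their cosine similarity. The sketch below checks that identity in plain Python. It is not the FAISSDocumentStore implementation; the helper names are hypothetical, and the final rescaling to [0, 1] is only one possible reading of "normalize cosine score".

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def inner_product(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Reference definition: dot(a, b) / (|a| * |b|)."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


query = [3.0, 4.0]
doc = [4.0, 3.0]

# Inner product of the normalized vectors ...
ip_of_normalized = inner_product(l2_normalize(query), l2_normalize(doc))
# ... matches the cosine similarity of the raw vectors.
reference = cosine_similarity(query, doc)

# Assumed rescaling from [-1, 1] to [0, 1] for use as a score.
scaled_score = (ip_of_normalized + 1) / 2
```

This is why an inner-product index can serve cosine queries, provided both stored and query embeddings are normalized first.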
						
						
					 
					
						2021-09-20 07:54:26 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								oryx1729 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9dd7c74f4f 
							
						 
					 
					
						
						
							
							Refactor communication between Pipeline Components ( #1321 )  
						
						 
						
						
						
						
					 
					
						2021-09-10 11:41:16 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ramgarg102 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							51f0a56e5d 
							
						 
					 
					
						
						
							
							delete_all_documents() replaced by delete_documents() ( #1377 )  
						
						 
						
						
						
						
						* [UPDT] delete_all_documents() replaced by delete_documents()
* [UPDT] warning logs to be fixed
* [UPDT] delete_all_documents() renamed and the same method added
Co-authored-by: Ram Garg <ramgarg102@gmai.com> 
						
						
					 
					
						2021-08-30 15:18:28 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Malte Pietsch 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a0921f0c35 
							
						 
					 
					
						
						
							
							Remove Finder ( #1326 )  
						
						 
						
						
						
						
						* deprecate finder
* remove import
* add doc section for moving from finder to pipelines 
						
						
					 
					
						2021-08-09 13:41:40 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Ikram Ali 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b76ed4c5a4 
							
						 
					 
					
						
						
							
							Add options for handling duplicate documents (skip, fail, overwrite) ( #1088 )  
						
						 
						
						
						
						
						* [document_store] Duplicate document implementation added for memory store.
* [document_store] Duplicate document implementation done for FAISS store.
* [document_store] Duplicate document feature added for Elasticsearch document store. Fixed #1069
* [document_store] Duplicate document feature added for Milvus document store and bug fixed in FAISS document store. Fixed #1069
* [document_store] Code refactored. Fixed #1069
* [document_store] Test cases refactored.
* [document_store] mypy issue fixed.
* [test_case] FAISS and Milvus test cases refactored to support duplicate document implementation. Fixed #1069
* [document_store] duplicate_documents_options code refactored.
* [document_store] Code refactored.
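
The three options named in the commit title (skip, fail, overwrite) can be sketched with a toy dict-backed store. Everything here is illustrative: `TinyStore`, `DuplicateDocumentError`, and the `duplicate_documents` argument are hypothetical names, not the signatures used by the actual document stores.

```python
from typing import Dict, List


class DuplicateDocumentError(ValueError):
    """Raised when the 'fail' policy meets an ID that is already stored."""


class TinyStore:
    """Toy in-memory store keyed by document ID (illustration only)."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}

    def write_documents(
        self, documents: List[dict], duplicate_documents: str = "overwrite"
    ) -> None:
        if duplicate_documents not in ("skip", "fail", "overwrite"):
            raise ValueError(f"Unknown policy: {duplicate_documents}")

        if duplicate_documents == "fail":
            # Check against already-stored IDs before writing anything,
            # so a failing batch leaves the store untouched.
            clashes = [d["id"] for d in documents if d["id"] in self.docs]
            if clashes:
                raise DuplicateDocumentError(f"Duplicate IDs: {clashes}")

        for doc in documents:
            if duplicate_documents == "skip" and doc["id"] in self.docs:
                continue  # keep the stored version
            self.docs[doc["id"]] = doc["content"]  # insert or overwrite


store = TinyStore()
store.write_documents([{"id": "1", "content": "first"}])
store.write_documents([{"id": "1", "content": "second"}], duplicate_documents="skip")
after_skip = store.docs["1"]
store.write_documents([{"id": "1", "content": "third"}], duplicate_documents="overwrite")
after_overwrite = store.docs["1"]
```

Checking for clashes up front under "fail" is a design choice of this sketch; a store could equally raise on the first clash it encounters.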
						
						
					 
					
						2021-05-25 13:30:06 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								oryx1729 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8a57f6b16a 
							
						 
					 
					
						
						
							
							Update tests for FAISSDocumentStore ( #999 )  
						
						 
						
						
						
						
					 
					
						2021-04-27 09:55:31 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tanay Soni 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fd5c5dd23c 
							
						 
					 
					
						
						
							
							Introduce incremental updates for embeddings in document stores ( #812 )  
						
						 
						
						
						
						
					 
					
						2021-02-09 21:25:01 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Lalit Pagaria 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9f7f95221f 
							
						 
					 
					
						
						
							
							Milvus integration ( #771 )  
						
						 
						
						
						
						
						* Initial commit for Milvus integration
* Add latest docstring and tutorial changes
* Updating implementation of Milvus document store
* Add latest docstring and tutorial changes
* Adding tests and updating doc string
* Add latest docstring and tutorial changes
* Fixing issue caught by tests
* Addressing review comments
* Fixing mypy detected issue
* Fixing issue caught in test about sorting of vector ids
* fixing test
* Fixing generator test failure
* update docstrings
* Addressing review comments about multiple network call while fetching embedding from milvus server
* Add latest docstring and tutorial changes
* Ignoring mypy issue while converting vector_id to int
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> 
						
						
					 
					
						2021-01-29 13:29:12 +01:00