haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-07-18 22:42:24 +00:00

Author	SHA1	Message	Date
Lalit Pagaria	5dbd899a93	Experimental changes to support Milvus 2.x (#1473 ) * Experimental changes to support Milvus 2.x * Milvus 2.0 need other containers hence adding them * Add latest docstring and tutorial changes * Fixing tests * Correcting use of list collections * correcting connection close * Removing connection close logic * removing flush * using collection instead of connection * fixing describe collection * Fixing insert, query and search based on new signature * Making mypy happy * Fixing one test case * Fixing search and embedding fetch based on newer api * Implementing delete vector id function * Wrapping up final changes * Add latest docstring and tutorial changes * Correcting requirements.txt * removing empty line in requirements.txt * add docstring and exception for delete * add docstring. condition import on env var. raise exception for deletion * fix typo * change delete signature * ignore typing for import Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-10-25 10:39:48 +02:00
Julian Risch	6033319cfe	Fix parameter names in tutorial 5 and 12 (#1639 ) * Fix parameter names in tutorial 5 * Update parameters in tutorial notebook * Add latest docstring and tutorial changes * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-22 17:22:51 +02:00
Timo Moeller	6da2c73611	Add nltk download, add folder for file upload (#1633 )	2021-10-22 16:03:33 +02:00
Timo Moeller	9dc125df9d	Bugfix Tutorial 5 parameters, adjust default split length (#1635 ) Bugfix parameters, adjust default split length, add sentencetransformers	2021-10-22 16:03:12 +02:00
Sara Zan	f67b213797	Make EntityExtractor work when loaded from YAML (#1636 ) * Add set_config to EntityExtractor * Import EntityExtractor in pipeline.py, or it won't be properly registered as a subclass	2021-10-22 14:41:26 +02:00
Julian Risch	0aba5ca57d	Update jobs link in readme (#1629 )	2021-10-21 12:10:18 +02:00
Julian Risch	52e1fc991e	Update jobs link to personio (#1611 ) * Update jobs link to personio * Add latest docstring and tutorial changes * Change jobs link to main website * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-21 11:42:32 +02:00
Julian Risch	9de140110f	Use smaller model for one generator test case (#1622 ) * Use smaller model for one generator test case * Reduce max_length of generated sequences in tests	2021-10-20 17:57:15 +02:00
Sara Zan	bb066c0a2c	Fix for the Streamlit demo (was sending parameters to a non-existing node of the pipeline) (#1620 )	2021-10-20 11:55:29 +02:00
Julian Risch	f2a3f95ab6	add note on gpu runtime to tutorial 13 (#1614 ) * add note on gpu runtime to tutorial 13 * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-20 09:55:56 +02:00
Julian Risch	4ed2b90bca	Add delete_labels() except for weaviate doc store (#1604 ) * Add delete_labels() except for weaviate doc store * Add latest docstring and tutorial changes * Add test for delete_labels() * Adapt filter for label deletion to different doc stores in test * Allow delete labels by _id in elasticsearch * Add latest docstring and tutorial changes * Add latest docstring and tutorial changes * re-add bugfix after merge * Add ids as optional parameter * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-19 17:20:28 +02:00
Sara Zan	9722bbf1e1	DPR training: Rename `TransformersAdamW` to `AdamW` (#1613 ) * Rename TransformersAdamW into simply AdamW (probably changed in transformers at some point) * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-19 16:18:30 +02:00
Sara Zan	96c05c34e4	Pipeline node names validation (#1601 ) * Add node names validation * Add tests * Improve test and test that params exists before validating * Fix the REST API * Use minilm-uncased-squad2 instead of roberta-base-squad2 * Use roberta model for test_pipeline.yaml * Turn off TOKENIZERS_PARALLELISM in generator tests (#1605) * Account for non-targeted parameters * Restore previous parameters handling in the rest api Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2021-10-19 15:22:44 +02:00
Malte Pietsch	3a7d029fdd	Fix Opensearch field type (flattened -> nested) (#1609 ) * fix field type flattened -> nested. change default port from 9201 to 9200 * change port in benchmarks	2021-10-19 14:40:53 +02:00
Girish A Koushik	5a6285f23f	Add checkpointing for reader.train() to allow stopping + resuming training (#1554 ) * adding create checkpoint feature for train function in farm reader * added arguments for create_or_load_checkpoint function * accessing class method inside Trainer class * added default value for checkpoint_root_dir and checkpoint_every, checkpoints_to_keep as arguments for reader.train() * change in default value for checkpoint_root_dir and checkpoint_every * update docstring and add Path conversion Co-authored-by: girish.koushik <girish.koushik@diatoz.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-10-19 12:36:32 +02:00
Sara Zan	575e64333c	Delete documents by ID in all document stores (#1606 ) * Modify BaseDocumentStore.delete_documents() signature, implement ElasticSearch, and add tests * Add implementation for InMemory * Implement for SQL, FAISS and Milvus too * Add tests for faiss and milvus * Fix delete_all_documents * Implement deletion by ID for weaviate Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: sarthakj2109 <54064348+sarthakj2109@users.noreply.github.com> Co-authored-by: prafgup <prafulgupta6@gmail.com> Co-authored-by: ankh6 <andynzemokalumu@live.be>	2021-10-19 12:30:15 +02:00
Malte Pietsch	eb95f0e8aa	Add more flexible options for model downloads (Proxies, resume_download, local_files_only...) (#1256 ) * allow passing more options for model/tokenizer download from remote * temporarily change dependency to current farm master * Add latest docstring and tutorial changes * add kwargs * add docstrings * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-18 15:47:36 +02:00
Malte Pietsch	3d58e81b5e	Switch from dataclass to pydantic dataclass & Fix Swagger API Docs (#1598 ) * test pydantic dataclasses * Add latest docstring and tutorial changes * enable pydantic mypy plugin * switch to pydentic dataclasses and implement custom to_json from_json * clean up Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-18 14:38:14 +02:00
Malte Pietsch	3a4b3cd59d	Update CONTRIBUTING.md	2021-10-18 09:14:03 +02:00
bogdankostic	655d721371	Add Table Reader (#1446 ) * first draft / notes on new primitives * wip label / feedback refactor * rename doc.text -> doc.content. add doc.content_type * add datatype for content * remove faq_question_field from ES and weaviate. rename text_field -> content_field in docstores. update tutorials for content field * update converters for . Add warning for empty * Add first draft of TableReader * renam label.question -> label.query. Allow sorting of Answers. * Add calculation of answer scores * WIP primitives * Adapt input and output to new primitives * Add doc strings * Add tests * update ui/reader for new Answer format * Improve Label. First refactoring of MultiLabel. Adjust eval code * fixed workflow conflict with introducing new one (#1472) * Add latest docstring and tutorial changes * make add_eval_data() work again * fix reader formats. WIP fix _extract_docs_and_labels_from_dict * fix test reader * Add latest docstring and tutorial changes * fix another test case for reader * fix mypy in farm reader.eval() * fix mypy in farm reader.eval() * WIP ORM refactor * Add latest docstring and tutorial changes * fix mypy weaviate * make label and multilabel dataclasses * bump mypy env in CI to python 3.8 * WIP refactor Label ORM * WIP refactor Label ORM * simplify tests for individual doc stores * WIP refactoring markers of tests * test alternative approach for tests with existing parametrization * WIP refactor ORMs * fix skip logic of already parametrized tests * fix weaviate behaviour in tests - not parametrizing it in our general test cases. * Add latest docstring and tutorial changes * fix some tests * remove sql from document_store_types * fix markers for generator and pipeline test * remove inmemory marker * remove unneeded elasticsearch markers * add dataclasses-json dependency. 
adjust ORM to just store JSON repr * ignore type as dataclasses_json seems to miss functionality here * update readme and contributing.md * update contributing * adjust example * fix duplicate doc handling for custom index * Add latest docstring and tutorial changes * fix some ORM issues. fix get_all_labels_aggregated. * update drop flags where get_all_labels_aggregated() was used before * Add latest docstring and tutorial changes * add to_json(). add + fix tests * fix no_answer handling in label / multilabel * fix duplicate docs in memory doc store. change primary key for sql doc table * fix mypy issues * fix mypy issues * haystack/retriever/base.py * fix test_write_document_meta[elastic] * fix test_elasticsearch_custom_fields * fix test_labels[elastic] * fix crawler * fix converter * fix docx converter * fix preprocessor * fix test_utils * fix tfidf retriever. fix selection of docstore in tests with multiple fixtures / parameterizations * Add latest docstring and tutorial changes * fix crawler test. fix ocrconverter attribute * fix test_elasticsearch_custom_query * fix generator pipeline * fix ocr converter * fix ragenerator * Add latest docstring and tutorial changes * fix test_load_and_save_yaml for elasticsearch * fixes for pipeline tests * fix faq pipeline * fix pipeline tests * Add latest docstring and tutorial changes * fix weaviate * Add latest docstring and tutorial changes * trigger CI * satisfy mypy * Add latest docstring and tutorial changes * satisfy mypy * Add latest docstring and tutorial changes * trigger CI * fix question generation test * fix ray. 
fix Q-generation * fix translator test * satisfy mypy * wip refactor feedback rest api * fix rest api feedback endpoint * fix doc classifier * remove relation of Labels -> Docs in SQL ORM * fix faiss/milvus tests * fix doc classifier test * fix eval test * fixing eval issues * Add latest docstring and tutorial changes * fix mypy * WIP replace dataclasses-json with manual serialization * Add latest docstring and tutorial changes * revert to dataclass-json serialization for now. remove debug prints. * update docstrings * fix extractor. fix Answer Span init * fix api test * Adapt answer format * Add latest docstring and tutorial changes * keep meta data of answers in reader.run() * Fix mypy * fix meta handling * adress review feedback * Add latest docstring and tutorial changes * Allow inference on GPU * Remove automatic aggregation * Add automatic aggregation * Add latest docstring and tutorial changes * Add torch-scatter dependency * Add wheel to torch-scatter dependency * Fix requirements * Fix requirements * Fix requirements * Adapt setup.py to allow for wheels * Fix requirements * Fix requirements * Add type hints and code snippet * Add latest docstring and tutorial changes Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: Markus Paff <markuspaff.mp@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-15 16:34:48 +02:00
Julian Risch	5ec29a5283	Limit generator tests to memory doc store; split pipeline tests (#1602 ) * Limit generator tests to memory doc store; split pipeline tests * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-15 15:37:46 +02:00
CandiceYu8	5cfdabda2c	[fix] MySQL connection 'check_same_thread' error (#1585 ) * [fix] sql mysql connection 'check_same_thread' error * adjust sql connection if-block logic	2021-10-15 10:29:36 +02:00
Malte Pietsch	451e51a224	Update code snippet in readme	2021-10-14 18:15:20 +02:00
ju-gu	bd823c9a6f	Update Crawler documentation (#1588 ) Typo in crawling the documentation website.	2021-10-14 12:24:36 +02:00
Malte Pietsch	99c8046367	Fix Tutorials (#1594 ) * fix response format of DocumentSearchPipeline * Add latest docstring and tutorial changes * fix typos * change prints in tutorial 4 * Add latest docstring and tutorial changes * fix tutorial 13 * Add latest docstring and tutorial changes * remove unused import Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-14 11:49:35 +02:00
Malte Pietsch	d0b71d39e6	adjust startup sequence in docker compose	2021-10-13 19:43:58 +02:00
Malte Pietsch	caba590576	Fix answer format in ui (#1591 ) * fix answer format in ui * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-13 16:48:33 +02:00
Malte Pietsch	34de811594	make farm logging less verbose	2021-10-13 14:45:54 +02:00
Malte Pietsch	82c2cdf7cd	Merge branch 'master' of github.com:deepset-ai/haystack	2021-10-13 14:45:17 +02:00
Malte Pietsch	db2b5d913b	Fix param in tutorial 8	2021-10-13 14:45:09 +02:00
Malte Pietsch	4a6c9302b3	Redesign primitives - `Document`, `Answer`, `Label` (#1398 ) * first draft / notes on new primitives * wip label / feedback refactor * rename doc.text -> doc.content. add doc.content_type * add datatype for content * remove faq_question_field from ES and weaviate. rename text_field -> content_field in docstores. update tutorials for content field * update converters for . Add warning for empty * renam label.question -> label.query. Allow sorting of Answers. * WIP primitives * update ui/reader for new Answer format * Improve Label. First refactoring of MultiLabel. Adjust eval code * fixed workflow conflict with introducing new one (#1472) * Add latest docstring and tutorial changes * make add_eval_data() work again * fix reader formats. WIP fix _extract_docs_and_labels_from_dict * fix test reader * Add latest docstring and tutorial changes * fix another test case for reader * fix mypy in farm reader.eval() * fix mypy in farm reader.eval() * WIP ORM refactor * Add latest docstring and tutorial changes * fix mypy weaviate * make label and multilabel dataclasses * bump mypy env in CI to python 3.8 * WIP refactor Label ORM * WIP refactor Label ORM * simplify tests for individual doc stores * WIP refactoring markers of tests * test alternative approach for tests with existing parametrization * WIP refactor ORMs * fix skip logic of already parametrized tests * fix weaviate behaviour in tests - not parametrizing it in our general test cases. * Add latest docstring and tutorial changes * fix some tests * remove sql from document_store_types * fix markers for generator and pipeline test * remove inmemory marker * remove unneeded elasticsearch markers * add dataclasses-json dependency. 
adjust ORM to just store JSON repr * ignore type as dataclasses_json seems to miss functionality here * update readme and contributing.md * update contributing * adjust example * fix duplicate doc handling for custom index * Add latest docstring and tutorial changes * fix some ORM issues. fix get_all_labels_aggregated. * update drop flags where get_all_labels_aggregated() was used before * Add latest docstring and tutorial changes * add to_json(). add + fix tests * fix no_answer handling in label / multilabel * fix duplicate docs in memory doc store. change primary key for sql doc table * fix mypy issues * fix mypy issues * haystack/retriever/base.py * fix test_write_document_meta[elastic] * fix test_elasticsearch_custom_fields * fix test_labels[elastic] * fix crawler * fix converter * fix docx converter * fix preprocessor * fix test_utils * fix tfidf retriever. fix selection of docstore in tests with multiple fixtures / parameterizations * Add latest docstring and tutorial changes * fix crawler test. fix ocrconverter attribute * fix test_elasticsearch_custom_query * fix generator pipeline * fix ocr converter * fix ragenerator * Add latest docstring and tutorial changes * fix test_load_and_save_yaml for elasticsearch * fixes for pipeline tests * fix faq pipeline * fix pipeline tests * Add latest docstring and tutorial changes * fix weaviate * Add latest docstring and tutorial changes * trigger CI * satisfy mypy * Add latest docstring and tutorial changes * satisfy mypy * Add latest docstring and tutorial changes * trigger CI * fix question generation test * fix ray. 
fix Q-generation * fix translator test * satisfy mypy * wip refactor feedback rest api * fix rest api feedback endpoint * fix doc classifier * remove relation of Labels -> Docs in SQL ORM * fix faiss/milvus tests * fix doc classifier test * fix eval test * fixing eval issues * Add latest docstring and tutorial changes * fix mypy * WIP replace dataclasses-json with manual serialization * Add latest docstring and tutorial changes * revert to dataclass-json serialization for now. remove debug prints. * update docstrings * fix extractor. fix Answer Span init * fix api test * keep meta data of answers in reader.run() * fix meta handling * adress review feedback * Add latest docstring and tutorial changes * make document=None for open domain labels * add import * fix print utils * fix rest api * adress review feedback * Add latest docstring and tutorial changes * fix mypy Co-authored-by: Markus Paff <markuspaff.mp@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-13 14:23:23 +02:00
Malte Pietsch	9650f7aed1	Add `debug` and `debug_logs` params to standard pipelines (#1586 ) * add debug and debug_logs to standard pipelines * Add latest docstring and tutorial changes * fix params Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-12 16:00:48 +02:00
Sara Zan	6354528336	Add `/documents/get_by_filters` endpoint (#1580 ) * Add endpoint to get documents by filter * Add test for /documents/get_by_filter and extend the delete documents test * Add rest_api/file-upload to .gitignore * Make sure the document store is empty for each test * Improve docstrings of delete_documents_by_filters and get_documents_by_filters Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-12 10:53:54 +02:00
Malte Pietsch	bc7167a96c	Fix name	2021-10-12 10:22:41 +02:00
Sara Zan	25d76f508d	Create EntityExtractor (#1573 ) * Create extractor/entity.py * Aggregate NER words into entities * Support indexing * Add doc strings * Add utility for printing * Update signature of run() to match BaseComponent * Add test * Modify simplify_ner_for_qa to return the dictionary and add its test Co-authored-by: brandenchan <brandenchan@icloud.com>	2021-10-11 11:04:11 +02:00
Markus Sagen	69a0c9f2ed	Clarify docs for PDF conversion, languages and encodings (#1570 ) * Clarify PDF conversion, languages and encodings The parameter name `valid_languages` may be a bit miss-leading from reading only the tutorials. Users may, incorrectly assume that it enforces that the conversions only works for those languages, then it's more of a check. - Provided clarifications in the tutorials to highlight what valid_languages does and that changing the encoding may give better results for their language of choice - Updated the command for `pdftotext` to the correct one * Allow encodings for `convert_files_to_dicts` - Set option of passing encoding to the converters. Trying even for some Latin1 languages, the converter does not do it in a good way. Potential issues is that the encoding defaults to None, which is default for the other converters, but not for the PDFToTextConverter. Could add a check and change the ending to Latin1 for pdf if set to None. Was considering adding it to *kwargs, but since it may be a commonly used feature to be documented, I added it as a keyword argument instead. Would love to hear your input and feedback on in. Set back PDF default encoding * Update documentation	2021-10-11 09:30:12 +02:00
Muhammad Hamdan	dbb32c4f79	Adding TfidfRetriever to __init__.py of the retriever package (#1575 ) Adding TfidfRetriever to __init__.py of the retriever package, so people can import it like from haystack.retriever import TfidfRetriever.	2021-10-11 08:05:41 +02:00
Malte Pietsch	38652dd4dd	Enable GPU usage for QuestionGenerator (#1571 ) * enable GPU usage for question generator * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-08 12:17:48 +02:00
Sara Zan	54947cb840	Return intermediate nodes output in pipelines (#1558 ) * First rough implementation * Add a flag to dump the debug logs to the console as well * Typing run() and _dispatch_run() * Allow debug and debug_logs to be passed as arguments of run() * Avoid overwriting _debug, later we might want to store other objects in it * Put logs under a separate key of the _debug dictionary and add input and output of the node alongside it * Introduce global arguments for pipeline.run() that get applied to every node when defined * Change default values of debug variables to None, otherwise their default would override the params values * Remove a potential infinite recursion on the overridden __getattr__ * Do not append the output of the last node in the _debug key, it causes infinite recursion * Add tests * Move the input/output collection into _dispatch_run to gather only relevant info * Add partial Pipeline.run() docstring * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-10-07 22:13:25 +02:00
Vladimir Blagojevic	72168eddaf	Add BatchEncoding flatten (#1562 ) * Add BatchEncoding flatten * Rename BatchEncoding flatten to flatten_rename * Unit test for BatchEncoding flatten_rename	2021-10-07 15:29:57 +02:00
Adithya U R	bff90c19d5	Fix multithreading issues for older SQLite versions (#1442 ) * Update sql.py * Parametrize check_same_thread Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-10-07 10:13:53 +02:00
Vladimir Blagojevic	74d052277d	LFQA: Remove InferenceProcessor dependency (#1559 )	2021-10-05 20:42:11 +02:00
Sara Zan	3539e6b041	Fix circular import in the REST API (#1556 ) * Fix circular import in the REST API * remove unneeded import in test Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-10-04 21:18:23 +02:00
Sara Zan	af4a44fcbd	WIP Add rest api endpoint to delete documents by filter (#1546 ) * Add rest api endpoint to delete documents by filter. * Remove parametrization of rest api tests * Make the paths in rest_api/config.py absolute * Fix path to pipelines.yaml * Restructuring test_rest_api.py to be able to test only my endpoint (and to make the suite more structured) * Convert DELETE /documents into POST /documents/delete_by_filters Co-authored by: sarthakj2109 <54064348+sarthakj2109@users.noreply.github.com>	2021-10-04 11:21:00 +02:00
Julian Risch	7e063b77d2	Format doc classifier usage example (#1550 ) * Format doc classifier usage example * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-01 15:01:19 +02:00
Julian Risch	24483d7bad	TransformersDocumentClassifier replacing FARMClassifier (#1540 ) * Initial draft of TransformersClassifier * Add transformers classifier implementation * Add test for SentenceTransformersClassifier * Add truncation and corresponding test case to Classifier * Add zero-shot classification and test * Add document classifier documentation * Add latest docstring and tutorial changes * print meta data with print_documents() * Add latest docstring and tutorial changes * Remove top_k param from Classifier usage example * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-01 11:22:56 +02:00
bogdankostic	a20eec3098	Remove double mentions from requirements (#1545 ) * Remove one mention of sentence-transformers from requirements * Remove one mention of sklearn from requirements	2021-09-30 16:21:24 +02:00
Julian Risch	9ed726923c	Remove NER and text classification from model conversion (#1536 )	2021-09-29 13:35:59 +02:00
Julian Risch	0e7338f0c6	Remove mentions of FARM from Ranker comments (#1535 ) * Remove mentions of FARM from Ranker comments * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-29 11:57:30 +02:00
Sara Zan	a30a826c6c	Standardize `delete_documents(filter=...)` across all document stores (#1509 ) * Make InMemoryDocumentStore accept and apply filters in delete_documents() * Modify test_document_store.py to test the filtered deletion in memory, sql and milvus too * Make FAISSDocumentStore accept and properly apply filters in delete_documents() * Add latest docstring and tutorial changes * Remove accidentally duplicated test * Remove unnecessary decorators from test/test_document_store.py::test_delete_documents_with_filters * Add embeddings count test for FAISS and Milvus; Milvus fails it. * Fixed a bug that made Milvus not deleting embeddings * Remove batch size parametrization in tests & update all documentstore's docstrings with a filter example * Add latest docstring and tutorial changes Co-authored-by: prafgup <prafulgupta6@gmail.com>	2021-09-29 09:27:06 +02:00

3803 Commits