* Add endpoint to get documents by filter
* Add test for /documents/get_by_filter and extend the delete documents test
* Add rest_api/file-upload to .gitignore
* Make sure the document store is empty for each test
* Improve docstrings of delete_documents_by_filters and get_documents_by_filters
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
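A rough sketch of calling the new filter endpoint, assuming the REST API runs locally on port 8000 and takes the filters in a POST body under a `filters` key; the path follows the entries above, everything else is an assumption:

```python
# Hypothetical request against the new endpoint; URL, port and payload shape are assumptions.
import requests

response = requests.post(
    "http://localhost:8000/documents/get_by_filter",
    json={"filters": {"name": ["some_file.pdf"]}},
)
print(response.json())  # list of documents matching the filter
```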
* Create extractor/entity.py
* Aggregate NER words into entities
* Support indexing
* Add doc strings
* Add utility for printing
* Update signature of run() to match BaseComponent
* Add test
* Modify simplify_ner_for_qa to return the dictionary and add its test
Co-authored-by: brandenchan <brandenchan@icloud.com>
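A minimal sketch of how the new entity extractor might be used; the import path (`haystack.extractor.entity`, based on the file created above), the default model and the call signatures are assumptions:

```python
from haystack import Document
from haystack.extractor.entity import EntityExtractor, simplify_ner_for_qa  # assumed path

docs = [Document(text="Arya Stark is the daughter of Ned Stark.")]

extractor = EntityExtractor()              # aggregates NER word pieces into whole entities
output, _ = extractor.run(documents=docs)  # run() now matches the BaseComponent signature

# simplify_ner_for_qa() condenses a QA result that carries extracted entities into a
# plain dictionary that can be printed directly, e.g.:
# print(simplify_ner_for_qa(qa_result))    # qa_result: output of a QA pipeline (placeholder)
```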
* First rough implementation
* Add a flag to dump the debug logs to the console as well
* Typing run() and _dispatch_run()
* Allow debug and debug_logs to be passed as arguments of run()
* Avoid overwriting _debug, later we might want to store other objects in it
* Put logs under a separate key of the _debug dictionary and add input and output of the node alongside it
* Introduce global arguments for pipeline.run() that get applied to every node when defined
* Change default values of debug variables to None; otherwise their defaults would override the values passed via params
* Remove a potential infinite recursion on the overridden __getattr__
* Do not append the output of the last node to the _debug key, as doing so causes infinite recursion
* Add tests
* Move the input/output collection into _dispatch_run to gather only relevant info
* Add partial Pipeline.run() docstring
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
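A minimal sketch of the new per-node debugging, assuming an existing pipeline object `pipe` and that `debug` can be passed as a global argument via `params` (names follow the entries above):

```python
# `pipe` is a placeholder for an already-built haystack Pipeline.
result = pipe.run(
    query="Who is the father of Arya Stark?",
    params={"debug": True},  # global argument, applied to every node
)

# Each node contributes its input, output and logs under the "_debug" key.
for node_name, node_debug in result["_debug"].items():
    print(node_name, list(node_debug.keys()))
```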
* Add rest api endpoint to delete documents by filter.
* Remove parametrization of rest api tests
* Make the paths in rest_api/config.py absolute
* Fix path to pipelines.yaml
* Restructuring test_rest_api.py to allow testing a single endpoint in isolation (and to make the suite more structured)
* Convert DELETE /documents into POST /documents/delete_by_filters
Co-authored-by: sarthakj2109 <54064348+sarthakj2109@users.noreply.github.com>
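A hedged sketch of the converted endpoint, again assuming a local REST API on port 8000; the payload shape is an assumption:

```python
import requests

# POST replaces the former DELETE /documents; the filters select what gets deleted.
response = requests.post(
    "http://localhost:8000/documents/delete_by_filters",
    json={"filters": {"name": ["some_file.pdf"]}},
)
print(response.status_code)
```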
* Initial draft of TransformersClassifier
* Add transformers classifier implementation
* Add test for SentenceTransformersClassifier
* Add truncation and corresponding test case to Classifier
* Add zero-shot classification and test
* Add document classifier documentation
* Add latest docstring and tutorial changes
* print meta data with print_documents()
* Add latest docstring and tutorial changes
* Remove top_k param from Classifier usage example
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
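A hedged usage sketch of the new document classifier node; the class name, import path, example model, `predict()` signature and meta key are assumptions pieced together from the entries above:

```python
from haystack import Document
from haystack.classifier import TransformersClassifier  # assumed name and import path

docs = [Document(text="I am so glad the new release is finally out!")]

classifier = TransformersClassifier(
    model_name_or_path="bhadresh-savani/distilbert-base-uncased-emotion",  # example model
)
classified_docs = classifier.predict(documents=docs)

# The predicted label is stored in the document's meta field (exact key is an assumption).
print(classified_docs[0].meta["classification"])
```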
* Make InMemoryDocumentStore accept and apply filters in delete_documents()
* Modify test_document_store.py to test the filtered deletion in memory, sql and milvus too
* Make FAISSDocumentStore accept and properly apply filters in delete_documents()
* Add latest docstring and tutorial changes
* Remove accidentally duplicated test
* Remove unnecessary decorators from test/test_document_store.py::test_delete_documents_with_filters
* Add embeddings count test for FAISS and Milvus; Milvus fails it.
* Fixed a bug that prevented Milvus from deleting embeddings
* Remove batch size parametrization in tests & update all documentstore's docstrings with a filter example
* Add latest docstring and tutorial changes
Co-authored-by: prafgup <prafulgupta6@gmail.com>
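A small sketch of filtered deletion using the in-memory store; the import path matches the 0.x package layout and may differ by version:

```python
from haystack import Document
from haystack.document_store import InMemoryDocumentStore

document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(text="keep me", meta={"name": "keep.txt"}),
    Document(text="drop me", meta={"name": "drop.txt"}),
])

# Only documents matching the filter are removed; without filters everything is deleted.
document_store.delete_documents(filters={"name": ["drop.txt"]})
print(document_store.get_document_count())  # -> 1
```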
* simplify tests for individual doc stores
* WIP refactoring markers of tests
* test alternative approach for tests with existing parametrization
* fix skip logic of already parametrized tests
* fix weaviate behaviour in tests - not parametrizing it in our general test cases.
* Add latest docstring and tutorial changes
* fix some tests
* remove sql from document_store_types
* fix markers for generator and pipeline test
* remove inmemory marker
* remove unneeded elasticsearch markers
* update readme and contributing.md
* update contributing
* adjust example
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Add inferencer for QA only
* Add latest docstring and tutorial changes
* Add QA inferencer tests
* Add type annotations for inferencer
* Fix type annotations, move util functions
* Fix type annotations
* Move fixtures to the top of the file
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Saves the FAISSDocumentStore init params to JSON at save() and loads them at load() if they're found. First draft, to be tested.
* Fixing issue with string/Path objects in a few string operations, thanks mypy
* Leverage self.set_config instead of saving the parameters in a separate attribute
* Modify test_faiss_and_milvus:test_faiss_index_save_and_load to test that init params are preserved
* Add assert to verify that the SQL doc count and FAISS vector count are equal. The name of the SQL db always needs to be specified for this to work
* Simplified the implementation a bit, add better comments
* Forgot a return at the end of the file
* Fixing some of the suggestions from the review
* Add a try-catch in the load method and fix the tests
* Typo
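A sketch of the save/load round trip described above; file names, the sqlite URL and the `load()` arguments are placeholders/assumptions:

```python
from haystack.document_store import FAISSDocumentStore

# A persistent SQL db has to be specified so the document count survives the reload.
document_store = FAISSDocumentStore(sql_url="sqlite:///faiss_doc_store.db")
# ... write documents and update embeddings here ...
document_store.save("my_faiss_index.faiss")  # also dumps the init params to JSON

# load() picks the init params up from the saved JSON instead of requiring them again.
reloaded_store = FAISSDocumentStore.load("my_faiss_index.faiss")
assert document_store.get_document_count() == reloaded_store.get_document_count()
```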
* feat: normalize embeddings for cosine sim
* WIP add test case for faiss cosine
* input to faiss normalize needs to be an array of vectors
* fix: test should compare correct result embedding to original embedding
* add sanity check for cosine sim
* fix typo
* normalize cosine score
* Update docstring
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
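The normalization step itself relies on `faiss.normalize_L2`, which expects a 2D float32 array of vectors and works in place; after normalization, inner-product search is equivalent to cosine similarity. A minimal illustration:

```python
import numpy as np
import faiss

embeddings = np.random.rand(4, 768).astype("float32")  # FAISS needs a 2D float32 array
faiss.normalize_L2(embeddings)                          # normalizes each row in place

print(np.linalg.norm(embeddings, axis=1))               # every vector now has length ~1.0
```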
* Add type annotations in QuestionAnsweringHead
* Fix test by increasing max_seq_len
* Add SampleBasket type annotation
* Remove prediction head param from adaptive model init
* Add type ignore for AdaptiveModel init
* Fix and rename tests
* Adjust folder structure
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* Added support for Multi-GPU inference to DPR including benchmark
* fixed multi gpu
* added batch size to benchmark to better reflect multi gpu capabilities
* remove unnecessary entry in config.json
* fixed typos
* fixed config name
* update benchmark to use DEVICES constant
* changed multi gpu parameters and updated docstring
* adds silent fallback on cpu
* update doc string, warning and config
Co-authored-by: Michel Bartels <kontakt@michelbartels.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
* [UPDT] delete_all_documents() replaced by delete_documents()
* [UPDT] Fix warning logs
* [UPDT] delete_all_documents() renamed and the same method added
Co-authored-by: Ram Garg <ramgarg102@gmai.com>
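Migration sketch for the rename, assuming the old method is kept as a deprecated alias as the entries above suggest:

```python
from haystack.document_store import InMemoryDocumentStore

document_store = InMemoryDocumentStore()

# Old call (still present, but expected to log a warning):
# document_store.delete_all_documents()

# New call; without filters it deletes every document in the index.
document_store.delete_documents()
```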
* Removing probability field from reader and from test cases
* Add switch to FARMReader to choose score/probability
* Remove probability field from doc returned by doc store
* Relax assertion testing joined es and dpr predictions
* Use switch for confidence scores also for no_answer
* Add test that checks switching to old answer scores > 10
* Normalize score in elastic doc store and reset reader.md
* Scale weights of JoinDocuments to sum to 1 and adapt test case
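A hedged sketch of the new switch on FARMReader; the parameter name below is an assumption for the score/probability toggle described above:

```python
from haystack.reader import FARMReader  # import path may differ by version

reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2",
    use_confidence_scores=True,  # assumed name: True -> calibrated confidence, False -> raw score
)
```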
* Add FARM classification node
* Add classification output to meta field of document
* Update usage example
* Add test case for FARMClassifier
* Replace FARMRanker with FARMClassifier in documentation strings
* Remove base method not implemented by any child class, etc.
* [pipeline] Allow for batch indexing when using Pipelines fix #1168
* [pipeline] Test case fixed fix #1168
* [file_converter] Path.suffix updated #1168
* [file_converter] meta can be one of three cases: a single dict that is applied to all files, one dict for each file being converted, or None #1168
* [file_converter] mypy error fixed.
* [rest_api] batch file upload introduced in indexing API.
* [test_case] Test_api file upload parameter name updated.
* [ui] Streamlit file upload parameter updated.
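A sketch of batch indexing with the three accepted `meta` forms, using a placeholder `indexing_pipeline`; parameter names are assumptions:

```python
# `indexing_pipeline` is a placeholder for an already-built indexing Pipeline.
indexing_pipeline.run(
    file_paths=["doc1.pdf", "doc2.pdf"],
    meta=[{"source": "a"}, {"source": "b"}],  # one dict per file
)
# Alternatively: meta={"source": "shared"} applied to all files, or meta=None.
```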
* Annotation Tool: data is not persisted when using local version #853
* First version of weaviate
* Updated comments
* ran query, get and write tests
* update embeddings, dynamic schema and filters implemented
* Initial set of tests and fixes
* Tests added for update_embeddings and delete documents
* introduced duplicate documents fix
* fixed mypy errors
* Added Weaviate to requirements
* Fix the weaviate docker env variables
* Fixing test dependencies for now
* Created weaviate test marker and fixed query
* Update docstring
* Add documentation
* Bump up weaviate version
* Bump up weaviate version in documentation
* Upgrade weaviate version
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
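A small connection sketch for the new store, assuming a local Weaviate instance (for example the docker setup mentioned above); import path, parameter names and defaults are assumptions:

```python
from haystack.document_store import WeaviateDocumentStore  # assumed import path

document_store = WeaviateDocumentStore(host="http://localhost", port=8080)
print(document_store.get_document_count())
```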