* add new marker
* start using test hierarchies
* move ES tests into their own class
* refactor test workflow
* job steps
* add more tests
* move more tests
* more tests
* test labels
* add more tests
* Update tests.yml
* Update tests.yml
* fix
* typo
* fix es image tag
* map es ports
* try
* fix
* default port
* remove opensearch from the markers sorcery
* revert
* skip new tests in old jobs
* skip opensearch_faiss
* add document_store to retrieve()
* mypy & pylint
* pass docstore to embedding encoders
* schemas
* mypy and pylint
* fix tfidfretriever
* pylint
* mypy
* pylint
* fix tfidf
* mypy
* pylint
* schemas
* another fix for tfidf
* fix question generation tests
* remove docstore from embedding encoder signature
* pylint
* revert accidental test changes
* Apply suggestions from code review
* check for docstore similarity function only if the docstore is present
* check for docstore similarity function only if the docstore is present
* Removed explicit passage formatting by name field
* passing correct input type for embedding the docs
* Updated test, updated similarity scores and added results
* changed expected input to embed method
* Remove dependence on HuggingFace TokenClassificationPipeline and group all postprocessing functions under one class
* Added copyright notice for HF and deepset to entity file to acknowledge that a lot of the postprocessing parts came from the transformers library.
* Fixed text squishing problem. Added additional unit test for it.
Co-authored-by: ju-gu <julian.gutsch@deepset.ai>
* changes how BaseRetriever checks whether query and queries have been passed
* Fixes checking query properly in Pipeline run
* Fixes checking query properly in Pipeline run
* Adds test for FilterRetriever using run method when query is empty
* Adds mock filter retriever and adapts test
* Removes old test, adds MockRetriever to test file and test uses document_store
* Logs error when query is not of type string with a new test for run batch
* Update test/nodes/test_retriever.py
* schemas
* bump elastic to 7.16.2+
* decouple Elasticsearch and OpenSearch
use method override instead of func variables
fix mypy
default value
fix broken tests
update schema
* relax version pin
* rename the base class
* rename module
* fix import order
* do not run the new tests in the old job
* remove outdated TODO
* Added checks for DataParallel and WrappedDataParallel
* Update isinstance checks according to pylint recommendation
* Using isinstance over types
* Added test for dpr training
* fix: Allow arbitrary values for parameters in Pipeline configurations
* Add test
* Adapt expected error message in tests
* Fix bug
* Fix bug on checking JSON
* Remove test cases that previously tested if error was thrown
* Change encoding in test
* Restrict possible values
* Re-add tests
* Re-add tests
* Add value flag to list elements
* Adding filters param to MostSimilarDocumentsPipeline run and run_batch
* Adding index param to MostSimilarDocumentsPipeline run and run_batch
* Adding index param documentation to MostSimilarDocumentsPipeline run and run_batch
* Updated index param documentation to MostSimilarDocumentsPipeline run and run_batch. Updated type: ignore in run_batch
* don't send the list of inputs back as an output in the running of a node.
* updated documentation
* Update pydoc-markdown.py
* added test case for pipeline join fix
Co-authored-by: JeffRisberg <jrisberg@aol.com>
* fix milvus and faiss tests not running
* fix schema manually
* fix test_dpr_embedding test for milvus
* pip freeze on milvus tests
* fix milvus1 tests being executed: fix all_doc_stores order
* Revert "pip freeze on milvus tests"
This reverts commit 75ebb6f7e507bb8477e87d9e63b4a294f7946cab.
* make infer_required_doc_store more robust
* don't skip tests without docstore requirements
* use markers for docstore tests
* quick fix benchmark runs to make them work with current haystack version
* fix minor typo
* update readme. fix minor things to make benchmarks run again
* Update Documentation & Code Style
* fix typo in readme
* update result files for reader and retriever querying
* reduce batch size for update embeddings to prevent xlarge bulk_update requests that exceed elastic's limits (happening in dense 500k runs)
* change default memory allocation back to normal. add note to readme
* add first indexing results
* add memory to docker cmd
* full benchmarks results on commit c5a2651fcbbeffca06ffa9036b10e62669bcc1b0
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Use the %s syntax on all debug messages
* Use the %s syntax on some more debug messages
* Use the %s syntax on info messages
* Use the %s syntax on warning messages
* Use the %s syntax on error and exception messages
* mypy
* pylint
* trigger tutorials execution in CI
* trigger tutorials execution on CI
* black
* remove embeddings from repr
* fix Document `__repr__`
* address feedback
* mypy
* feat(PDFToTextConverter): add option to get text in physical layout order
* test: add physical layout extraction test to PDFToTextConverter
* refactor: change layout parameter attribution places
* docs: manually trigger pre-commits
* docs: generate new docs to comply with pydoc-markdown style
* refactor: improve support for dataclasses
* refactor: refactor class init
* refactor: remove unused import
* refactor: testing 3.7 diffs
* refactor: checking meta where it is Optional
* refactor: reverting some changes on 3.7
* refactor: remove unused imports
* build: manual pre-commit run
* doc: run doc pre-commit manually
* refactor: post initialization hack for 3.7-3.10 compat.
TODO: investigate another method to improve 3.7 compatibility.
* doc: force pre-commit
* refactor: refactored for both Python 3.7 and 3.9
* docs: manually run pre-commit hooks
* docs: run api docs manually
* docs: fix wrong comment
* refactor: change non-type-checked test code
* docs: update primitives
* docs: api documentation
* docs: api documentation
* refactor: minor test refactoring
* refactor: remove unused enumeration in test
* refactor: remove unneeded dir in gitignore
* refactor: exclude all private fields and change meta def
* refactor: add pydantic comment
* refactor: fix for mypy on Python 3.7
* refactor: revert custom init
* docs: update docs to new pydoc-markdown style
* Update test/nodes/test_generator.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>