* fix: update kwargs for TriAdaptiveModel
* fix: squeeze batch for TTR inference
* test: add test for ttr + dataframe case
* test: update and reorganise ttr tests
* refactor: make triadaptive model handle shapes
* refactor: remove duplicate reshaping
* refactor: rename test with duplicate name
* fix: add device assignment back to TTR
* fix: remove duplicated vars in test
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Removed double batching around embed_queries
* Add back tests for retrieve_batch for dpr and embedding retrievers
* Updated table-text-retriever to not double batch
* Fixing pylint
* Update to test
* Remove code breaking test
* Updating dev comment to be clearer
* refactor: use weaviate client to build BM25 query
* refactor: remove manual BM25 query building
* refactor: apply BM25 to the content_field only
* test: update weaviate BM25 retrieval test case
update to account for lack of stemming
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Fixing broken BM25 support with Weaviate - fixes #3720
BM25 support with Weaviate has been broken since Haystack v1.11.0; this commit fixes it.
See issue #3720 for details.
* Fixing mypy issue - method signature wasn't matching the base class
* Mypy related test fix
Mypy forced me to make the signature of the Weaviate document store's `query` method match that of its parent, `KeywordDocumentStore`, where the `query` parameter is `Optional` but has no default value, so it must be passed explicitly (as `None`) at runtime.
I am not quite sure why the abstract method's `query` param was set without a default value while its type is `Optional`, but I didn't want to change that, so instead I have changed the Weaviate tests.
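For illustration only, a minimal sketch of the typing behaviour described above. The class names `KeywordStoreSketch` and `WeaviateStoreSketch` are hypothetical stand-ins, not the real Haystack classes, and the substring matching is a placeholder for an actual BM25 query. It shows that a parameter typed `Optional[str]` without a default still has to be passed explicitly:

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class KeywordStoreSketch(ABC):
    """Hypothetical stand-in for the abstract parent class."""

    @abstractmethod
    def query(self, query: Optional[str], top_k: int = 10) -> List[str]:
        # `query` may be None, but it has no default, so callers must pass it.
        ...


class WeaviateStoreSketch(KeywordStoreSketch):
    """Hypothetical subclass whose signature mirrors the parent, as mypy requires."""

    def __init__(self, docs: List[str]) -> None:
        self.docs = docs

    def query(self, query: Optional[str], top_k: int = 10) -> List[str]:
        if query is None:
            # No query given: return the first top_k documents unranked.
            return self.docs[:top_k]
        # Placeholder matching (assumption): case-insensitive substring test.
        return [d for d in self.docs if query.lower() in d.lower()][:top_k]


store = WeaviateStoreSketch(["BM25 ranking", "dense retrieval"])
all_docs = store.query(None)      # explicit None is accepted
bm25_docs = store.query("bm25")   # a normal string query
# store.query() would raise TypeError: missing required argument 'query'
```

This is why the tests, rather than the abstract signature, were changed: every call site has to supply `query`, even when the value is `None`.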
* Adding a note regarding an upcoming fix in Weaviate v1.17.0
* Apply suggestions from code review
* revert
* [EMPTY] Re-trigger CI
* first draft to add index param to tfidf
* better mypy handling
* Revert "better mypy handling"
This reverts commit 91a22516320f9dcbeae53827ec69f9dc51e1785c.
* new check in auto_fit
* new check also in retrieve
* better dict typings
* new test and improvements to other test
* remove unnecessary lambda
* improve test
* remove newline from openapi json
* fix test
* language fix
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 2
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 3
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 4
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 5
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 6
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* explicit index value handling
* fix test
* better error messages
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* very first draft
* implement query and query_batch
* add more bm25 parameters
* add rank_bm25 dependency
* fix mypy
* remove tokenizer callable parameter
* remove unused import
* only json serializable attributes
* try to fix: pylint too-many-public-methods / R0904
* bm25 attribute always present
* convert errors into warnings to make the tutorial 1 work
* add docstrings; tests
* try to make tests run
* better docstrings; revert not running tests
* some suggestions from review
* rename elasticsearch retriever as bm25 in tests; try to test memory_bm25
* exclude tests with filters
* change elasticsearch to bm25 retriever in test_summarizer
* add tests
* try to improve tests
* better type hint
* adapt test_table_text_retriever_embedding
* handle non-textual docs
* query only textual documents
* Changes how BaseRetriever checks whether query and queries have been passed
* Fixes checking query properly in Pipeline run
* Fixes checking query properly in Pipeline run
* Adds test for FilterRetriever using run method when query is empty
* Adds mock filter retriever and adapts test
* Removes old test, adds MockRetriever to test file and test uses document_store
* Logs error when query is not of type string with a new test for run batch
* Update test/nodes/test_retriever.py
* schemas
* fix milvus and faiss tests not running
* fix schema manually
* fix test_dpr_embedding test for milvus
* pip freeze on milvus tests
* fix milvus1 tests being executed: fix all_doc_stores order
* Revert "pip freeze on milvus tests"
This reverts commit 75ebb6f7e507bb8477e87d9e63b4a294f7946cab.
* make infer_required_doc_store more robust
* don't skip tests without docstore requirements
* use markers for docstore tests
* Use AutoTokenizer by default, to easily adapt to new models and tokenizers
* Add missing AutoTokenizer import
* Apply Black
* Missing import
* Fix DPR tests
* Remove tests on max length
* Update Documentation & Code Style
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Unify CI tests (from #2466)
* Update Documentation & Code Style
* Change folder names
* Fix markers list
* Remove marker 'slow', replaced with 'integration'
* Soften children check
* Start ES first so it has time to boot while Python is set up
* Run the full workflow
* Try to make pip upgrade on Windows
* Set KG tests as integration
* Update Documentation & Code Style
* typo
* faster pylint
* Make Pylint use the cache
* filter diff files for pylint
* debug pylint statement
* revert pylint changes
* Remove path from asserted log (fails on Windows)
* Skip preprocessor test on Windows
* Tackling Windows specific failures
* Fix pytest command for windows suites
* Remove \ from command
* Move poppler test into integration
* Skip opensearch test on windows
* Add tolerance in reader sas score for Windows
* Another pytorch approx
* Raise time limit for unit tests :(
* Skip poppler test on Windows CI
* Specify to pull with FF only in docs check
* temporarily run the docs check immediately
* Allow merge commit for now
* Try without fetch depth
* Accelerating test
* Accelerating test
* Add repository and ref alongside fetch-depth
* Separate out code&docs check from tests
* Use setup-python cache
* Delete custom action
* Remove the pull step in the docs check, will find a way to run on bot commits
* Add requirements.txt in .github for caching
* Actually install dependencies
* Change deps group for pylint
* Unclear why the requirements.txt is still required :/
* Fix the code check python setup
* Install all deps for pylint
* Make the autoformat check depend on tests and doc updates workflows
* Try installing dependencies in another order
* Try again to install the deps
* quoting the paths
* Add back the requirements
* Try again to install rest_api and ui
* Change deps group
* Duplicate haystack install line
* See if the cache is the problem
* Disable also in mypy, who knows
* split the install step
* Split install step everywhere
* Revert "Separate out code&docs check from tests"
This reverts commit 1cd59b15ffc5b984e1d642dcbf4c8ccc2bb6c9bd.
* Add back the action
* Proactive support for audio (see text2speech branch)
* Fix label generator tests
* Remove install of libsndfile1 on win temporarily
* exclude audio tests on win
* install ffmpeg for integration tests
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>