18 Commits

Author SHA1 Message Date
Massimiliano Pippi
83d615a32b
feat: include testing facilities into haystack package (#4182)
2023-02-17 19:38:03 +01:00
bogdankostic
7eeb3e07bf
feat: Add IVF and Product Quantization support for OpenSearchDocumentStore (#3850)
* Add IVF and Product Quantization support for OpenSearchDocumentStore

* Remove unused import statement

* Fix mypy

* Adapt doc strings and error messages to account for PQ

* Adapt validation of indices

* Adapt existing tests

* Fix pylint

* Add tests

* Update lg

* Adapt based on PR review comments

* Fix Pylint

* Adapt based on PR review

* Add request_timeout

* Adapt based on PR review

* Adapt based on PR review

* Adapt tests

* Pin tenacity

* Unpin tenacity

* Adapt based on PR comments

* Add match to tests

---------

Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-17 10:28:36 +01:00
Silvano Cerza
274746db07
style: Update black (#4101)
* Update black version

* Format file with new black style

* Update black pre-commit hook version
2023-02-08 15:34:43 +01:00
Fabian
61ebe4b5dc
fix: authenticate with aws4auth if set in OpenSearchDocumentStore (#3741)
* bug(OpenSearchDocumentStore): fix authenticate with aws4auth if set.

Rearrange check to authenticate with aws4auth before username
and password, as the username is set to "admin" by default.

* Make username check less restrictive

* Fix test, do not use mocked _init_client function

* Add warning for aws4auth and username to ElasticSearchDocumentStore

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-01-24 10:01:39 +01:00
tstadel
6ca88bfd23
fix: Despite return_embedding=False SearchEngineDocumentStore.query retrieves embedding_field (#3662)
* fix: Despite return_embedding=False SearchEngineDocumentStore.query retrieves embedding_field

* fix pylint

* add tests

* fix mypy

* fix merge

* format

* fix pylint

* move tests to SearchEngineDocumentStoreTestAbstract

* move missed constants

* add mocked_document_store fixture to TestElasticsearchDocumentStore

* fix mocked_document_store

* fix get_all_documents tests for elasticsearch>=7.16

* fix tests

* fix tests try 2
2023-01-09 11:58:23 +01:00
tstadel
6c067b2b4f
feat: make score_script first class citizen via knn_engine param (#3284)
* OpenSearchDocumentStore: make score_script accessible via knn_engine

* blacken

* fix tests

* fix format

* fix naming of 'score_script' consistently

* fix tests

* fix test

* fix ef_search tests

* always validate index

* improve clone_embedding_field

* fix pylint

* reformat

* remove port

* update tests

* set no_implicit_optional = false

* fix mypy

* fix test

* refactorings

* reformat

* fix and refactor tests

* better tests

* create search_field mappings

* remove no_implicit_optional = false

* skip validation for custom mapping

* format

* Apply suggestions from docs code review

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Apply tougher suggestions from code review

* fix messages

* fix typos

* update tests

* Update haystack/document_stores/opensearch.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* fix tests

* fix ef_search validation

* add test for ef_search nmslib

* fix assert_not_called

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-12-27 15:24:31 +01:00
tstadel
c1c1c97bb2
feat: add query_by_embedding_batch (#3546)
* add query_by_embedding_batch

* fix mypy

* fix pylint

* add test

* move query_by_embedding_batch to search_engine

* fix and add tests

* fix pylint

* remove Retriever query logs

* add test for multimodal batch retrieval

* allow for np.ndarray
2022-12-08 08:28:43 +01:00
Massimiliano Pippi
b694c7b5cb
Document Store test refactoring (#3449)
* add new marker

* start using test hierarchies

* move ES tests into their own class

* refactor test workflow

* job steps

* add more tests

* move more tests

* more tests

* test labels

* add more tests

* Update tests.yml

* Update tests.yml

* fix

* typo

* fix es image tag

* map es ports

* try

* fix

* default port

* remove opensearch from the markers sorcery

* revert

* skip new tests in old jobs

* skip opensearch_faiss
2022-10-31 15:30:14 +01:00
Massimiliano Pippi
31fa75e9fd
feat: add support for Elasticsearch 7.16.2 (#3318)
* bump elastic to 7.16.2+

* decouple Elasticsearch and Opensearch

use method override instead of func variables

fix mypy

default value

fix broken tests

update schema

* relax version pin

* rename the base class

* rename module

* fix import order

* do not run the new tests in the old job

* remove outdated TODO
2022-10-13 11:53:27 +02:00
tstadel
b84a6b1716
fix: opensearch script score with filters (#3321)
* fix opensearch script score filters

* add comment

* add integration test

* update schema
2022-10-06 15:41:29 +02:00
Kristof Herrmann
da1cc577ae
feat: exponential backoff with exp decreasing batch size for opensearch client (#3194)
* Validate custom_mapping properly as an object

* Remove related test

* black

* feat: exponential backoff with exp dec batch size

* added docstring and split doc list

* fix

* fix mypy

* fix

* catch generic exception

* added test

* mypy ignore

* fixed no attribute

* added test

* added tests

* revert strange merge conflicts

* revert merge conflict again

* Update haystack/document_stores/elasticsearch.py

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

* done

* adjust test

* remove not required caplog

* fixed comments

Co-authored-by: ZanSara <sarazanzo94@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2022-09-13 14:30:30 +01:00
bogdankostic
e2ec0d1c15
feat: FAISS in OpenSearch: check existing index (#3101)
* Add check for mapping for existing indices

* Add test

* Check if "method" field exists
2022-08-25 17:33:26 +02:00
tstadel
92046ce5b5
feat: FAISS in OpenSearch: Support HNSW for dot product and l2 (#3029)
* support faiss hnsw

* blacken

* update docs

* improve similarity check

* add tests

* update schema

* set ef_search param correctly

* Apply suggestions from code review

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* regenerate docs

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-08-24 16:43:48 +02:00
bogdankostic
b03de53716
Use random_sample instead of ndarray for random array (#3083)
2022-08-22 13:19:45 +02:00
tstadel
668fd548a6
Fix embeddings_field_supports_similarity of OpenSearchDocumentStore when creating index (#3030)
* fix embeddings_field_supports_similarity when creating index

* fix test
2022-08-12 11:19:59 +02:00
Massimiliano Pippi
40d07c2038
Enable Opensearch unit tests in Windows CI (#2936)
* enable Opensearch unit tests under Win

* move unit tests into a dedicated job

* skip audio tests on missing dependencies

* avoid failing test collection when soundfile is not available

* Update .github/workflows/tests.yml

Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
2022-08-03 19:19:07 +02:00
Massimiliano Pippi
e7627c3f8b
Use opensearch-py in OpenSearchDocumentStore (#2691)
* add Opensearch extras

* let OpenSearchDocumentStore use opensearch-py

* Update Documentation & Code Style

* fix a bug found after adding tests

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
2022-07-28 10:04:49 +02:00
Massimiliano Pippi
374155fd5c
Move Opensearch document store in its own module (#2603)
* move OpenSearchDocumentStore into its own Python module

* Update Documentation & Code Style

* mark test with (sigh) elasticsearch

* skip opensearch tests on windows

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-08 16:37:23 +02:00