* feat: added return_embedding to init
* feat: added return_embedding in to_dict
* feat: added return_embedding to filter_documents
* feat: added return_embedding to bm25_retrieval
* refactor: embedding_retrieval to use return_embedding attribute rather than parameter passed
* docs: added release note
* fix: pop from doc_fields instead of changing return_documents attr to none
* fix: made return_embedding an optional field and removed deprecation warning
* fix: give return_embedding a higher priority than self.return_embedding
* feat: changed default behaviour of return_embedding to True
* chore: update tests after InMemory Document store update
* Update releasenotes/notes/update-in-memory-document-store-17f555695caf9d52.yaml
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* chore: update docs
* chore: enhanced clarity and readability of expression
* test: return_embedding is set to false during initialization
* test: overriding return_embedding inside
* fix: changed the use of self.filter_documents to actual implementation inside `embedding_retrieval`
Signed-off-by: rafaeljohn9 <rafaeljohb@gmail.com>
---------
Signed-off-by: rafaeljohn9 <rafaeljohb@gmail.com>
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* Fix types in test_run.py
* Get test_run.py to pass fmt-check
* Add test_run to mypy checks
* Update test folder to pass ruff linting
* Fix merge
* Fix HF tests
* Fix hf test
* Try to fix tests
* Another attempt
* minor fix
* fix SentenceTransformersDiversityRanker
* skip integration tests because the model is unavailable on HF inference
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* reorganize docstore test suite to isolate dataframe tests
* improve docstring
* include FilterDocumentsTestWithDataframe in InMemoryDocumentStore tests
* Remove all references to old filter syntax
* More removals
* Lint
* Do not remove test_filter_retriever.py
* Add reno note
* Update ValueError text to match text in haystack-core-integrations
* incorporating better bm25 impl without breaking interface
* all three bm25 algos
* 1. setting algo post-init not allowed; 2. remove extra underscore for naming consistency; 3. remove unused import
* 1. rename attribute name for IDF computation 2. organize document statistics as a dataclass instead of tuple to improve readability
* fix score type initialization (int -> float) to pass mypy check
* release note included
* fixing linting issues and mypy
* fixing tests
* removing heapq import and cleaning up logging
* changing indexing order
* adding more tests
* increasing tests
* removing rank_bm25 from pyproject.toml
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* fix!: `InMemoryBM25Retriever` no longer returns documents that have a score of 0.0
Also update tests to accommodate the new behavior.
* Remove superfluous code
* ci: Use ruff in pre-commit to further limit complexity
* Fix invalid escape sequences in Python code
* Delete releasenotes/notes/ruff-4d2504d362035166.yaml
* Refactor codebase so that doc_type metadata is used instead of namespaces to distinguish between documents without embeddings, documents with embeddings, and labels
* Fix parameter name in integration test
* Remove code under comment in add_type_metadata_filter method
* Fix mypy and pylint checks
* Add release note
* Apply minimal changes: rename method, update method docs and remove redundant method
* Mypy fixes
* Fix docstrings
* Revert helper methods for fetching documents when the number of documents exceeds Pinecone limit
* Remove unnecessary attributes in PineconeDocumentStore
* Fix unit test
---------
Co-authored-by: Ivana Zeljkovic <ivana.zeljkovic@smartcat.io>
Co-authored-by: DosticJelena <jelena.dostic@smartcat.io>
* Add job for ES8 integration tests
* Add unit test for Elasticsearch 8
* Add tests.yml
* Adapt tests.yml
* Remove added white space
* Adapt tests.yml
* Adapt tests.yml
* Add dependencies to unit test name
* Adapt unit test matrix
* Adapt unit test matrix
* Adapt unit test matrix
* Adapt unit test matrix
* Update tests.yml
* Create separate tests where necessary
* Fix skip
* Adapt tests
* make a package
* Update haystack/document_stores/elasticsearch/es7.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* do not expose ES types from the package
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* #4653 fix changing scores by returning new document objects from document store queries
* added integration test for InMemoryDocumentStore demonstrating the desired behavior
* Update test/document_stores/test_memory.py
* Include benchmark config in output
* Use queries from aggregated labels
* Introduce batching for querying in ElasticsearchDocStore and OpenSearchDocStore
* Fix mypy
* Use self.batch_size in write_documents
* Use 10_000 as default batch size
* Add unit tests for write documents
* refactor: make the scope param configurable
the scope parameter is used when authenticating using
AuthClientPassword and AuthClientCredentials
* feat: add support for AuthClientCredentials
add support for authenticating using the OIDC Client Credentials
authentication flow
* feat: add support for AuthBearerToken
Add support for authenticating using OIDC and bearer tokens
* Update lg
* refactor how client is built
Signed-off-by: hsm207 <hsm207@users.noreply.github.com>
* unit test the auth methods
Signed-off-by: hsm207 <hsm207@users.noreply.github.com>
* Update test_weaviate.py
* revert formatting change
* Fix type hints
---------
Signed-off-by: hsm207 <hsm207@users.noreply.github.com>
Co-authored-by: John Doe <johndoe@example.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Add support for dicts to Weaviate
* Add support for _split_overlap to Pinecone
* Add tests
* Fix Pylint
* Fix Pylint
* Fix test
* Implement PR feedback