Ivana Zeljkovic
2326f2f9fe
feat: Pinecone document store optimizations ( #5902 )
* Optimize methods for deleting documents and getting vector count. Enable warning messages when Pinecone limits are exceeded on Starter index type.
* Fix typo
* Add release note
* Fix mypy errors
* Remove unused import. Fix warning logging message.
* Update release note with description about limits for Starter index type in Pinecone
* Improve code base by:
- Adding new test cases for get_embedding_count method
- Fixing get_embedding_count method
- Improving delete documents
- Fix label retrieval
- Increase default batch size
- Improve get_document_count method
* Remove unused variable
* Fix mypy issues
2023-10-16 19:26:24 +02:00
Christian Clauss
bf6d306d68
ci: Simplify Python code with ruff rules SIM ( #5833 )
* ci: Simplify Python code with ruff rules SIM
* Revert #5828
* ruff --select=I --fix haystack/modeling/infer.py
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-09-20 08:32:44 +02:00
Christian Clauss
1bc03ddc73
ci: Fix all ruff pyflakes errors except unused imports ( #5820 )
* ci: Fix all ruff pyflakes errors except unused imports
* Delete releasenotes/notes/fix-some-pyflakes-errors-69a1106efa5d0203.yaml
2023-09-15 18:30:33 +02:00
Christian Clauss
9405eb90ee
ci: Fix invalid escape sequences in Python code ( #5802 )
* ci: Use ruff in pre-commit to further limit complexity
* Fix invalid escape sequences in Python code
* Delete releasenotes/notes/ruff-4d2504d362035166.yaml
2023-09-14 16:42:48 +02:00
Ivana Zeljkovic
4bad202197
feat: Pinecone document store refactoring ( #5725 )
* Refactor codebase so that doc_type metadata is used instead of namespaces for making distinction between documents without embeddings, documents with embeddings and labels
* Fix parameter name in integration test
* Remove code under comment in add_type_metadata_filter method
* Fix mypy and pylint checks
* Add release note
* Apply minimal changes: rename method, update method docs and remove redundant method
* Mypy fixes
* Fix docstrings
* Revert helper methods for fetching documents when the number of documents exceeds Pinecone limit
* Remove unnecessary attributes in PineconeDocumentStore
* Fix unit test
---------
Co-authored-by: Ivana Zeljkovic <ivana.zeljkovic@smartcat.io>
Co-authored-by: DosticJelena <jelena.dostic@smartcat.io>
2023-09-14 11:46:47 +02:00
Vladimir Blagojevic
1066e959a2
bug: fix for pinecone not working for per document updates ( #5110 )
2023-07-03 14:07:52 +02:00
bogdankostic
43509c88bf
fix: Add support for _split_overlap meta to Pinecone and dict metadata in general to Weaviate ( #4805 )
* Add support for dicts to Weaviate
* Add support for _split_overlap to Pinecone
* Add tests
* Fix Pylint
* Fix Pylint
* Fix test
* Implement PR feedback
2023-05-05 11:20:21 +02:00
ZanSara
1b57b96210
refactor!: extract elasticsearch ( #4668 )
* extract elasticsearch
* update pyproject.toml
* make more import optional
* move MockBaseRetriever in conftest
* install es in the es integration tests
2023-04-26 10:14:20 +02:00
Massimiliano Pippi
83d615a32b
feat: include testing facilities into haystack package ( #4182 )
2023-02-17 19:38:03 +01:00
Silvano Cerza
274746db07
style: Update black ( #4101 )
* Update black version
* Format file with new black style
* Update black pre-commit hook version
2023-02-08 15:34:43 +01:00
tstadel
92c58cfda1
feat: Support multiple document_ids in Answer object (for generative QA) ( #4062 )
* initial version without shapers
* set document_ids for BaseGenerator
* introduce question-answering-with-references template
* better prompt
* make PromptTemplate control output_variable
* update schema
* fix add_doc_meta_data_to_answer
* Revert "fix add_doc_meta_data_to_answer"
This reverts commit b994db423ad8272c140ce2b785cf359d55383ff9.
* fix add_doc_meta_data_to_answer
* fix eval
* fix pylint
* fix pinecone
* fix other tests
* fix test
* fix flaky test
* Revert "fix flaky test"
This reverts commit 7ab04275ffaaaca96b4477325ba05d5f34d38775.
* adjust docstrings
* make Label loading backward-compatible
* fix Label backward compatibility for pinecone
* fix Label backward compatibility for search engines
* fix Label backward compatibility for deepset Cloud
* fix tests
* fix None issue
* fix test_write_feedback
* add tests for legacy label support
* add document_id test for pinecone
* reduce unnecessary contents
* add comment to pinecone test
2023-02-08 08:37:22 +01:00
Sebastian
71de0524de
fix: fixed InMemoryDocumentStore.get_embedding_count to return correct number ( #3980 )
* Fix the embedding count function of InMemoryDocumentStore
* Adding some doc strings explaining how many docs with embeddings to expect.
2023-01-30 12:38:30 +01:00
Ahmed Nabil
12e057837b
Adding condition to pinecone object. ( #3768 )
* Adding condition to `pinecone` object.
While you can assign any values to `PineconeDocumentStore`'s parameter `pinecone_index`, it must have another condition to prevent that from happening.
* Added test, and changed the code to make sure the pinecone idx variable has correct instance
* fixed black error
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
2023-01-19 01:34:44 +05:30
Julian Risch
0c2d13f1b8
bug: skip validating empty embeddings ( #3774 )
* skip validating empty embeddings
* skip batches without embeddings to update
* add unit test with mocked retriever
2023-01-05 15:13:57 +01:00
James Briggs
520b23ec1b
fix: pinecone metadata format ( #3660 )
* fix for multilevel metadata dictionaries
* add metadata dict formatting to update function
* typing
* added check for labels meta
* added more info to input parameters
* added test for multilayer metadata
* removed todo
2022-12-13 10:11:24 +01:00
Massimiliano Pippi
b20f808119
refactor: move more tests to the base class ( #3637 )
* move more tests to the base class
* skip tests where unsupported
* do not pass index label explicitly
* skip test for Pinecone
2022-11-29 08:43:27 +01:00
Massimiliano Pippi
057a8c0b4f
refactor: Pinecone tests ( #3555 )
* add pytest option to unmock pinecone
* first try
* handle missing answer
* fix labels metadata
* more tests
* adapt workflow
* typo
* address review comments
2022-11-14 15:19:15 +01:00
James Briggs
9b1b03002f
update to PineconeDocumentStore to remove dependency on SQL db ( #2749 )
* update to PineconeDocumentStore to remove dependency on SQL db
* Update Documentation & Code Style
* typing fixes
* Update Documentation & Code Style
* fixed embedding generator to yield Documents
* Update Documentation & Code Style
* fixes for final typing issues
* fixes for pylint
* Update Documentation & Code Style
* uncomment pinecone tests
* added new params to docstrings
* Update Documentation & Code Style
* Update Documentation & Code Style
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update Documentation & Code Style
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* changes based on comments, updated errors and install
* Update Documentation & Code Style
* mypy
* implement simple filtering in pinecone mock
* typo
* typo in reverse
* account for missing meta key in filtering
* typo
* added metadata filtering to describe index
* added handling for users switching indexes in same doc store, and handling duplicate docs in write
* syntax tweaks
* added index option to document/embedding count calls
* labels implementation in progress
* added metadata fields to be indexed for pinecone tests
* further changes to mock
* WIP implementation of labels+multilabels
* switched to rely on labels namespace rather than filter
* simpler delete_labels
* label fixes, remove debug code
* Apply docstring fixes
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* mypy
* pylint
* docs
* temporarily un-mock Pinecone
* Small Pinecone test suite
* pylint
* Add fake test key to pass the None check
* Add again fake test key to pass the None check
* Add Pinecone to default docstores and fix filters
* Fix field name
* Change field name
* Change field value
* Remove comments
* forgot to upgrade pyproject.toml
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-08-24 13:27:15 +02:00