7 Commits

Author SHA1 Message Date
Lalit Pagaria
2e9f3c1512
Fix update_embeddings function in FAISSDocumentStore and add retriever fixture in tests (#481)
* 1. Prevent update_embeddings in FAISSDocumentStore from setting faiss_index to None when the document store has no docs.

2. Clean up tests by adding a fixture for the retriever.

* TfidfRetriever needs a document store with documents during initialization because it calls fit() in its constructor; fix this by checking whether self.paragraphs is None

* Fix naming of retriever's fixture (embedded to embedding and tfid to tfidf)
2020-10-14 16:15:04 +02:00
Malte Pietsch
8edeb844f7
Remove phi normalization from FAISS, support more index types, 3x speedup (#467)
* remove phi normalization

* add special case for hnsw

* rename vector_size to vector_dim

* fix loading. fix extra dim in tests

* switch to new ES syntax for vector similarity

* 3x sql speed up. cascade deletes. add train_index()

* add docstrings. remove vector_dim from load()

* delete docs from faiss and sql

* fix delete of docs in test

* relax type hint for faiss index

* rename metric to metric_type

Co-authored-by: lalitpagaria <19303690+lalitpagaria@users.noreply.github.com>
2020-10-06 16:09:56 +02:00
Lalit Pagaria
465ccbc12e
Allow multiple write calls to existing FAISS index. (#422)
- Fix an issue where update_embeddings always created a new FAISS index instead of clearing the existing one. Creating a new index may not free the memory used by the old one and can cause a memory leak.

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-10-05 12:01:20 +02:00
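
The idea behind #422 (clear the existing index in place instead of replacing it) can be sketched roughly as follows. This is not the code from the commit: ToyIndex and ToyStore are hypothetical stand-ins written for illustration, and only the reset() method mirrors something real FAISS indexes also expose.

```python
class ToyIndex:
    """Hypothetical stand-in for a FAISS index; it only stores vectors in a list."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = []

    @property
    def ntotal(self):
        # Number of vectors currently stored, like the FAISS attribute of the same name.
        return len(self.vectors)

    def add(self, batch):
        for vector in batch:
            if len(vector) != self.dim:
                raise ValueError("vector has the wrong dimension")
        self.vectors.extend(batch)

    def reset(self):
        # Remove all stored vectors but keep the index object itself.
        self.vectors.clear()


class ToyStore:
    """Hypothetical document store that owns exactly one index object."""

    def __init__(self, dim):
        self.index = ToyIndex(dim)

    def update_embeddings(self, embeddings):
        # Clear the existing index in place rather than rebinding
        # self.index to a freshly created object.
        self.index.reset()
        self.index.add(embeddings)


store = ToyStore(dim=2)
index_before = store.index
store.update_embeddings([[0.1, 0.2], [0.3, 0.4]])
store.update_embeddings([[0.5, 0.6]])
```

After both calls the store still holds the same index object, and it contains only the vectors from the most recent update.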
Malte Pietsch
db6864d159
Fix type casting for vectors in FAISS (#399)
* Fix type casting for vectors in FAISS

Co-authored-by: philipp-bode <philipp.bode@student.hpi.de>

* add type casts for elastic. refactor embedding retriever tests

* fix case: empty embedding field

* fix faiss tolerance

* add assert in test_faiss_retrieving

Co-authored-by: philipp-bode <philipp.bode@student.hpi.de>
2020-09-18 17:08:13 +02:00
Malte Pietsch
d69133966d
Fix faiss test tolerance
2020-09-18 13:57:29 +02:00
Malte Pietsch
4c503158a7
Fix duplicate vector ids in FAISS (#395)
* fix duplicate vector ids in faiss

* Add test

Co-authored-by: lalitpagaria <19303690+lalitpagaria@users.noreply.github.com>

* revert score change

* switch to faiss_index.ntotal for ids. add tests

Co-authored-by: lalitpagaria <19303690+lalitpagaria@users.noreply.github.com>
2020-09-18 12:52:22 +02:00
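
A minimal sketch of why deriving ids from the index size avoids duplicates across write calls, as #395 describes with faiss_index.ntotal. CountingIndex and write_batch are hypothetical names introduced for illustration and are not taken from the commit; only the ntotal attribute corresponds to the real FAISS index API.

```python
class CountingIndex:
    """Hypothetical stand-in for a FAISS index; it only tracks stored vectors."""

    def __init__(self):
        self.vectors = []

    @property
    def ntotal(self):
        # Number of vectors already in the index.
        return len(self.vectors)

    def add(self, batch):
        self.vectors.extend(batch)


def write_batch(index, docs):
    """Assign each document a vector id offset by the current index size.

    `docs` is a list of (name, vector) pairs. Starting at index.ntotal
    means a later batch continues where the previous one stopped instead
    of starting again at 0.
    """
    start = index.ntotal
    vector_ids = {}
    for offset, (name, _vector) in enumerate(docs):
        vector_ids[name] = start + offset
    index.add([vector for _name, vector in docs])
    return vector_ids


index = CountingIndex()
first_ids = write_batch(index, [("a", [0.1, 0.2]), ("b", [0.3, 0.4])])
second_ids = write_batch(index, [("c", [0.5, 0.6])])
```

If each batch instead numbered its documents from 0, document "c" would receive the same vector id as document "a".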
Tanay Soni
9d0df60aad
Add FAISS Document Store (#253)
2020-08-07 14:25:08 +02:00