240 Commits

Author SHA1 Message Date
Tanay Soni
01ff66dfd6 Remove redundant test fixture 2020-08-17 14:19:38 +02:00
Dany
403318b1f5 Add Tika Converter (#314) 2020-08-17 11:21:09 +02:00
Tanay Soni
1637ce1184 Revert "Add Tika Converter (#314)"
This reverts commit 5ef59b1901da6d51bfa085683321a243228d4fc9.
2020-08-17 11:13:52 +02:00
Tanay Soni
5ef59b1901 Add Tika Converter (#314) 2020-08-14 14:13:59 +02:00
Tanay Soni
089fecf99e Fix indexing of metadata for FAISS/SQL Document Store (#310) 2020-08-13 12:25:32 +02:00
bogdankostic
5186d2d235 Batch prediction in evaluation (#137)
* Add Batch evaluation

* Separate evaluation methods

* Clean calculation of eval metrics

* Adapt eval to Label objects

* Fix format of no_answer

* Adapt to MultiLabel

* Add tests
2020-08-10 19:30:31 +02:00
Karim Jana
c7078a36c0 Custom fields for indexing in ElasticsearchDocumentStore (#297) 2020-08-10 11:34:39 +02:00
Tanay Soni
9d0df60aad Add FAISS Document Store (#253) 2020-08-07 14:25:08 +02:00
Timo Moeller
d9e8b522a1 Add "no answer" aggregation to Transformersreader (#259)
* Add no answer aggregation

* Change to covariant type annotation

* Remove n_best_per_passage from transformersreader
2020-08-06 17:32:55 +02:00
Tanay Soni
5937f9cf16 Deprecate Tags for Document Stores (#286) 2020-08-04 14:24:12 +02:00
Tanay Soni
723921475f Make document ids of str type (#284) 2020-08-03 16:20:17 +02:00
Tanay Soni
d90435efd6 Add wait for Elasticsearch update call 2020-07-31 12:06:27 +02:00
Malte Pietsch
29a15c0d59 Add eval for Dense Passage Retriever & Refactor handling of labels/feedback (#243) 2020-07-31 11:34:06 +02:00
Tanay Soni
5210c8c2ab Add method to update meta fields for documents in Elasticsearch (#242) 2020-07-16 15:34:55 +02:00
Malte Pietsch
6bed2f509f Refactor DPR for latest transformers version & change init arg gpu -> use_gpu for DPR and EmbeddingRetriever (#239)
* fix tokenizer warning in latest transformers

* change dpr arg from gpu to use_gpu

* change gpu arg for EmbeddingRetriever
2020-07-16 10:45:01 +02:00
Tanay Soni
5c1a5fe61d Add dummy retriever for benchmarking / reader-only settings (#235) 2020-07-15 17:22:17 +02:00
Tanay Soni
912e98cd40 Fix id for documents returned by the TfidfRetriever (#232) 2020-07-15 14:55:07 +02:00
Malte Pietsch
99a6a34047 Upgrade to new FARM / Transformers / PyTorch versions (#212) 2020-07-14 18:53:15 +02:00
Anirban Saha
6b217732f5 Add basic support for Docx Files (#225) 2020-07-14 12:28:19 +02:00
Tanay Soni
b886e054a3 Move document_name attribute to meta (#217) 2020-07-14 09:53:31 +02:00
Malte Pietsch
d2b26a99ff Add more tests (#213) 2020-07-10 10:54:56 +02:00
Malte Pietsch
07ecfb60b9 Dense Passage Retriever (Inference) (#167) 2020-06-30 19:05:45 +02:00
Tanay Soni
ec433a5ed6 Move out REST API from PyPI package (#160) 2020-06-22 12:07:12 +02:00
Tanay Soni
a349eef0db Add API endpoint to upload files (#154) 2020-06-17 16:28:26 +02:00
Tanay Soni
180dc8cbd6 Start Elasticsearch with a Github Action (#142) 2020-06-09 12:46:15 +02:00
Tanay Soni
160345f3d5 Update build workflow 2020-06-09 11:45:25 +02:00
Tanay Soni
ef9e4f4467 Add PDF text extraction (#109) 2020-06-08 11:07:19 +02:00
Stan Kirdey
ca6778d934 Add metadata for TF-IDF Retriever (#122) 2020-05-28 10:55:28 +02:00
Stan Kirdey
bf8e506c45 Add embedding query for InMemoryDocumentStore 2020-05-18 14:47:41 +02:00
Stan Kirdey
72a3b70d7a Add filtering by tags for InMemoryDocumentStore (#108) 2020-05-14 22:12:25 +02:00
Tanay Soni
37e0ff70f7 Add test for Elasticsearch document store (#88) 2020-05-04 18:00:07 +02:00
Stan Kirdey
54d32d4f1f Add coverage reports and more tests (#78) 2020-04-28 16:10:32 +02:00
Stan Kirdey
6038d40a53 Add InMemoryDocumentStore (#76) 2020-04-27 21:54:12 +02:00
Tanay Soni
8e736cefa0 Simplify Retriever query (#73) 2020-04-27 12:19:59 +02:00
Tanay Soni
f83a164095 Add Elasticsearch Document Store (#13) 2020-01-24 18:24:07 +01:00
Malte Pietsch
3ccd42f981 fix test 2020-01-23 15:25:42 +01:00
Malte Pietsch
8a48cd7dd6 fix test 2020-01-23 09:18:15 +01:00
Tanay Soni
845062ce2d Fix tests 2020-01-22 16:08:52 +01:00
Tanay Soni
d2c77f3077 Fix test 2019-11-27 19:34:10 +01:00
Malte Pietsch
7400abe327 add test 2019-11-27 17:53:42 +01:00