3803 Commits

Author SHA1 Message Date
Tanay Soni
8e52b48e1d
Add pipelines for GenerativeQA & FAQs (#645) 2020-12-03 10:27:06 +01:00
Malte Pietsch
216787ed34
Fix benchmarks (#648)
* disable fasttokenizer, increase ES timeout for delete requests

* add session.close()

* fix deletion of docs
2020-12-02 16:59:42 +01:00
Branden Chan
79555148ac
Add link to FAISS Info in documentation (#643)
* Add link to FAISS info

* Clean link
2020-12-02 15:24:22 +01:00
brandenchan
cdd009d1ef Better payload example spacing 2020-12-01 13:07:29 +01:00
Branden Chan
e573c9e27d
Improve User Feedback Documentation (#539)
* Extend docs

* Add User Feedback API calls

* Incorporate reviewer feedback
2020-12-01 12:55:31 +01:00
Malte Pietsch
a9107d29eb
Refactor DensePassageRetriever._get_predictions (#642) 2020-12-01 09:22:15 +01:00
Tanay Soni
5e62e54875
Rename question parameter to query (#614) 2020-11-30 17:50:04 +01:00
Branden Chan
5e5dba9587
Add api md (#631) 2020-11-27 17:26:53 +01:00
Branden Chan
9fbd845ef3
Clean API docs and increase coverage (#621)
* Fix docstrings

* Fix docstrings

* docstrings for retrievers and docstores

* Clean and add more docstrings
2020-11-27 17:17:58 +01:00
Tanay Soni
fa55de2fab
Add refresh_type param for Elasticsearch update_embeddings() (#630) 2020-11-27 16:10:04 +01:00
brandenchan
ce6cba227f Fix website typo 2020-11-27 16:07:29 +01:00
Markus Paff
88d0ee2c98
Add boxes for recommendations (#629)
* add boxes for recommendations

* add more recommendation boxes

Co-authored-by: brandenchan <brandenchan@icloud.com>
2020-11-27 16:00:20 +01:00
Malte Pietsch
58bc9aa7f0
Add contributor hall of fame (#628) 2020-11-26 14:52:20 +01:00
Ky-Anh Huynh
0edd127f35
Add formatting checks for shell scripts (#627) 2020-11-26 14:36:35 +01:00
Ky-Anh Huynh
4bd4a61e65
README: Fix link to roadmap (#626)
Co-authored-by: Ky-Anh Huynh <kyanh.huynh@viettug.org>
2020-11-26 14:01:05 +01:00
Tanay Soni
ea976ba5b5
Add return_embedding parameter for get_all_documents() (#615) 2020-11-26 10:32:30 +01:00
Branden Chan
09690b84b4
Move DPR embeddings from GPU to CPU straight away (#618)
* Start

* Move embeddings from gpu to cpu
2020-11-25 14:22:43 +01:00
Branden Chan
ae530c3a41
Fix docstring examples (#604)
* Fix docstring examples

* Unify code example format

* Add md files
2020-11-25 14:19:49 +01:00
Markus Paff
3dee284f20
cleaning the api docs (#616) 2020-11-24 18:49:14 +01:00
Branden Chan
e192387e65
Fix link (#613) 2020-11-24 11:11:20 +01:00
Tanay Soni
e3a68aedaf
Add support for building custom Search Pipelines (#596) 2020-11-20 17:41:08 +01:00
Guillim
65cf9547d2
Allow setting return_no_answers for TransformersReader in REST API (SQuAD 1.0 format) (#609)
* Update config.py

* new option

Allow a new option in the settings that tells whether a reader model can return a "no answer" like SQuAD 2.0 models, or whether it is only a SQuAD 1.0-like model that always gives an answer.
2020-11-20 14:09:39 +01:00
Branden Chan
1e8af84ecc
Make more changes to documentation (#578)
* First batch of changes

* Add RAG tutorial links

* Prettify RAG tutorial

* draft of generator doc

* Add text

* Complete generator page

* Create optimization section

* Split intro

* Fix formatting tutorial 7
2020-11-19 14:58:27 +01:00
Branden Chan
2aa3c071fd
Remove column in benchmark website (#608)
* Make benchmarks clearer

* remove column
2020-11-19 12:18:47 +01:00
Branden Chan
827a40b12a
Make benchmarks clearer (#606) 2020-11-19 10:31:43 +01:00
Malte Pietsch
0acafc403a
Automate benchmarks via CML (#518)
* initial test cml

* Update cml.yaml

* WIP test workflow

* switch to general ubuntu ami

* switch to general ubuntu ami

* disable gpu for tests

* rm gpu infos

* rm gpu infos

* update token env

* switch github token

* add postgres

* test db connection

* fix typo

* remove tty

* add sleep for db

* debug runner

* debug removal postgres

* debug: reset to working commit

* debug: change github token

* switch to new bot token

* debug token

* add back postgres

* adjust network runner docker

* add elastic

* fix typo

* adjust working dir

* fix benchmark execution

* enable s3 downloads

* add query benchmark. fix path

* add saving of markdown files

* cat md files. add faiss+dpr. increase n_queries

* switch to GPU instance

* switch availability zone

* switch to public aws DL ami

* increase volume size

* rm faiss. fix error logging

* save markdown files

* add reader benchmarks

* add download of squad data

* correct reader metric normalization

* fix newlines between reports

* fix max_docs for reader eval data. remove max_docs from ci run config

* fix mypy. switch workflow trigger

* try trigger for label

* try trigger for label

* change trigger syntax

* debug machine shutdown with test workflow

* add es and postgres to test workflow

* Revert "add es and postgres to test workflow"

This reverts commit 6f038d3d7f12eea924b54529e61b192858eaa9d5.

* Revert "debug machine shutdown with test workflow"

This reverts commit db70eabae8850b88e1d61fd79b04d4f49d54990a.

* fix typo in action. set benchmark config back to original
2020-11-18 18:28:17 +01:00
Lalit Pagaria
3f81c93f36
Add document update for SQL and FAISS Document Store (#584) 2020-11-16 16:08:13 +01:00
Tanay Soni
3e095ddd7d
Add filters for delete_all_documents() (#591) 2020-11-16 14:15:32 +01:00
Lalit Pagaria
b511f9903e
[RAG] Fix top_k generator issue (#590)
* Removing device information from generator model arguments as it is handled by itself.

* num_return_sequences should not be greater than num_beams

* Raise error when user uses generator with GPU, as it is currently not supported
2020-11-16 09:41:30 +01:00
Lalit Pagaria
23f1058b90
Fixing defaults in config for rest_api (#583)
* Fixing defaults configs for rest_apis

* Reverting change to VALID_LANGUAGES

* Casting EMBEDDING_DIM as int
2020-11-16 06:51:27 +01:00
bogdankostic
b3f7115f71
Add MAP retriever metric for open-domain case (#572)
* Add MAP metric for closed-domain case

* Add MAP metric for open-domain case

* Adapt MAP for closed-domain setting + add docstring

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-11-13 15:04:25 +01:00
Timo Moeller
f118e4b738
Add needed whitespace before sentence start (#582) 2020-11-13 14:14:24 +01:00
Branden Chan
44230fca45
Fix CI bug due to new Elasticsearch release and new model release (#579)
* Cast generator to list

* Restrict ES version range

* Loosen ES requirement

* Change no_answer_test value
2020-11-13 10:35:53 +01:00
brandenchan
090a8cf3e9 Revert "First batch of changes"
This reverts commit c07182aa0ab77106cdb142f4ca43ff02476e6fbf.
2020-11-12 12:27:16 +01:00
brandenchan
c07182aa0a First batch of changes 2020-11-12 12:07:02 +01:00
Branden Chan
e72f4f4299
Update Colab Torch Version (#576)
* Update torch version

* Update torch version
2020-11-11 13:55:10 +01:00
Tanay Soni
acd088808b
Allow list of filter values in REST API (#568) 2020-11-09 20:41:53 +01:00
Malte Pietsch
2b352d6ac4
Update concept image 2020-11-07 08:44:09 +01:00
Malte Pietsch
ea0fd405d8 add concept sketch 2020-11-07 08:42:01 +01:00
Markus Paff
4cca3b5290
New docs version v0.5.0 (#560) 2020-11-06 13:17:04 +01:00
Branden Chan
99e924aede
Update Documentation for Haystack 0.5.0 (#557)
* Add languages and preprocessing pages

* add content

* address review comments

* make link relative

* update api ref with latest docstrings

* move doc readme and update

* add generator API docs

* fix example code

* design and link fix

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>
v0.5.0
2020-11-06 10:53:22 +01:00
Malte Pietsch
f94603cbe4
Bump haystack version (#559) 2020-11-06 09:53:47 +01:00
Tanay Soni
d744dc109c
Add support for MySQL database (#556) 2020-11-05 17:39:39 +01:00
Markus Paff
40c5c8edb4
Added new formatting for examples in docstrings (#555) 2020-11-05 15:50:08 +01:00
Tanay Soni
727767388a
Allow configuration for Elasticsearch Analyzer (#554) 2020-11-05 13:59:53 +01:00
bogdankostic
ffaa0249f7
Fix retriever evaluation metrics (#547)
* Add mean reciprocal rank and fix mean average precision

* Add mrr metric to docstring

* Fix mypy error
2020-11-05 13:34:47 +01:00
bogdankostic
53be92c155
Add save and load method for DPR (#550)
* Add save and load method for DPR

* lower memory footprint for test. change names to load() and save()

* add test cases

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-11-05 13:29:23 +01:00
Malte Pietsch
46530e86f8
Fix sentencepiece dependency in dockerfiles (#553) 2020-11-05 12:01:27 +01:00
Guillim
531d6a1c6e
Fix typo in dense.py (#545)
typo
2020-11-04 10:25:13 +01:00
Malte Pietsch
46fac41b54
Allow configuration of log level in REST API via ENV (#541)
* configure log level via env. adjust debug messages

* pin faiss version
2020-11-04 09:54:02 +01:00