6 Commits

Author SHA1 Message Date
Malte Pietsch
216787ed34
Fix benchmarks (#648)
* disable fasttokenizer, increase ES timeout for delete requests

* add session.close()

* fix deletion of docs
2020-12-02 16:59:42 +01:00
Malte Pietsch
0acafc403a
Automate benchmarks via CML (#518)
* initial test cml

* Update cml.yaml

* WIP test workflow

* switch to general ubuntu ami

* switch to general ubuntu ami

* disable gpu for tests

* rm gpu infos

* rm gpu infos

* update token env

* switch github token

* add postgres

* test db connection

* fix typo

* remove tty

* add sleep for db

* debug runner

* debug removal postgres

* debug: reset to working commit

* debug: change github token

* switch to new bot token

* debug token

* add back postgres

* adjust network runner docker

* add elastic

* fix typo

* adjust working dir

* fix benchmark execution

* enable s3 downloads

* add query benchmark. fix path

* add saving of markdown files

* cat md files. add faiss+dpr. increase n_queries

* switch to GPU instance

* switch availability zone

* switch to public aws DL ami

* increase volume size

* rm faiss. fix error logging

* save markdown files

* add reader benchmarks

* add download of squad data

* correct reader metric normalization

* fix newlines between reports

* fix max_docs for reader eval data. remove max_docs from ci run config

* fix mypy. switch workflow trigger

* try trigger for label

* try trigger for label

* change trigger syntax

* debug machine shutdown with test workflow

* add es and postgres to test workflow

* Revert "add es and postgres to test workflow"

This reverts commit 6f038d3d7f12eea924b54529e61b192858eaa9d5.

* Revert "debug machine shutdown with test workflow"

This reverts commit db70eabae8850b88e1d61fd79b04d4f49d54990a.

* fix typo in action. set benchmark config back to original
2020-11-18 18:28:17 +01:00
brandenchan
d3743d00e9
Merge branch 'master' into automate_benchmarks
2020-10-21 17:48:10 +02:00
Malte Pietsch
11a3976945
update deletes. fix arg in run.py
2020-10-19 14:40:26 +02:00
brandenchan
6d60cc9451
add automation pipeline
2020-10-15 18:12:17 +02:00
Branden Chan
1cebcb7dda
Create time and performance benchmarks for all readers and retrievers (#339)
* add time and perf benchmark for es

* Add retriever benchmarking

* Add Reader benchmarking

* add nq to squad conversion

* add conversion stats

* clean benchmarks

* Add link to dataset

* Update imports

* add first support for neg psgs

* Refactor test

* set max_seq_len

* cleanup benchmark

* begin retriever speed benchmarking

* Add support for retriever query index benchmarking

* improve reader eval, retriever speed benchmarking

* improve retriever speed benchmarking

* Add retriever accuracy benchmark

* Add neg doc shuffling

* Add top_n

* 3x speedup of SQL. add postgres docker run. make shuffle neg a param. add more logging

* Add models to sweep

* add option for faiss index type

* remove unneeded line

* change faiss to faiss_flat

* begin automatic benchmark script

* remove existing postgres docker for benchmarking

* Add data processing scripts

* Remove shuffle in script bc data already shuffled

* switch hnsw setup from 256 to 128

* change es similarity to dot product by default

* Error includes stack trace

* Change ES default timeout

* remove delete_docs() from timing for indexing

* Add support for website export

* update website on push to benchmarks

* add complete benchmarks results

* new json format

* removed NaN as is not a valid json token

* fix benchmarking for faiss hnsw queries. do sql calls in update_embeddings() as batches

* update benchmarks for hnsw 128,20,80

* don't delete full index in delete_all_documents()

* update texts for charts

* update recall column for retriever

* change scale and add units to desc

* add units to legend

* add axis titles. update desc

* add html tags

Co-authored-by: deepset <deepset@Crenolape.localdomain>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>
2020-10-12 13:34:42 +02:00