Mirror of https://github.com/deepset-ai/haystack.git
* Add time and performance benchmark for Elasticsearch
* Add retriever benchmarking
* Add reader benchmarking
* Add NQ-to-SQuAD conversion
* Add conversion stats
* Clean up benchmarks
* Add link to dataset
* Update imports
* Add first support for negative passages
* Refactor test
* Set max_seq_len
* Clean up benchmark
* Begin retriever speed benchmarking
* Add support for retriever query index benchmarking
* Improve reader eval and retriever speed benchmarking
* Improve retriever speed benchmarking
* Add retriever accuracy benchmark
* Add negative-document shuffling
* Add top_n
* 3x speedup of SQL; add Postgres docker run; make negative-passage shuffling a parameter; add more logging
* Add models to sweep
* Add option for FAISS index type
* Remove unneeded line
* Change faiss to faiss_flat
* Begin automatic benchmark script
* Remove existing Postgres docker for benchmarking
* Add data processing scripts
* Remove shuffle in script because the data is already shuffled
* Switch HNSW setup from 256 to 128
* Change ES similarity to dot product by default
* Include stack trace in errors
* Change ES default timeout
* Remove delete_docs() from timing for indexing
* Add support for website export
* Update website on push to benchmarks
* Add complete benchmark results
* New JSON format
* Remove NaN since it is not a valid JSON token
* Fix benchmarking for FAISS HNSW queries; do SQL calls in update_embeddings() as batches
* Update benchmarks for HNSW 128,20,80
* Don't delete full index in delete_all_documents()
* Update texts for charts
* Update recall column for retriever
* Change scale and add units to description
* Add units to legend
* Add axis titles; update description
* Add HTML tags

Co-authored-by: deepset <deepset@Crenolape.localdomain>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>
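For context, the reader metrics in the table below (EM, f1, top_n_accuracy) match the keys returned by FARM-based reader evaluation in Haystack. A minimal sketch, assuming the Haystack v0.x API of that era; `data_dir` and `test_filename` are hypothetical placeholders for the benchmark dataset:

```python
from haystack.reader.farm import FARMReader

# Load one of the benchmarked readers. max_seq_len is mentioned in the
# commit message; 384 here is an assumed value, not taken from the run.
reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2",
    max_seq_len=384,
)

# Evaluate on a SQuAD-format file (paths are hypothetical placeholders).
results = reader.eval_on_file(
    data_dir="data/squad",
    test_filename="dev-v2.0.json",
    device="cpu",
)
print(results["EM"], results["f1"], results["top_n_accuracy"])
```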
| reader | EM | F1 | top_n_accuracy | top_n | reader_time (s) | seconds_per_query | passages_per_second |
|---|---|---|---|---|---|---|---|
| deepset/roberta-base-squad2 | 0.7590 | 0.8068 | 0.9671 | 5 | 133.80 | 0.0113 | 92.30 |
| deepset/minilm-uncased-squad2 | 0.7360 | 0.7823 | 0.9714 | 5 | 125.22 | 0.0106 | 98.62 |
| deepset/bert-base-cased-squad2 | 0.7008 | 0.7490 | 0.9585 | 5 | 123.59 | 0.0104 | 99.93 |
| deepset/bert-large-uncased-whole-word-masking-squad2 | 0.7822 | 0.8265 | 0.9762 | 5 | 312.42 | 0.0263 | 39.53 |
| deepset/xlm-roberta-large-squad2 | 0.8100 | 0.8526 | 0.9772 | 5 | 314.32 | 0.0265 | 39.29 |
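The two throughput columns are simple ratios over one timed evaluation run: seconds_per_query is the total reader time divided by the number of queries, and passages_per_second is the number of candidate passages processed divided by the total reader time. A small sketch of how such a timing loop could be written; `reader.predict` follows the assumed Haystack v0.x signature, and `eval_queries` is a hypothetical list of question/document pairs:

```python
import time

def time_reader(reader, eval_queries, top_n=5):
    """Time a reader over (question, candidate documents) pairs and derive
    the throughput metrics reported in the table above. Sketch only."""
    n_passages = sum(len(q["documents"]) for q in eval_queries)

    start = time.perf_counter()
    for q in eval_queries:
        reader.predict(question=q["question"], documents=q["documents"], top_k=top_n)
    reader_time = time.perf_counter() - start

    return {
        "reader_time": reader_time,
        "seconds_per_query": reader_time / len(eval_queries),
        "passages_per_second": n_passages / reader_time,
    }
```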