Replace dpr with embeddingretriever tut5 (#2274)

* ipynb: EmbeddingRetriever made more prominent than DPR

* ipynb: EmbeddingRetriever more prominent than DPR

* Update Documentation & Code Style

* indentation fix

* Update Documentation & Code Style

* py: EmbeddingRetriever more prominent than DPR

* indentation fix

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
mkkuemmel 2022-03-04 11:29:48 +01:00 committed by GitHub
parent 256450370d
commit 5951fc463e
3 changed files with 54 additions and 392 deletions

View File

@ -131,24 +131,25 @@ from haystack.nodes import ElasticsearchRetriever
retriever = ElasticsearchRetriever(document_store=document_store)
# Alternative: Evaluate dense retrievers (DensePassageRetriever or EmbeddingRetriever)
# DensePassageRetriever uses two separate transformer based encoders for query and document.
# In contrast, EmbeddingRetriever uses a single encoder for both.
# Alternative: Evaluate dense retrievers (EmbeddingRetriever or DensePassageRetriever)
# The EmbeddingRetriever uses a single transformer-based encoder model for both query and document.
# In contrast, DensePassageRetriever uses two separate encoders, one for queries and one for documents.
# Please make sure the "embedding_dim" parameter in the DocumentStore above matches the output dimension of your models!
# Please also make sure that the PreProcessor splits your files into chunks that fit completely within
# the max_seq_len limit of the Transformer models
# The SentenceTransformer model "all-mpnet-base-v2" generally works well with the EmbeddingRetriever on any kind of English text.
# For more information check out the documentation at: https://www.sbert.net/docs/pretrained_models.html
# The SentenceTransformer model "sentence-transformers/multi-qa-mpnet-base-dot-v1" generally works well with the EmbeddingRetriever on any kind of English text.
# For more information and suggestions on different models check out the documentation at: https://www.sbert.net/docs/pretrained_models.html
# from haystack.retriever import DensePassageRetriever, EmbeddingRetriever
# from haystack.nodes import EmbeddingRetriever, DensePassageRetriever
# retriever = EmbeddingRetriever(document_store=document_store, model_format="sentence_transformers",
# embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1")
# retriever = DensePassageRetriever(document_store=document_store,
# query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
# passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
# use_gpu=True,
# max_seq_len_passage=256,
# embed_title=True)
# retriever = EmbeddingRetriever(document_store=document_store, model_format="sentence_transformers",
# embedding_model="all-mpnet-base-v2")
# document_store.update_embeddings(retriever, index=doc_index)
```
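
The two precautions in the comments above (matching `embedding_dim` and keeping chunks within `max_seq_len`) can be illustrated with a small, library-free sketch. This is not Haystack's `PreProcessor` or DocumentStore logic; the helper names `split_into_word_chunks` and `check_embedding_dim` are hypothetical, and word counts are only a rough proxy for token counts, since transformer tokenizers usually produce more subword tokens than words.

```python
from typing import List


def split_into_word_chunks(text: str, max_words: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `max_words` whitespace-separated words.

    Hypothetical helper for illustration only. Consecutive chunks share
    `overlap` words so that content cut at a boundary still appears whole
    in one of the chunks.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current chunk already reaches the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def check_embedding_dim(store_dim: int, model_dim: int) -> None:
    """Raise early if the store's configured dimension differs from the model's.

    Hypothetical helper: both values are assumed to be looked up by the
    caller (store configuration and model card, respectively).
    """
    if store_dim != model_dim:
        raise ValueError(
            f"DocumentStore expects embedding_dim={store_dim}, "
            f"but the model outputs {model_dim}-dimensional vectors"
        )


if __name__ == "__main__":
    # Example values are assumptions; verify the dimension on your model's card.
    check_embedding_dim(store_dim=768, model_dim=768)
    print(split_into_word_chunks("one two three four five", max_words=3, overlap=1))
```

Choosing `max_words` well below the model's `max_seq_len` leaves headroom for subword tokenization; the exact margin depends on the tokenizer and the text.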

View File

@ -2,13 +2,6 @@
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"id": "MGSXn0USOhtu",
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Evaluation of a Pipeline and its Components\n",
"\n",
@ -16,12 +9,18 @@
"\n",
"To be able to make a statement about the quality of results a question-answering pipeline or any other pipeline in haystack produces, it is important to evaluate it. Furthermore, evaluation allows determining which components of the pipeline can be improved.\n",
"The results of the evaluation can be saved as CSV files, which contain all the information to calculate additional metrics later on or inspect individual predictions."
]
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "lEKOjCS5U7so"
"id": "lEKOjCS5U7so",
"pycharm": {
"is_executing": true
}
},
"source": [
"### Prepare environment\n",
@ -31,7 +30,8 @@
"**Runtime -> Change Runtime type -> Hardware accelerator -> GPU**\n",
"\n",
"<img src=\"https://raw.githubusercontent.com/deepset-ai/haystack/master/docs/_src/img/colab_gpu_runtime.jpg\">"
]
],
"outputs": []
},
{
"cell_type": "code",
@ -248,24 +248,25 @@
"\n",
"retriever = ElasticsearchRetriever(document_store=document_store)\n",
"\n",
"# Alternative: Evaluate dense retrievers (DensePassageRetriever or EmbeddingRetriever)\n",
"# DensePassageRetriever uses two separate transformer based encoders for query and document.\n",
"# In contrast, EmbeddingRetriever uses a single encoder for both.\n",
"# Alternative: Evaluate dense retrievers (EmbeddingRetriever or DensePassageRetriever)\n",
"# The EmbeddingRetriever uses a single transformer based encoder model for query and document.\n",
"# In contrast, DensePassageRetriever uses two separate encoders for both.\n",
"\n",
"# Please make sure the \"embedding_dim\" parameter in the DocumentStore above matches the output dimension of your models!\n",
"# Please also take care that the PreProcessor splits your files into chunks that can be completely converted with\n",
"# the max_seq_len limitations of Transformers\n",
"# The SentenceTransformer model \"all-mpnet-base-v2\" generally works well with the EmbeddingRetriever on any kind of English text.\n",
"# For more information check out the documentation at: https://www.sbert.net/docs/pretrained_models.html\n",
"# The SentenceTransformer model \"sentence-transformers/multi-qa-mpnet-base-dot-v1\" generally works well with the EmbeddingRetriever on any kind of English text.\n",
"# For more information and suggestions on different models check out the documentation at: https://www.sbert.net/docs/pretrained_models.html\n",
"\n",
"# from haystack.retriever import DensePassageRetriever, EmbeddingRetriever\n",
"# from haystack.retriever import EmbeddingRetriever, DensePassageRetriever\n",
"# retriever = EmbeddingRetriever(document_store=document_store, model_format=\"sentence_transformers\",\n",
"# embedding_model=\"sentence-transformers/multi-qa-mpnet-base-dot-v1\")\n",
"# retriever = DensePassageRetriever(document_store=document_store,\n",
"# query_embedding_model=\"facebook/dpr-question_encoder-single-nq-base\",\n",
"# passage_embedding_model=\"facebook/dpr-ctx_encoder-single-nq-base\",\n",
"# use_gpu=True,\n",
"# max_seq_len_passage=256,\n",
"# embed_title=True)\n",
"# retriever = EmbeddingRetriever(document_store=document_store, model_format=\"sentence_transformers\",\n",
"# embedding_model=\"all-mpnet-base-v2\")\n",
"# document_store.update_embeddings(retriever, index=doc_index)"
]
},
@ -435,7 +436,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 53,
"metadata": {
"pycharm": {
"name": "#%%\n"
@ -444,171 +445,10 @@
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>query</th>\n",
" <th>gold_document_contents</th>\n",
" <th>content</th>\n",
" <th>gold_id_match</th>\n",
" <th>answer_match</th>\n",
" <th>gold_id_or_answer_match</th>\n",
" <th>rank</th>\n",
" <th>document_id</th>\n",
" <th>gold_document_ids</th>\n",
" <th>type</th>\n",
" <th>node</th>\n",
" <th>eval_mode</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n",
" <td>people considered righteous before God. God has such a book, and to be blott...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>1b090aec7dbd1af6739c4c80f8995877-1</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>document</td>\n",
" <td>Retriever</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n",
" <td>as adversaries (of God). Also, according to ib. xxxvi. 10, one who contrives...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>1b090aec7dbd1af6739c4c80f8995877-2</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>document</td>\n",
" <td>Retriever</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n",
" <td>the citizens' registers. The life which the righteous participate in is to b...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>1b090aec7dbd1af6739c4c80f8995877-6</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>document</td>\n",
" <td>Retriever</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n",
" <td>apostles' names are ``written in heaven'' (Luke x. 20), or ``the fellow-work...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>4.0</td>\n",
" <td>1b090aec7dbd1af6739c4c80f8995877-3</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>document</td>\n",
" <td>Retriever</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n",
" <td>The Absolutely True Diary of a Part-Time Indian - wikipedia The Absolutely T...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" <td>e9260cbbc129f4246ee8fcfbbe385822-0</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>document</td>\n",
" <td>Retriever</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" query \\\n",
"0 who is written in the book of life \n",
"1 who is written in the book of life \n",
"2 who is written in the book of life \n",
"3 who is written in the book of life \n",
"4 who is written in the book of life \n",
"\n",
" gold_document_contents \\\n",
"0 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n",
"1 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n",
"2 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n",
"3 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n",
"4 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n",
"\n",
" content \\\n",
"0 people considered righteous before God. God has such a book, and to be blott... \n",
"1 as adversaries (of God). Also, according to ib. xxxvi. 10, one who contrives... \n",
"2 the citizens' registers. The life which the righteous participate in is to b... \n",
"3 apostles' names are ``written in heaven'' (Luke x. 20), or ``the fellow-work... \n",
"4 The Absolutely True Diary of a Part-Time Indian - wikipedia The Absolutely T... \n",
"\n",
" gold_id_match answer_match gold_id_or_answer_match rank \\\n",
"0 0.0 0.0 0.0 1.0 \n",
"1 0.0 0.0 0.0 2.0 \n",
"2 0.0 0.0 0.0 3.0 \n",
"3 0.0 0.0 0.0 4.0 \n",
"4 0.0 0.0 0.0 5.0 \n",
"\n",
" document_id \\\n",
"0 1b090aec7dbd1af6739c4c80f8995877-1 \n",
"1 1b090aec7dbd1af6739c4c80f8995877-2 \n",
"2 1b090aec7dbd1af6739c4c80f8995877-6 \n",
"3 1b090aec7dbd1af6739c4c80f8995877-3 \n",
"4 e9260cbbc129f4246ee8fcfbbe385822-0 \n",
"\n",
" gold_document_ids \\\n",
"0 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"1 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"2 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"3 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"4 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"\n",
" type node eval_mode \n",
"0 document Retriever integrated \n",
"1 document Retriever integrated \n",
"2 document Retriever integrated \n",
"3 document Retriever integrated \n",
"4 document Retriever integrated "
]
"text/plain": " multilabel_id query filters \\\n0 1886992123615626403 who is written in the book of life b'null' \n1 1886992123615626403 who is written in the book of life b'null' \n2 1886992123615626403 who is written in the book of life b'null' \n3 1886992123615626403 who is written in the book of life b'null' \n4 1886992123615626403 who is written in the book of life b'null' \n\n gold_document_contents \\\n0 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n1 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n2 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n3 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n4 [Book of Life - wikipedia Book of Life Jump to: navigation, search This arti... \n\n content \\\n0 people considered righteous before God. God has such a book, and to be blott... \n1 as adversaries (of God). Also, according to ib. xxxvi. 10, one who contrives... \n2 the citizens' registers. The life which the righteous participate in is to b... \n3 apostles' names are ``written in heaven'' (Luke x. 20), or ``the fellow-work... \n4 The Absolutely True Diary of a Part-Time Indian - wikipedia The Absolutely T... 
\n\n gold_id_match answer_match gold_id_or_answer_match rank \\\n0 0.0 0.0 0.0 1.0 \n1 0.0 0.0 0.0 2.0 \n2 0.0 0.0 0.0 3.0 \n3 0.0 0.0 0.0 4.0 \n4 0.0 0.0 0.0 5.0 \n\n document_id \\\n0 1b090aec7dbd1af6739c4c80f8995877-1 \n1 1b090aec7dbd1af6739c4c80f8995877-2 \n2 1b090aec7dbd1af6739c4c80f8995877-6 \n3 1b090aec7dbd1af6739c4c80f8995877-3 \n4 e9260cbbc129f4246ee8fcfbbe385822-0 \n\n gold_document_ids \\\n0 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n1 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n2 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n3 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n4 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n\n type node eval_mode \n0 document Retriever integrated \n1 document Retriever integrated \n2 document Retriever integrated \n3 document Retriever integrated \n4 document Retriever integrated ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>multilabel_id</th>\n <th>query</th>\n <th>filters</th>\n <th>gold_document_contents</th>\n <th>content</th>\n <th>gold_id_match</th>\n <th>answer_match</th>\n <th>gold_id_or_answer_match</th>\n <th>rank</th>\n <th>document_id</th>\n <th>gold_document_ids</th>\n <th>type</th>\n <th>node</th>\n <th>eval_mode</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n <td>people considered righteous before God. God has such a book, and to be blott...</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>1.0</td>\n <td>1b090aec7dbd1af6739c4c80f8995877-1</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>document</td>\n <td>Retriever</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n <td>as adversaries (of God). Also, according to ib. xxxvi. 
10, one who contrives...</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>2.0</td>\n <td>1b090aec7dbd1af6739c4c80f8995877-2</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>document</td>\n <td>Retriever</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n <td>the citizens' registers. The life which the righteous participate in is to b...</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>3.0</td>\n <td>1b090aec7dbd1af6739c4c80f8995877-6</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>document</td>\n <td>Retriever</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n <td>apostles' names are ``written in heaven'' (Luke x. 20), or ``the fellow-work...</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>4.0</td>\n <td>1b090aec7dbd1af6739c4c80f8995877-3</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>document</td>\n <td>Retriever</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>4</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[Book of Life - wikipedia Book of Life Jump to: navigation, search This arti...</td>\n <td>The Absolutely True Diary of a Part-Time Indian - wikipedia The Absolutely T...</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>0.0</td>\n <td>5.0</td>\n <td>e9260cbbc129f4246ee8fcfbbe385822-0</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>document</td>\n <td>Retriever</td>\n <td>integrated</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 8,
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
@ -623,7 +463,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 54,
"metadata": {
"pycharm": {
"name": "#%%\n"
@ -632,197 +472,10 @@
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>query</th>\n",
" <th>gold_answers</th>\n",
" <th>answer</th>\n",
" <th>context</th>\n",
" <th>exact_match</th>\n",
" <th>f1</th>\n",
" <th>rank</th>\n",
" <th>document_id</th>\n",
" <th>gold_document_ids</th>\n",
" <th>offsets_in_document</th>\n",
" <th>gold_offsets_in_documents</th>\n",
" <th>type</th>\n",
" <th>node</th>\n",
" <th>eval_mode</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>1.0</td>\n",
" <td>None</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>[{'start': 0, 'end': 0}]</td>\n",
" <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n",
" <td>answer</td>\n",
" <td>Reader</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n",
" <td>those whose names are written in the Book of Life from the foundation of the...</td>\n",
" <td>ohn of Patmos. As described, only those whose names are written in the Book ...</td>\n",
" <td>0.0</td>\n",
" <td>0.083333</td>\n",
" <td>2.0</td>\n",
" <td>1b090aec7dbd1af6739c4c80f8995877-3</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>[{'start': 576, 'end': 658}]</td>\n",
" <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n",
" <td>answer</td>\n",
" <td>Reader</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n",
" <td>only the names of the righteous</td>\n",
" <td>. The Psalmist likewise speaks of the Book of Life in which only the names o...</td>\n",
" <td>0.0</td>\n",
" <td>0.200000</td>\n",
" <td>3.0</td>\n",
" <td>1b090aec7dbd1af6739c4c80f8995877-1</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>[{'start': 498, 'end': 529}]</td>\n",
" <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n",
" <td>answer</td>\n",
" <td>Reader</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>who is written in the book of life</td>\n",
" <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n",
" <td>those who are found written in the book and who shall escape the troubles pr...</td>\n",
" <td>those who are found written in the book and who shall escape the troubles pr...</td>\n",
" <td>0.0</td>\n",
" <td>0.111111</td>\n",
" <td>4.0</td>\n",
" <td>1b090aec7dbd1af6739c4c80f8995877-6</td>\n",
" <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n",
" <td>[{'start': 135, 'end': 305}]</td>\n",
" <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n",
" <td>answer</td>\n",
" <td>Reader</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>who was the girl in the video brenda got a baby</td>\n",
" <td>[Ethel ``Edy'' Proctor]</td>\n",
" <td>her cousin</td>\n",
" <td>ng a story in the newspaper of a 12-year-old girl getting pregnant by her co...</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>1.0</td>\n",
" <td>965a125f65658579529b39f8e4344969-3</td>\n",
" <td>[965a125f65658579529b39f8e4344969-3]</td>\n",
" <td>[{'start': 423, 'end': 433}]</td>\n",
" <td>[{'start': 181, 'end': 202}]</td>\n",
" <td>answer</td>\n",
" <td>Reader</td>\n",
" <td>integrated</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" query \\\n",
"0 who is written in the book of life \n",
"1 who is written in the book of life \n",
"2 who is written in the book of life \n",
"3 who is written in the book of life \n",
"0 who was the girl in the video brenda got a baby \n",
"\n",
" gold_answers \\\n",
"0 [every person who is destined for Heaven or the World to Come, all people co... \n",
"1 [every person who is destined for Heaven or the World to Come, all people co... \n",
"2 [every person who is destined for Heaven or the World to Come, all people co... \n",
"3 [every person who is destined for Heaven or the World to Come, all people co... \n",
"0 [Ethel ``Edy'' Proctor] \n",
"\n",
" answer \\\n",
"0 \n",
"1 those whose names are written in the Book of Life from the foundation of the... \n",
"2 only the names of the righteous \n",
"3 those who are found written in the book and who shall escape the troubles pr... \n",
"0 her cousin \n",
"\n",
" context \\\n",
"0 None \n",
"1 ohn of Patmos. As described, only those whose names are written in the Book ... \n",
"2 . The Psalmist likewise speaks of the Book of Life in which only the names o... \n",
"3 those who are found written in the book and who shall escape the troubles pr... \n",
"0 ng a story in the newspaper of a 12-year-old girl getting pregnant by her co... \n",
"\n",
" exact_match f1 rank document_id \\\n",
"0 0.0 0.000000 1.0 None \n",
"1 0.0 0.083333 2.0 1b090aec7dbd1af6739c4c80f8995877-3 \n",
"2 0.0 0.200000 3.0 1b090aec7dbd1af6739c4c80f8995877-1 \n",
"3 0.0 0.111111 4.0 1b090aec7dbd1af6739c4c80f8995877-6 \n",
"0 0.0 0.000000 1.0 965a125f65658579529b39f8e4344969-3 \n",
"\n",
" gold_document_ids \\\n",
"0 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"1 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"2 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"3 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n",
"0 [965a125f65658579529b39f8e4344969-3] \n",
"\n",
" offsets_in_document \\\n",
"0 [{'start': 0, 'end': 0}] \n",
"1 [{'start': 576, 'end': 658}] \n",
"2 [{'start': 498, 'end': 529}] \n",
"3 [{'start': 135, 'end': 305}] \n",
"0 [{'start': 423, 'end': 433}] \n",
"\n",
" gold_offsets_in_documents type node \\\n",
"0 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n",
"1 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n",
"2 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n",
"3 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n",
"0 [{'start': 181, 'end': 202}] answer Reader \n",
"\n",
" eval_mode \n",
"0 integrated \n",
"1 integrated \n",
"2 integrated \n",
"3 integrated \n",
"0 integrated "
]
"text/plain": " multilabel_id query \\\n0 1886992123615626403 who is written in the book of life \n1 1886992123615626403 who is written in the book of life \n2 1886992123615626403 who is written in the book of life \n3 1886992123615626403 who is written in the book of life \n0 -3790070251458150675 who was the girl in the video brenda got a baby \n\n filters \\\n0 b'null' \n1 b'null' \n2 b'null' \n3 b'null' \n0 b'null' \n\n gold_answers \\\n0 [every person who is destined for Heaven or the World to Come, all people co... \n1 [every person who is destined for Heaven or the World to Come, all people co... \n2 [every person who is destined for Heaven or the World to Come, all people co... \n3 [every person who is destined for Heaven or the World to Come, all people co... \n0 [Ethel ``Edy'' Proctor] \n\n answer \\\n0 \n1 those whose names are written in the Book of Life from the foundation of the... \n2 only the names of the righteous \n3 those who are found written in the book and who shall escape the troubles pr... \n0 her cousin \n\n context \\\n0 None \n1 ohn of Patmos. As described, only those whose names are written in the Book ... \n2 . The Psalmist likewise speaks of the Book of Life in which only the names o... \n3 those who are found written in the book and who shall escape the troubles pr... \n0 ng a story in the newspaper of a 12-year-old girl getting pregnant by her co... 
\n\n exact_match f1 rank document_id \\\n0 0.0 0.000000 1.0 None \n1 0.0 0.083333 2.0 1b090aec7dbd1af6739c4c80f8995877-3 \n2 0.0 0.200000 3.0 1b090aec7dbd1af6739c4c80f8995877-1 \n3 0.0 0.111111 4.0 1b090aec7dbd1af6739c4c80f8995877-6 \n0 0.0 0.000000 1.0 965a125f65658579529b39f8e4344969-3 \n\n gold_document_ids \\\n0 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n1 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n2 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n3 [1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0] \n0 [965a125f65658579529b39f8e4344969-3] \n\n offsets_in_document \\\n0 [{'start': 0, 'end': 0}] \n1 [{'start': 576, 'end': 658}] \n2 [{'start': 498, 'end': 529}] \n3 [{'start': 135, 'end': 305}] \n0 [{'start': 423, 'end': 433}] \n\n gold_offsets_in_documents type node \\\n0 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n1 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n2 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n3 [{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}] answer Reader \n0 [{'start': 181, 'end': 202}] answer Reader \n\n eval_mode \n0 integrated \n1 integrated \n2 integrated \n3 integrated \n0 integrated ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>multilabel_id</th>\n <th>query</th>\n <th>filters</th>\n <th>gold_answers</th>\n <th>answer</th>\n <th>context</th>\n <th>exact_match</th>\n <th>f1</th>\n <th>rank</th>\n <th>document_id</th>\n <th>gold_document_ids</th>\n <th>offsets_in_document</th>\n <th>gold_offsets_in_documents</th>\n <th>type</th>\n <th>node</th>\n <th>eval_mode</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n <td></td>\n <td>None</td>\n <td>0.0</td>\n <td>0.000000</td>\n <td>1.0</td>\n <td>None</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>[{'start': 0, 'end': 0}]</td>\n <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n <td>answer</td>\n <td>Reader</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n <td>those whose names are written in the Book of Life from the foundation of the...</td>\n <td>ohn of Patmos. 
As described, only those whose names are written in the Book ...</td>\n <td>0.0</td>\n <td>0.083333</td>\n <td>2.0</td>\n <td>1b090aec7dbd1af6739c4c80f8995877-3</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>[{'start': 576, 'end': 658}]</td>\n <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n <td>answer</td>\n <td>Reader</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n <td>only the names of the righteous</td>\n <td>. The Psalmist likewise speaks of the Book of Life in which only the names o...</td>\n <td>0.0</td>\n <td>0.200000</td>\n <td>3.0</td>\n <td>1b090aec7dbd1af6739c4c80f8995877-1</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>[{'start': 498, 'end': 529}]</td>\n <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n <td>answer</td>\n <td>Reader</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1886992123615626403</td>\n <td>who is written in the book of life</td>\n <td>b'null'</td>\n <td>[every person who is destined for Heaven or the World to Come, all people co...</td>\n <td>those who are found written in the book and who shall escape the troubles pr...</td>\n <td>those who are found written in the book and who shall escape the troubles pr...</td>\n <td>0.0</td>\n <td>0.111111</td>\n <td>4.0</td>\n <td>1b090aec7dbd1af6739c4c80f8995877-6</td>\n <td>[1b090aec7dbd1af6739c4c80f8995877-0, 1b090aec7dbd1af6739c4c80f8995877-0]</td>\n <td>[{'start': 135, 'end': 305}]</td>\n <td>[{'start': 374, 'end': 434}, {'start': 1107, 'end': 1149}]</td>\n <td>answer</td>\n <td>Reader</td>\n <td>integrated</td>\n </tr>\n <tr>\n <th>0</th>\n <td>-3790070251458150675</td>\n <td>who was the girl in the video brenda got a baby</td>\n <td>b'null'</td>\n 
<td>[Ethel ``Edy'' Proctor]</td>\n <td>her cousin</td>\n <td>ng a story in the newspaper of a 12-year-old girl getting pregnant by her co...</td>\n <td>0.0</td>\n <td>0.000000</td>\n <td>1.0</td>\n <td>965a125f65658579529b39f8e4344969-3</td>\n <td>[965a125f65658579529b39f8e4344969-3]</td>\n <td>[{'start': 423, 'end': 433}]</td>\n <td>[{'start': 181, 'end': 202}]</td>\n <td>answer</td>\n <td>Reader</td>\n <td>integrated</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 9,
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
@ -1084,7 +737,10 @@
{
"cell_type": "markdown",
"metadata": {
"id": "8QJ68G12U7tb"
"id": "8QJ68G12U7tb",
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## About us\n",
@ -1116,9 +772,9 @@
"hash": "01829e1eb67c4f5275a41f9336c92adbb77a108c8fc957dfe99d03e96dd1f349"
},
"kernelspec": {
"display_name": "Python (haystack)",
"name": "python3",
"language": "python",
"name": "haystack"
"display_name": "Python 3 (ipykernel)"
},
"language_info": {
"codemirror_mode": {

View File

@ -62,24 +62,29 @@ def tutorial5_evaluation():
)
# Initialize Retriever
from haystack.nodes import ElasticsearchRetriever
retriever = ElasticsearchRetriever(document_store=document_store)
# Alternative: Evaluate dense retrievers (DensePassageRetriever or EmbeddingRetriever)
# DensePassageRetriever uses two separate transformer based encoders for query and document.
# In contrast, EmbeddingRetriever uses a single encoder for both.
# Alternative: Evaluate dense retrievers (EmbeddingRetriever or DensePassageRetriever)
# The EmbeddingRetriever uses a single transformer-based encoder model for both query and document.
# In contrast, DensePassageRetriever uses two separate encoders, one for queries and one for documents.
# Please make sure the "embedding_dim" parameter in the DocumentStore above matches the output dimension of your models!
# Please also make sure that the PreProcessor splits your files into chunks that fit completely within
# the max_seq_len limit of the Transformer models
# The SentenceTransformer model "all-mpnet-base-v2" generally works well with the EmbeddingRetriever on any kind of English text.
# For more information check out the documentation at: https://www.sbert.net/docs/pretrained_models.html
# The SentenceTransformer model "sentence-transformers/multi-qa-mpnet-base-dot-v1" generally works well with the EmbeddingRetriever on any kind of English text.
# For more information and suggestions on different models check out the documentation at: https://www.sbert.net/docs/pretrained_models.html
# from haystack.nodes import EmbeddingRetriever, DensePassageRetriever
# retriever = EmbeddingRetriever(document_store=document_store, model_format="sentence_transformers",
# embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1")
# retriever = DensePassageRetriever(document_store=document_store,
# query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
# passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
# use_gpu=True,
# max_seq_len_passage=256,
# embed_title=True)
# retriever = EmbeddingRetriever(document_store=document_store, model_format="sentence_transformers",
# embedding_model="all-mpnet-base-v2")
# document_store.update_embeddings(retriever, index=doc_index)
# Initialize Reader