{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "DeAkZwDhufYA" }, "source": [ "# Open-Domain QA on Tables\n", "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deepset-ai/haystack/blob/master/tutorials/Tutorial15_TableQA.ipynb)\n", "\n", "This tutorial shows you how to perform question-answering on tables using the `EmbeddingRetriever` or `BM25Retriever` as retriever node and the `TableReader` as reader node." ] }, { "cell_type": "markdown", "metadata": { "id": "vbR3bETlvi-3" }, "source": [ "### Prepare environment\n", "\n", "#### Colab: Enable the GPU runtime\n", "Make sure you enable the GPU runtime to experience decent speed in this tutorial.\n", "**Runtime -> Change Runtime type -> Hardware accelerator -> GPU**\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "HW66x0rfujyO" }, "outputs": [], "source": [ "# Make sure you have a GPU running\n", "!nvidia-smi" ] }, { "cell_type": "markdown", "source": [ "## Logging\n", "\n", "We configure how logging messages should be displayed and which log level should be used before importing Haystack.\n", "Example log message:\n", "INFO - haystack.utils.preprocessing - Converting data/tutorial1/218_Olenna_Tyrell.txt\n", "Default log level in basicConfig is WARNING so the explicit parameter is not necessary but can be changed easily:" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "import logging\n", "\n", "logging.basicConfig(format=\"%(levelname)s - %(name)s - %(message)s\", level=logging.WARNING)\n", "logging.getLogger(\"haystack\").setLevel(logging.INFO)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_ZXoyhOAvn7M" }, "outputs": [], "source": [ "# Install the latest release of Haystack in your own environment\n", 
"#! pip install farm-haystack\n", "\n", "# Install the latest master of Haystack\n", "!pip install --upgrade pip\n", "!pip install git+https://github.com/deepset-ai/haystack.git#egg=farm-haystack[colab]\n", "\n", "# The TaPAs-based TableReader requires the torch-scatter library\n", "import torch\n", "\n", "version = torch.__version__\n", "!pip install torch-scatter -f https://data.pyg.org/whl/torch-{version}.html\n", "\n", "# Install pygraphviz for visualization of Pipelines\n", "!apt install libgraphviz-dev\n", "!pip install pygraphviz" ] }, { "cell_type": "markdown", "metadata": { "id": "K_XJhluXwF5_" }, "source": [ "### Start an Elasticsearch server\n", "You can start Elasticsearch on your local machine instance using Docker. If Docker is not readily available in your environment (e.g. in Colab notebooks), then you can manually download and execute Elasticsearch from source." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "frDqgzK7v2i1" }, "outputs": [], "source": [ "# Recommended: Start Elasticsearch using Docker via the Haystack utility function\n", "from haystack.utils import launch_es\n", "\n", "launch_es()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "S4PGj1A6wKWu" }, "outputs": [], "source": [ "# In Colab / No Docker environments: Start Elasticsearch from source\n", "! wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-linux-x86_64.tar.gz -q\n", "! tar -xzf elasticsearch-7.9.2-linux-x86_64.tar.gz\n", "! chown -R daemon:daemon elasticsearch-7.9.2\n", "\n", "import os\n", "from subprocess import Popen, PIPE, STDOUT\n", "\n", "es_server = Popen(\n", " [\"elasticsearch-7.9.2/bin/elasticsearch\"], stdout=PIPE, stderr=STDOUT, preexec_fn=lambda: os.setuid(1) # as daemon\n", ")\n", "# wait until ES has started\n", "! 
sleep 30" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "RmxepXZtwQ0E" }, "outputs": [], "source": [ "# Connect to Elasticsearch\n", "from haystack.document_stores import ElasticsearchDocumentStore\n", "\n", "document_index = \"document\"\n", "document_store = ElasticsearchDocumentStore(host=\"localhost\", username=\"\", password=\"\", index=document_index)" ] }, { "cell_type": "markdown", "metadata": { "id": "fFh26LIlxldw" }, "source": [ "## Add Tables to DocumentStore\n", "To quickly demonstrate the capabilities of the `EmbeddingRetriever` and the `TableReader` we use a subset of 1000 tables and text documents from a dataset we have published in [this paper](https://arxiv.org/abs/2108.04049).\n", "\n", "Just as text passages, tables are represented as `Document` objects in Haystack. The content field, though, is a pandas DataFrame instead of a string." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "nM63uwbd8zd6" }, "outputs": [], "source": [ "# Let's first fetch some tables that we want to query\n", "# Here: 1000 tables from OTT-QA\n", "from haystack.utils import fetch_archive_from_http\n", "\n", "doc_dir = \"data/tutorial15\"\n", "s3_url = \"https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/table_text_dataset.zip\"\n", "fetch_archive_from_http(url=s3_url, output_dir=doc_dir)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "SKjw2LuXxlGh", "outputId": "92c67d24-d6fb-413e-8dd7-53075141d508" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Opponent M W L T NR Win% First Last\n", "0 Afghanistan 2 2 0 0 0 100.0 2012 2014\n", "1 Australia 98 32 62 1 3 34.21 1975 2017\n", "2 Bangladesh 35 31 4 0 0 88.57 1986 2015\n", "3 Canada 2 2 0 0 0 100.0 1979 2011\n", "4 England 82 31 49 0 2 38.75 1974 2017\n", "5 Hong Kong 2 2 0 0 0 100.0 2004 2008\n", "6 India 129 73 52 0 4 58.4 1978 2017\n", "7 Ireland 7 5 
1 1 0 78.57 2007 2016\n", "8 Kenya 6 6 0 0 0 100.0 1996 2011\n", "9 Namibia 1 1 0 0 0 100.0 2003 2003\n", "10 Netherlands 3 3 0 0 0 100.0 1996 2003\n", "11 New Zealand 103 53 47 1 2 52.97 1973 2018\n", "12 Scotland 3 3 0 0 0 100.0 1999 2013\n", "13 South Africa 73 25 47 0 1 34.72 1992 2017\n", "14 Sri Lanka 153 90 58 1 4 60.73 1975 2017\n", "15 United Arab Emirates 3 3 0 0 0 100.0 1994 2015\n", "16 West Indies 133 60 70 3 0 46.24 1975 2017\n", "17 Zimbabwe 59 52 4 1 2 92.1 1992 2018\n", "18 Total[12] 894 474 394 8 18 54.56 1973 2018\n", "{}\n" ] } ], "source": [ "# Add the tables to the DocumentStore\n", "\n", "import json\n", "from haystack import Document\n", "import pandas as pd\n", "\n", "\n", "def read_tables(filename):\n", "    processed_tables = []\n", "    with open(filename) as tables:\n", "        tables = json.load(tables)\n", "        for key, table in tables.items():\n", "            current_columns = table[\"header\"]\n", "            current_rows = table[\"data\"]\n", "            current_df = pd.DataFrame(columns=current_columns, data=current_rows)\n", "            document = Document(content=current_df, content_type=\"table\", id=key)\n", "            processed_tables.append(document)\n", "\n", "    return processed_tables\n", "\n", "\n", "tables = read_tables(f\"{doc_dir}/tables.json\")\n", "document_store.write_documents(tables, index=document_index)\n", "\n", "# Showing content field and meta field of one of the Documents of content_type 'table'\n", "print(tables[0].content)\n", "print(tables[0].meta)" ] }, { "cell_type": "markdown", "metadata": { "id": "hmQC1sDmw3d7" }, "source": [ "## Initialize Retriever, Reader & Pipeline\n", "\n", "### Retriever\n", "\n", "Retrievers help narrow down the scope for the Reader to a subset of tables where a given question could be answered.\n", "They use simple but fast algorithms.\n", "\n", "**Here:** We specify an embedding model that is finetuned so it can also generate embeddings for tables (instead of just text).\n", "\n", "**Alternatives:**\n", "\n", "- `BM25Retriever` that 
uses BM25 algorithm\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "EY_qvdV6wyK5" }, "outputs": [], "source": [ "from haystack.nodes.retriever import EmbeddingRetriever\n", "\n", "retriever = EmbeddingRetriever(document_store=document_store, embedding_model=\"deepset/all-mpnet-base-v2-table\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jasi1RM2zIJ7" }, "outputs": [], "source": [ "# Add table embeddings to the tables in DocumentStore\n", "document_store.update_embeddings(retriever=retriever)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "XM-ijy6Zz11L" }, "outputs": [], "source": [ "## Alternative: BM25Retriever\n", "# from haystack.nodes.retriever import BM25Retriever\n", "# retriever = BM25Retriever(document_store=document_store)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "YHfQWxVI0N2e", "outputId": "1d8dc4d2-a184-489e-defa-d445d76c458f" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c31185c0629c46769fb7e7e2eb016fa1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Batches: 0%| | 0/1 [00:00]\n" ] } ], "source": [ "from haystack.utils import print_answers\n", "\n", "prediction = reader.predict(query=\"Who played Gregory House in the series House?\", documents=[table_doc])\n", "print_answers(prediction, details=\"all\")" ] }, { "cell_type": "markdown", "metadata": { "id": "jkAYNMb7R9qu" }, "source": [ "The offsets in the `offsets_in_document` and `offsets_in_context` field indicate the table cells that the model predicts to be part of the answer. They need to be interpreted on the linearized table, i.e., a flat list containing all of the table cells." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "It8XYT2ZTVJs", "outputId": "7d31af60-e04a-485d-f0ee-f29592b03928" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Predicted answer: Hugh Laurie\n", "Meta field: {'aggregation_operator': 'NONE', 'answer_cells': ['Hugh Laurie']}\n" ] } ], "source": [ "print(f\"Predicted answer: {prediction['answers'][0].answer}\")\n", "print(f\"Meta field: {prediction['answers'][0].meta}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "pgmG7pzL5ceh" }, "source": [ "### Pipeline\n", "The Retriever and the Reader can be combined in a pipeline that first retrieves relevant tables and then extracts the answer.\n", "\n", "**Notice**: Given that the `TableReader` does not provide useful confidence scores and returns an answer for each of the tables, the sorting of the answers might not be helpful." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "id": "G-aZZvyv4-Mf" }, "outputs": [], "source": [ "# Initialize pipeline\n", "from haystack import Pipeline\n", "\n", "table_qa_pipeline = Pipeline()\n", "table_qa_pipeline.add_node(component=retriever, name=\"EmbeddingRetriever\", inputs=[\"Query\"])\n", "table_qa_pipeline.add_node(component=reader, name=\"TableReader\", inputs=[\"EmbeddingRetriever\"])" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "m8evexnW6dev", "outputId": "40514084-f516-4f13-fb48-6a55cb578366" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ce6722f406154bfebd4053040289b411", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Batches: 0%| | 0/1 [00:00" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's have a look at the structure of the combined Table and Text QA pipeline.\n", "from IPython import display\n", "\n", 
"text_table_qa_pipeline.draw()\n", "display.Image(\"pipeline.png\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "strPNduPoBLe" }, "outputs": [], "source": [ "# Example query whose answer resides in a text passage\n", "predictions = text_table_qa_pipeline.run(query=\"Who was Thomas Alva Edison?\")" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "9YiK75tSoOGA", "outputId": "bd52f841-3846-441f-dd6f-53b02111691e" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Query: Who was Thomas Alva Edison?\n", "Answers:\n", "[ { 'answer': 'American inventor and businessman',\n", " 'context': 'mas Alva Edison (February 11, 1847October 18, 1931) was an '\n", " 'American inventor and businessman, who has been described '\n", " \"as America's greatest inventor. H\"},\n", " { 'answer': 'John Béchervaise , OAM , MBE',\n", " 'context': Name \\\n", "0 Amanda Barnard \n", "1 Martin G. Bean \n", "2 Gordon S. Brown \n", "3 John Béchervaise , OAM , MBE \n", "4 Megan Clark , AC \n", "5 J. Donald R. de Raadt \n", "6 Graham Dorrington \n", "7 Dennis Gibson , AO \n", "8 Ranulph Glanville \n", "9 Alfred Gottschalk \n", "10 Ann Henderson-Sellers \n", "11 Arthur R. 
Hogg \n", "12 Kourosh Kalantar-zadeh \n", "13 Richard Kaner \n", "14 Lakshmi Kantam \n", "15 William Kernot \n", "16 Sir Albert Kitson \n", "17 David Malin \n", "18 Henry Millicer , AM \n", "19 Luca Marmorini \n", "\n", " Association with RMIT \\\n", "0 B Sci ( AppPhysics ) ( Hon ) , PhD \n", "1 current Vice-Chancellor \n", "2 Dip Civil Eng , Elec Eng , Mech Eng [ WMC ] \n", "3 science classes \n", "4 DAppSci ( honoris causa ) , former faculty \n", "5 FRMIT \n", "6 faculty \n", "7 former Chancellor \n", "8 former faculty \n", "9 former faculty \n", "10 former Deputy Vice-Chancellor \n", "11 science classes \n", "12 attended ( PhD ) and also former faculty \n", "13 faculty \n", "14 faculty \n", "15 former President [ WMC ] \n", "16 geology , mining , surveying classes [ WMC ] \n", "17 D AppSci ( honoris causa ) \n", "18 D Eng ( honoris causa ) ; former faculty \n", "19 faculty \n", "\n", " Notability \n", "0 nanotechnologist and theoretical physicist ; Head of the CSIRO Nanoscience L... \n", "1 technology executive ; former Global Director of Microsoft and former Vice-C... \n", "2 cyberneticist ; Emeritus Professor of Electrical Engineering at MIT \n", "3 Antarctic explorer and author \n", "4 scientist ; current CEO of the CSIRO \n", "5 Emeritus Professor of Informatics and System Science at Luleå University of ... \n", "6 aeronautical engineer ; subject of the 2004 documentary The White Diamond by... \n", "7 mathematician \n", "8 cybernetics theoretician \n", "9 biochemist and glycoprotein researcher \n", "10 former Director of the UN Climate Programme \n", "11 astronomer and physicist \n", "12 Materials scientist , electronic engineer and Australian Research Council ( ... 
\n", "13 chemist and nanotechnologist ; recipient of the Tolman Award ( 2008 ) \n", "14 chemist ; Adjunct Professor and Director of the IICT -RMIT Research Centre \n", "15 Old Kernot Engineering School at RMIT named in his honour \n", "16 geologist ; recipient of the Lyell Medal ( 1927 ) \n", "17 astronomer \n", "18 aircraft designer \n", "19 head of the engine and electronics department for the Ferrari F1 team },\n", " { 'answer': 'Ann Henderson-Sellers',\n", " 'context': Name \\\n", "0 Amanda Barnard \n", "1 Martin G. Bean \n", "2 Gordon S. Brown \n", "3 John Béchervaise , OAM , MBE \n", "4 Megan Clark , AC \n", "5 J. Donald R. de Raadt \n", "6 Graham Dorrington \n", "7 Dennis Gibson , AO \n", "8 Ranulph Glanville \n", "9 Alfred Gottschalk \n", "10 Ann Henderson-Sellers \n", "11 Arthur R. Hogg \n", "12 Kourosh Kalantar-zadeh \n", "13 Richard Kaner \n", "14 Lakshmi Kantam \n", "15 William Kernot \n", "16 Sir Albert Kitson \n", "17 David Malin \n", "18 Henry Millicer , AM \n", "19 Luca Marmorini \n", "\n", " Association with RMIT \\\n", "0 B Sci ( AppPhysics ) ( Hon ) , PhD \n", "1 current Vice-Chancellor \n", "2 Dip Civil Eng , Elec Eng , Mech Eng [ WMC ] \n", "3 science classes \n", "4 DAppSci ( honoris causa ) , former faculty \n", "5 FRMIT \n", "6 faculty \n", "7 former Chancellor \n", "8 former faculty \n", "9 former faculty \n", "10 former Deputy Vice-Chancellor \n", "11 science classes \n", "12 attended ( PhD ) and also former faculty \n", "13 faculty \n", "14 faculty \n", "15 former President [ WMC ] \n", "16 geology , mining , surveying classes [ WMC ] \n", "17 D AppSci ( honoris causa ) \n", "18 D Eng ( honoris causa ) ; former faculty \n", "19 faculty \n", "\n", " Notability \n", "0 nanotechnologist and theoretical physicist ; Head of the CSIRO Nanoscience L... \n", "1 technology executive ; former Global Director of Microsoft and former Vice-C... 
\n", "2 cyberneticist ; Emeritus Professor of Electrical Engineering at MIT \n", "3 Antarctic explorer and author \n", "4 scientist ; current CEO of the CSIRO \n", "5 Emeritus Professor of Informatics and System Science at Luleå University of ... \n", "6 aeronautical engineer ; subject of the 2004 documentary The White Diamond by... \n", "7 mathematician \n", "8 cybernetics theoretician \n", "9 biochemist and glycoprotein researcher \n", "10 former Director of the UN Climate Programme \n", "11 astronomer and physicist \n", "12 Materials scientist , electronic engineer and Australian Research Council ( ... \n", "13 chemist and nanotechnologist ; recipient of the Tolman Award ( 2008 ) \n", "14 chemist ; Adjunct Professor and Director of the IICT -RMIT Research Centre \n", "15 Old Kernot Engineering School at RMIT named in his honour \n", "16 geologist ; recipient of the Lyell Medal ( 1927 ) \n", "17 astronomer \n", "18 aircraft designer \n", "19 head of the engine and electronics department for the Ferrari F1 team },\n", " { 'answer': 'nanotechnologist and theoretical physicist ; Head of the '\n", " 'CSIRO Nanoscience Laboratory',\n", " 'context': Name \\\n", "0 Amanda Barnard \n", "1 Martin G. Bean \n", "2 Gordon S. Brown \n", "3 John Béchervaise , OAM , MBE \n", "4 Megan Clark , AC \n", "5 J. Donald R. de Raadt \n", "6 Graham Dorrington \n", "7 Dennis Gibson , AO \n", "8 Ranulph Glanville \n", "9 Alfred Gottschalk \n", "10 Ann Henderson-Sellers \n", "11 Arthur R. 
Hogg \n", "12 Kourosh Kalantar-zadeh \n", "13 Richard Kaner \n", "14 Lakshmi Kantam \n", "15 William Kernot \n", "16 Sir Albert Kitson \n", "17 David Malin \n", "18 Henry Millicer , AM \n", "19 Luca Marmorini \n", "\n", " Association with RMIT \\\n", "0 B Sci ( AppPhysics ) ( Hon ) , PhD \n", "1 current Vice-Chancellor \n", "2 Dip Civil Eng , Elec Eng , Mech Eng [ WMC ] \n", "3 science classes \n", "4 DAppSci ( honoris causa ) , former faculty \n", "5 FRMIT \n", "6 faculty \n", "7 former Chancellor \n", "8 former faculty \n", "9 former faculty \n", "10 former Deputy Vice-Chancellor \n", "11 science classes \n", "12 attended ( PhD ) and also former faculty \n", "13 faculty \n", "14 faculty \n", "15 former President [ WMC ] \n", "16 geology , mining , surveying classes [ WMC ] \n", "17 D AppSci ( honoris causa ) \n", "18 D Eng ( honoris causa ) ; former faculty \n", "19 faculty \n", "\n", " Notability \n", "0 nanotechnologist and theoretical physicist ; Head of the CSIRO Nanoscience L... \n", "1 technology executive ; former Global Director of Microsoft and former Vice-C... \n", "2 cyberneticist ; Emeritus Professor of Electrical Engineering at MIT \n", "3 Antarctic explorer and author \n", "4 scientist ; current CEO of the CSIRO \n", "5 Emeritus Professor of Informatics and System Science at Luleå University of ... \n", "6 aeronautical engineer ; subject of the 2004 documentary The White Diamond by... \n", "7 mathematician \n", "8 cybernetics theoretician \n", "9 biochemist and glycoprotein researcher \n", "10 former Director of the UN Climate Programme \n", "11 astronomer and physicist \n", "12 Materials scientist , electronic engineer and Australian Research Council ( ... 
\n", "13 chemist and nanotechnologist ; recipient of the Tolman Award ( 2008 ) \n", "14 chemist ; Adjunct Professor and Director of the IICT -RMIT Research Centre \n", "15 Old Kernot Engineering School at RMIT named in his honour \n", "16 geologist ; recipient of the Lyell Medal ( 1927 ) \n", "17 astronomer \n", "18 aircraft designer \n", "19 head of the engine and electronics department for the Ferrari F1 team },\n", " { 'answer': 'Christopher Wren',\n", " 'context': Name Years \\\n", "0 John Bainbridge 1620 or 1621 - 1643 \n", "1 John Greaves 1643-48 \n", "2 Seth Ward 1649-60 \n", "3 Christopher Wren 1661-73 \n", "4 Edward Bernard 1673-91 \n", "5 David Gregory 1691-1708 \n", "6 John Caswell 1709-12 \n", "7 John Keill 1712-21 \n", "8 James Bradley 1721-62 \n", "9 Thomas Hornsby 1763-1810 \n", "10 Abraham Robertson 1810-26 \n", "11 Stephen Rigaud 1827-39 \n", "12 George Johnson 1839-42 \n", "13 William Donkin 1842-69 \n", "14 Charles Pritchard 1870-93 \n", "15 Herbert Turner 1893-1930 \n", "16 Harry Plaskett 1932-60 \n", "17 Donald Blackwell 1960-88 \n", "18 George Efstathiou 1988-97 \n", "19 Joseph Silk 1999-2012 \n", "\n", " Education \\\n", "0 University of Cambridge ( Emmanuel College ) \n", "1 Balliol College \n", "2 University of Cambridge ( Sidney Sussex College ) \n", "3 Wadham College \n", "4 St John 's College \n", "5 Marischal College and University of Aberdeen , and Leiden University \n", "6 Wadham College \n", "7 University of Edinburgh and Balliol College \n", "8 Balliol College \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 Exeter College \n", "12 The Queen 's College \n", "13 St Edmund Hall and University College \n", "14 University of Cambridge ( St John 's College ) \n", "15 University of Cambridge ( Trinity College ) \n", "16 University of Toronto and Imperial College , London \n", "17 University of Cambridge ( Sidney Sussex College ) \n", "18 Keble College and the University of Durham \n", "19 University of Cambridge ( Clare 
College ) and Harvard University \n", "\n", " College as professor \\\n", "0 Merton College \n", "1 Merton College \n", "2 Wadham College and Trinity College \n", "3 All Souls College \n", "4 St John 's College \n", "5 Balliol College \n", "6 Hart Hall \n", "7 Balliol College \n", "8 - \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 - \n", "12 The Queen 's College \n", "13 University College \n", "14 New College \n", "15 New College \n", "16 New College \n", "17 New College \n", "18 New College \n", "19 New College \n", "\n", " Notes \n", "0 Bainbridge practised as a physician in Leicestershire and London after leavi... \n", "1 Greaves began studying astronomical texts in Greek , Arabic and Persian at a... \n", "2 When still an undergraduate , Ward impressed John Bainbridge ( the first ast... \n", "3 As an undergraduate , Wren joined the circle of mathematicians and natural s... \n", "4 Bernard studied Hebrew , Arabic , Syriac and Coptic with Edward Pococke ( La... \n", "5 Gregory studied in his native Scotland and befriended the Edinburgh physicia... \n", "6 Carswell matriculated at Wadham College , Oxford , when he was 16 years old ... \n", "7 Keill studied in Edinburgh with David Gregory and moved to Balliol with him ... \n", "8 Bradley was the nephew of James Pound , a leading astronomer who was a colle... \n", "9 Hornsby , who had an observatory at Corpus Christi , gained a high reputatio... \n", "10 Robertson started studying at Oxford aged 24 , having previously run ( unsuc... \n", "11 Rigaud , whose father was the observer at Kew Observatory , made his first r... \n", "12 Johnson was a mathematician and priest with little practical knowledge of as... \n", "13 Donkin , a talented linguist , mathematician and musician , published papers... \n", "14 After leaving Cambridge , Pritchard was headmaster of a grammar school in St... \n", "15 Turner was second wrangler ( achieved the second-highest marks in the Cambri... 
\n", "16 Plaskett , a solar physicist , was the son of the Canadian astronomer John S... \n", "17 Blackwell was assistant director of the Solar Physics Observatory at Cambrid... \n", "18 After completing his studies at Oxford and Durham , Efstathiou worked as an ... \n", "19 After obtaining his doctorate from Harvard , Silk returned to England to car... },\n", " { 'answer': 'Christopher Wren',\n", " 'context': Name Years \\\n", "0 John Bainbridge 1620 or 1621 - 1643 \n", "1 John Greaves 1643-48 \n", "2 Seth Ward 1649-60 \n", "3 Christopher Wren 1661-73 \n", "4 Edward Bernard 1673-91 \n", "5 David Gregory 1691-1708 \n", "6 John Caswell 1709-12 \n", "7 John Keill 1712-21 \n", "8 James Bradley 1721-62 \n", "9 Thomas Hornsby 1763-1810 \n", "10 Abraham Robertson 1810-26 \n", "11 Stephen Rigaud 1827-39 \n", "12 George Johnson 1839-42 \n", "13 William Donkin 1842-69 \n", "14 Charles Pritchard 1870-93 \n", "15 Herbert Turner 1893-1930 \n", "16 Harry Plaskett 1932-60 \n", "17 Donald Blackwell 1960-88 \n", "18 George Efstathiou 1988-97 \n", "19 Joseph Silk 1999-2012 \n", "\n", " Education \\\n", "0 University of Cambridge ( Emmanuel College ) \n", "1 Balliol College \n", "2 University of Cambridge ( Sidney Sussex College ) \n", "3 Wadham College \n", "4 St John 's College \n", "5 Marischal College and University of Aberdeen , and Leiden University \n", "6 Wadham College \n", "7 University of Edinburgh and Balliol College \n", "8 Balliol College \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 Exeter College \n", "12 The Queen 's College \n", "13 St Edmund Hall and University College \n", "14 University of Cambridge ( St John 's College ) \n", "15 University of Cambridge ( Trinity College ) \n", "16 University of Toronto and Imperial College , London \n", "17 University of Cambridge ( Sidney Sussex College ) \n", "18 Keble College and the University of Durham \n", "19 University of Cambridge ( Clare College ) and Harvard University \n", "\n", " College as 
professor \\\n", "0 Merton College \n", "1 Merton College \n", "2 Wadham College and Trinity College \n", "3 All Souls College \n", "4 St John 's College \n", "5 Balliol College \n", "6 Hart Hall \n", "7 Balliol College \n", "8 - \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 - \n", "12 The Queen 's College \n", "13 University College \n", "14 New College \n", "15 New College \n", "16 New College \n", "17 New College \n", "18 New College \n", "19 New College \n", "\n", " Notes \n", "0 Bainbridge practised as a physician in Leicestershire and London after leavi... \n", "1 Greaves began studying astronomical texts in Greek , Arabic and Persian at a... \n", "2 When still an undergraduate , Ward impressed John Bainbridge ( the first ast... \n", "3 As an undergraduate , Wren joined the circle of mathematicians and natural s... \n", "4 Bernard studied Hebrew , Arabic , Syriac and Coptic with Edward Pococke ( La... \n", "5 Gregory studied in his native Scotland and befriended the Edinburgh physicia... \n", "6 Carswell matriculated at Wadham College , Oxford , when he was 16 years old ... \n", "7 Keill studied in Edinburgh with David Gregory and moved to Balliol with him ... \n", "8 Bradley was the nephew of James Pound , a leading astronomer who was a colle... \n", "9 Hornsby , who had an observatory at Corpus Christi , gained a high reputatio... \n", "10 Robertson started studying at Oxford aged 24 , having previously run ( unsuc... \n", "11 Rigaud , whose father was the observer at Kew Observatory , made his first r... \n", "12 Johnson was a mathematician and priest with little practical knowledge of as... \n", "13 Donkin , a talented linguist , mathematician and musician , published papers... \n", "14 After leaving Cambridge , Pritchard was headmaster of a grammar school in St... \n", "15 Turner was second wrangler ( achieved the second-highest marks in the Cambri... 
\n", "16 Plaskett , a solar physicist , was the son of the Canadian astronomer John S... \n", "17 Blackwell was assistant director of the Solar Physics Observatory at Cambrid... \n", "18 After completing his studies at Oxford and Durham , Efstathiou worked as an ... \n", "19 After obtaining his doctorate from Harvard , Silk returned to England to car... },\n", " { 'answer': 'John Caswell',\n", " 'context': Name Years \\\n", "0 John Bainbridge 1620 or 1621 - 1643 \n", "1 John Greaves 1643-48 \n", "2 Seth Ward 1649-60 \n", "3 Christopher Wren 1661-73 \n", "4 Edward Bernard 1673-91 \n", "5 David Gregory 1691-1708 \n", "6 John Caswell 1709-12 \n", "7 John Keill 1712-21 \n", "8 James Bradley 1721-62 \n", "9 Thomas Hornsby 1763-1810 \n", "10 Abraham Robertson 1810-26 \n", "11 Stephen Rigaud 1827-39 \n", "12 George Johnson 1839-42 \n", "13 William Donkin 1842-69 \n", "14 Charles Pritchard 1870-93 \n", "15 Herbert Turner 1893-1930 \n", "16 Harry Plaskett 1932-60 \n", "17 Donald Blackwell 1960-88 \n", "18 George Efstathiou 1988-97 \n", "19 Joseph Silk 1999-2012 \n", "\n", " Education \\\n", "0 University of Cambridge ( Emmanuel College ) \n", "1 Balliol College \n", "2 University of Cambridge ( Sidney Sussex College ) \n", "3 Wadham College \n", "4 St John 's College \n", "5 Marischal College and University of Aberdeen , and Leiden University \n", "6 Wadham College \n", "7 University of Edinburgh and Balliol College \n", "8 Balliol College \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 Exeter College \n", "12 The Queen 's College \n", "13 St Edmund Hall and University College \n", "14 University of Cambridge ( St John 's College ) \n", "15 University of Cambridge ( Trinity College ) \n", "16 University of Toronto and Imperial College , London \n", "17 University of Cambridge ( Sidney Sussex College ) \n", "18 Keble College and the University of Durham \n", "19 University of Cambridge ( Clare College ) and Harvard University \n", "\n", " College as professor 
\\\n", "0 Merton College \n", "1 Merton College \n", "2 Wadham College and Trinity College \n", "3 All Souls College \n", "4 St John 's College \n", "5 Balliol College \n", "6 Hart Hall \n", "7 Balliol College \n", "8 - \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 - \n", "12 The Queen 's College \n", "13 University College \n", "14 New College \n", "15 New College \n", "16 New College \n", "17 New College \n", "18 New College \n", "19 New College \n", "\n", " Notes \n", "0 Bainbridge practised as a physician in Leicestershire and London after leavi... \n", "1 Greaves began studying astronomical texts in Greek , Arabic and Persian at a... \n", "2 When still an undergraduate , Ward impressed John Bainbridge ( the first ast... \n", "3 As an undergraduate , Wren joined the circle of mathematicians and natural s... \n", "4 Bernard studied Hebrew , Arabic , Syriac and Coptic with Edward Pococke ( La... \n", "5 Gregory studied in his native Scotland and befriended the Edinburgh physicia... \n", "6 Carswell matriculated at Wadham College , Oxford , when he was 16 years old ... \n", "7 Keill studied in Edinburgh with David Gregory and moved to Balliol with him ... \n", "8 Bradley was the nephew of James Pound , a leading astronomer who was a colle... \n", "9 Hornsby , who had an observatory at Corpus Christi , gained a high reputatio... \n", "10 Robertson started studying at Oxford aged 24 , having previously run ( unsuc... \n", "11 Rigaud , whose father was the observer at Kew Observatory , made his first r... \n", "12 Johnson was a mathematician and priest with little practical knowledge of as... \n", "13 Donkin , a talented linguist , mathematician and musician , published papers... \n", "14 After leaving Cambridge , Pritchard was headmaster of a grammar school in St... \n", "15 Turner was second wrangler ( achieved the second-highest marks in the Cambri... \n", "16 Plaskett , a solar physicist , was the son of the Canadian astronomer John S... 
\n", "17 Blackwell was assistant director of the Solar Physics Observatory at Cambrid... \n", "18 After completing his studies at Oxford and Durham , Efstathiou worked as an ... \n", "19 After obtaining his doctorate from Harvard , Silk returned to England to car... },\n", " { 'answer': 'John Caswell',\n", " 'context': Name Years \\\n", "0 John Bainbridge 1620 or 1621 - 1643 \n", "1 John Greaves 1643-48 \n", "2 Seth Ward 1649-60 \n", "3 Christopher Wren 1661-73 \n", "4 Edward Bernard 1673-91 \n", "5 David Gregory 1691-1708 \n", "6 John Caswell 1709-12 \n", "7 John Keill 1712-21 \n", "8 James Bradley 1721-62 \n", "9 Thomas Hornsby 1763-1810 \n", "10 Abraham Robertson 1810-26 \n", "11 Stephen Rigaud 1827-39 \n", "12 George Johnson 1839-42 \n", "13 William Donkin 1842-69 \n", "14 Charles Pritchard 1870-93 \n", "15 Herbert Turner 1893-1930 \n", "16 Harry Plaskett 1932-60 \n", "17 Donald Blackwell 1960-88 \n", "18 George Efstathiou 1988-97 \n", "19 Joseph Silk 1999-2012 \n", "\n", " Education \\\n", "0 University of Cambridge ( Emmanuel College ) \n", "1 Balliol College \n", "2 University of Cambridge ( Sidney Sussex College ) \n", "3 Wadham College \n", "4 St John 's College \n", "5 Marischal College and University of Aberdeen , and Leiden University \n", "6 Wadham College \n", "7 University of Edinburgh and Balliol College \n", "8 Balliol College \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 Exeter College \n", "12 The Queen 's College \n", "13 St Edmund Hall and University College \n", "14 University of Cambridge ( St John 's College ) \n", "15 University of Cambridge ( Trinity College ) \n", "16 University of Toronto and Imperial College , London \n", "17 University of Cambridge ( Sidney Sussex College ) \n", "18 Keble College and the University of Durham \n", "19 University of Cambridge ( Clare College ) and Harvard University \n", "\n", " College as professor \\\n", "0 Merton College \n", "1 Merton College \n", "2 Wadham College and Trinity 
College \n", "3 All Souls College \n", "4 St John 's College \n", "5 Balliol College \n", "6 Hart Hall \n", "7 Balliol College \n", "8 - \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 - \n", "12 The Queen 's College \n", "13 University College \n", "14 New College \n", "15 New College \n", "16 New College \n", "17 New College \n", "18 New College \n", "19 New College \n", "\n", " Notes \n", "0 Bainbridge practised as a physician in Leicestershire and London after leavi... \n", "1 Greaves began studying astronomical texts in Greek , Arabic and Persian at a... \n", "2 When still an undergraduate , Ward impressed John Bainbridge ( the first ast... \n", "3 As an undergraduate , Wren joined the circle of mathematicians and natural s... \n", "4 Bernard studied Hebrew , Arabic , Syriac and Coptic with Edward Pococke ( La... \n", "5 Gregory studied in his native Scotland and befriended the Edinburgh physicia... \n", "6 Carswell matriculated at Wadham College , Oxford , when he was 16 years old ... \n", "7 Keill studied in Edinburgh with David Gregory and moved to Balliol with him ... \n", "8 Bradley was the nephew of James Pound , a leading astronomer who was a colle... \n", "9 Hornsby , who had an observatory at Corpus Christi , gained a high reputatio... \n", "10 Robertson started studying at Oxford aged 24 , having previously run ( unsuc... \n", "11 Rigaud , whose father was the observer at Kew Observatory , made his first r... \n", "12 Johnson was a mathematician and priest with little practical knowledge of as... \n", "13 Donkin , a talented linguist , mathematician and musician , published papers... \n", "14 After leaving Cambridge , Pritchard was headmaster of a grammar school in St... \n", "15 Turner was second wrangler ( achieved the second-highest marks in the Cambri... \n", "16 Plaskett , a solar physicist , was the son of the Canadian astronomer John S... \n", "17 Blackwell was assistant director of the Solar Physics Observatory at Cambrid... 
\n", "18 After completing his studies at Oxford and Durham , Efstathiou worked as an ... \n", "19 After obtaining his doctorate from Harvard , Silk returned to England to car... },\n", " { 'answer': 'Thomas Hornsby',\n", " 'context': Name Years \\\n", "0 John Bainbridge 1620 or 1621 - 1643 \n", "1 John Greaves 1643-48 \n", "2 Seth Ward 1649-60 \n", "3 Christopher Wren 1661-73 \n", "4 Edward Bernard 1673-91 \n", "5 David Gregory 1691-1708 \n", "6 John Caswell 1709-12 \n", "7 John Keill 1712-21 \n", "8 James Bradley 1721-62 \n", "9 Thomas Hornsby 1763-1810 \n", "10 Abraham Robertson 1810-26 \n", "11 Stephen Rigaud 1827-39 \n", "12 George Johnson 1839-42 \n", "13 William Donkin 1842-69 \n", "14 Charles Pritchard 1870-93 \n", "15 Herbert Turner 1893-1930 \n", "16 Harry Plaskett 1932-60 \n", "17 Donald Blackwell 1960-88 \n", "18 George Efstathiou 1988-97 \n", "19 Joseph Silk 1999-2012 \n", "\n", " Education \\\n", "0 University of Cambridge ( Emmanuel College ) \n", "1 Balliol College \n", "2 University of Cambridge ( Sidney Sussex College ) \n", "3 Wadham College \n", "4 St John 's College \n", "5 Marischal College and University of Aberdeen , and Leiden University \n", "6 Wadham College \n", "7 University of Edinburgh and Balliol College \n", "8 Balliol College \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 Exeter College \n", "12 The Queen 's College \n", "13 St Edmund Hall and University College \n", "14 University of Cambridge ( St John 's College ) \n", "15 University of Cambridge ( Trinity College ) \n", "16 University of Toronto and Imperial College , London \n", "17 University of Cambridge ( Sidney Sussex College ) \n", "18 Keble College and the University of Durham \n", "19 University of Cambridge ( Clare College ) and Harvard University \n", "\n", " College as professor \\\n", "0 Merton College \n", "1 Merton College \n", "2 Wadham College and Trinity College \n", "3 All Souls College \n", "4 St John 's College \n", "5 Balliol College \n", "6 
Hart Hall \n", "7 Balliol College \n", "8 - \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 - \n", "12 The Queen 's College \n", "13 University College \n", "14 New College \n", "15 New College \n", "16 New College \n", "17 New College \n", "18 New College \n", "19 New College \n", "\n", " Notes \n", "0 Bainbridge practised as a physician in Leicestershire and London after leavi... \n", "1 Greaves began studying astronomical texts in Greek , Arabic and Persian at a... \n", "2 When still an undergraduate , Ward impressed John Bainbridge ( the first ast... \n", "3 As an undergraduate , Wren joined the circle of mathematicians and natural s... \n", "4 Bernard studied Hebrew , Arabic , Syriac and Coptic with Edward Pococke ( La... \n", "5 Gregory studied in his native Scotland and befriended the Edinburgh physicia... \n", "6 Carswell matriculated at Wadham College , Oxford , when he was 16 years old ... \n", "7 Keill studied in Edinburgh with David Gregory and moved to Balliol with him ... \n", "8 Bradley was the nephew of James Pound , a leading astronomer who was a colle... \n", "9 Hornsby , who had an observatory at Corpus Christi , gained a high reputatio... \n", "10 Robertson started studying at Oxford aged 24 , having previously run ( unsuc... \n", "11 Rigaud , whose father was the observer at Kew Observatory , made his first r... \n", "12 Johnson was a mathematician and priest with little practical knowledge of as... \n", "13 Donkin , a talented linguist , mathematician and musician , published papers... \n", "14 After leaving Cambridge , Pritchard was headmaster of a grammar school in St... \n", "15 Turner was second wrangler ( achieved the second-highest marks in the Cambri... \n", "16 Plaskett , a solar physicist , was the son of the Canadian astronomer John S... \n", "17 Blackwell was assistant director of the Solar Physics Observatory at Cambrid... \n", "18 After completing his studies at Oxford and Durham , Efstathiou worked as an ... 
\n", "19 After obtaining his doctorate from Harvard , Silk returned to England to car... },\n", " { 'answer': 'Thomas Hornsby',\n", " 'context': Name Years \\\n", "0 John Bainbridge 1620 or 1621 - 1643 \n", "1 John Greaves 1643-48 \n", "2 Seth Ward 1649-60 \n", "3 Christopher Wren 1661-73 \n", "4 Edward Bernard 1673-91 \n", "5 David Gregory 1691-1708 \n", "6 John Caswell 1709-12 \n", "7 John Keill 1712-21 \n", "8 James Bradley 1721-62 \n", "9 Thomas Hornsby 1763-1810 \n", "10 Abraham Robertson 1810-26 \n", "11 Stephen Rigaud 1827-39 \n", "12 George Johnson 1839-42 \n", "13 William Donkin 1842-69 \n", "14 Charles Pritchard 1870-93 \n", "15 Herbert Turner 1893-1930 \n", "16 Harry Plaskett 1932-60 \n", "17 Donald Blackwell 1960-88 \n", "18 George Efstathiou 1988-97 \n", "19 Joseph Silk 1999-2012 \n", "\n", " Education \\\n", "0 University of Cambridge ( Emmanuel College ) \n", "1 Balliol College \n", "2 University of Cambridge ( Sidney Sussex College ) \n", "3 Wadham College \n", "4 St John 's College \n", "5 Marischal College and University of Aberdeen , and Leiden University \n", "6 Wadham College \n", "7 University of Edinburgh and Balliol College \n", "8 Balliol College \n", "9 Corpus Christi College \n", "10 Christ Church \n", "11 Exeter College \n", "12 The Queen 's College \n", "13 St Edmund Hall and University College \n", "14 University of Cambridge ( St John 's College ) \n", "15 University of Cambridge ( Trinity College ) \n", "16 University of Toronto and Imperial College , London \n", "17 University of Cambridge ( Sidney Sussex College ) \n", "18 Keble College and the University of Durham \n", "19 University of Cambridge ( Clare College ) and Harvard University \n", "\n", " College as professor \\\n", "0 Merton College \n", "1 Merton College \n", "2 Wadham College and Trinity College \n", "3 All Souls College \n", "4 St John 's College \n", "5 Balliol College \n", "6 Hart Hall \n", "7 Balliol College \n", "8 - \n", "9 Corpus Christi College \n", "10 
Christ Church \n", "11 - \n", "12 The Queen 's College \n", "13 University College \n", "14 New College \n", "15 New College \n", "16 New College \n", "17 New College \n", "18 New College \n", "19 New College \n", "\n", " Notes \n", "0 Bainbridge practised as a physician in Leicestershire and London after leavi... \n", "1 Greaves began studying astronomical texts in Greek , Arabic and Persian at a... \n", "2 When still an undergraduate , Ward impressed John Bainbridge ( the first ast... \n", "3 As an undergraduate , Wren joined the circle of mathematicians and natural s... \n", "4 Bernard studied Hebrew , Arabic , Syriac and Coptic with Edward Pococke ( La... \n", "5 Gregory studied in his native Scotland and befriended the Edinburgh physicia... \n", "6 Carswell matriculated at Wadham College , Oxford , when he was 16 years old ... \n", "7 Keill studied in Edinburgh with David Gregory and moved to Balliol with him ... \n", "8 Bradley was the nephew of James Pound , a leading astronomer who was a colle... \n", "9 Hornsby , who had an observatory at Corpus Christi , gained a high reputatio... \n", "10 Robertson started studying at Oxford aged 24 , having previously run ( unsuc... \n", "11 Rigaud , whose father was the observer at Kew Observatory , made his first r... \n", "12 Johnson was a mathematician and priest with little practical knowledge of as... \n", "13 Donkin , a talented linguist , mathematician and musician , published papers... \n", "14 After leaving Cambridge , Pritchard was headmaster of a grammar school in St... \n", "15 Turner was second wrangler ( achieved the second-highest marks in the Cambri... \n", "16 Plaskett , a solar physicist , was the son of the Canadian astronomer John S... \n", "17 Blackwell was assistant director of the Solar Physics Observatory at Cambrid... \n", "18 After completing his studies at Oxford and Durham , Efstathiou worked as an ... \n", "19 After obtaining his doctorate from Harvard , Silk returned to England to car... 
},\n", " { 'answer': 'John Jordan',\n", " 'context': Entrant Constructor Chassis Engine Driver\n", "0 Team Ensign Ensign N180B Cosworth DFV V8 Jim Crawford\n", "1 Team Ensign Ensign N180B Cosworth DFV V8 Joe Castellano\n", "2 Colin Bennett Racing McLaren M29 Cosworth DFV V8 Arnold Glass\n", "3 Colin Bennett Racing March 811 Cosworth DFV V8 Val Musetti\n", "4 Team Sanada Fittipaldi F8 Cosworth DFV V8 Tony Trimmer\n", "5 Warren Booth Shadow DN9 Cosworth DFV V8 Warren Booth\n", "6 John Jordan BRM P207 BRM 202 V12 David Williams\n", "7 Nick Mason Tyrrell 008 Cosworth DFV V8 John Brindley\n", "8 EMKA Productions Williams FW07 Cosworth DFV V8 Steve O'Rourke\n", "9 Team Peru Williams FW07 Cosworth DFV V8 Jorge Koechlin},\n", " { 'answer': 'Alexander Graham Bell Alexander Graham Bell',\n", " 'context': 'is grandchildren. \"U.S. patent images in TIFF format\" '\n", " 'Alexander Graham Bell Alexander Graham Bell (March 3, 1847 '\n", " '– August 2, 1922) was a Scottish-born'},\n", " { 'answer': 'Ernst Heinrich Weber was clear from his memoirs where he '\n", " 'proclaimed that Weber should be regarded as the father of '\n", " 'experimental psychology. . “I would rather call Weber the '\n", " 'father of experimental psychology…It was Weber’s great '\n", " 'contribution to think of measuring psychic quantities and '\n", " 'of showing the exact relationships between them, to be the '\n", " 'first to understand this and carry it out.',\n", " 'context': ' Ernst Heinrich Weber was clear from his memoirs where he '\n", " 'proclaimed that Weber should be regarded as the father of '\n", " 'experimental psychology. . 
“I would rather call Weber the '\n", " 'father of experimental psychology…It was Weber’s great '\n", " 'contribution to think of measuring psychic quantities and '\n", " 'of showing the exact relationships between them, to be the '\n", " 'first to understand this and carry it out.'}]\n" ] } ], "source": [ "# We can see both text passages and tables as contexts of the predicted answers.\n", "print_answers(predictions, details=\"minimum\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QYOHDSmLpzEg" }, "outputs": [], "source": [ "# Example query whose answer resides in a table\n", "predictions = text_table_qa_pipeline.run(query=\"Which country does the film Macaroni come from?\")" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "4kw53uWep3zj", "outputId": "b332cc17-3cb8-4e20-d79d-bb4cf656f277" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Query: Which country does the film Macaroni come from?\n", "Answers:\n", "[ { 'answer': 'Italian',\n", " 'context': Submitting country Film title used in nomination Language ( s ) \\\n", "0 Argentina The Official Story Spanish \n", "1 Austria Malambo German \n", "2 Belgium Dust French \n", "3 Canada Jacques and November French \n", "4 Czechoslovakia Scalpel , Please Czech \n", "5 Denmark Twist and Shout Danish \n", "6 France Three Men and a Cradle French \n", "7 West Germany Angry Harvest German \n", "8 Hungary Colonel Redl German \n", "9 Iceland Deep Winter Icelandic \n", "10 India Saagar Hindi \n", "11 Israel When Night Falls Hebrew \n", "12 Italy Macaroni Italian \n", "13 Japan Gray Sunset Japanese \n", "14 South Korea Eoudong Korean \n", "15 Mexico Frida Still Life Spanish \n", "16 Netherlands The Dream Dutch , West Frisian \n", "17 Norway Wives - Ten Years After Norwegian \n", "18 Peru The City and the Dogs Spanish \n", "19 Philippines This Is My Country Tagalog \n", "\n", " Original title 
Director ( s ) \\\n", "0 La Historia oficial Luis Puenzo \n", "1 Malambo Milan Dor \n", "2 Dust Marion Hänsel \n", "3 Jacques et novembre Jean Beaudry and François Bouvier \n", "4 Skalpel , prosím Jirí Svoboda \n", "5 Tro , håb og kærlighed Bille August \n", "6 Trois hommes et un couffin Coline Serreau \n", "7 Bittere Ernte Agnieszka Holland \n", "8 Oberst Redl István Szabó \n", "9 Skammdegi Þráinn Bertelsson \n", "10 सागर Ramesh Sippy \n", "11 עד סוף הלילה Eitan Green \n", "12 Maccheroni Ettore Scola \n", "13 花いちもんめ Shunya Ito \n", "14 어우동 Lee Jang-ho \n", "15 Frida , naturaleza viva Paul Leduc \n", "16 De Dream Pieter Verhoeff \n", "17 Hustruer - ti år etter Anja Breien \n", "18 La ciudad y los perros Francisco José Lombardi \n", "19 Bayan ko : Kapit sa patalim Lino Brocka \n", "\n", " Result \n", "0 Won Academy Award \n", "1 Not Nominated \n", "2 Not Nominated \n", "3 Not Nominated \n", "4 Not Nominated \n", "5 Not Nominated \n", "6 Nominated \n", "7 Nominated \n", "8 Nominated \n", "9 Not Nominated \n", "10 Not Nominated \n", "11 Not Nominated \n", "12 Not Nominated \n", "13 Not Nominated \n", "14 Not Nominated \n", "15 Not Nominated \n", "16 Not Nominated \n", "17 Not Nominated \n", "18 Not Nominated \n", "19 Not Nominated },\n", " { 'answer': 'Italian',\n", " 'context': Submitting country Film title used in nomination Language ( s ) \\\n", "0 Argentina The Official Story Spanish \n", "1 Austria Malambo German \n", "2 Belgium Dust French \n", "3 Canada Jacques and November French \n", "4 Czechoslovakia Scalpel , Please Czech \n", "5 Denmark Twist and Shout Danish \n", "6 France Three Men and a Cradle French \n", "7 West Germany Angry Harvest German \n", "8 Hungary Colonel Redl German \n", "9 Iceland Deep Winter Icelandic \n", "10 India Saagar Hindi \n", "11 Israel When Night Falls Hebrew \n", "12 Italy Macaroni Italian \n", "13 Japan Gray Sunset Japanese \n", "14 South Korea Eoudong Korean \n", "15 Mexico Frida Still Life Spanish \n", "16 Netherlands The Dream 
Dutch , West Frisian \n", "17 Norway Wives - Ten Years After Norwegian \n", "18 Peru The City and the Dogs Spanish \n", "19 Philippines This Is My Country Tagalog \n", "\n", " Original title Director ( s ) \\\n", "0 La Historia oficial Luis Puenzo \n", "1 Malambo Milan Dor \n", "2 Dust Marion Hänsel \n", "3 Jacques et novembre Jean Beaudry and François Bouvier \n", "4 Skalpel , prosím Jirí Svoboda \n", "5 Tro , håb og kærlighed Bille August \n", "6 Trois hommes et un couffin Coline Serreau \n", "7 Bittere Ernte Agnieszka Holland \n", "8 Oberst Redl István Szabó \n", "9 Skammdegi Þráinn Bertelsson \n", "10 सागर Ramesh Sippy \n", "11 עד סוף הלילה Eitan Green \n", "12 Maccheroni Ettore Scola \n", "13 花いちもんめ Shunya Ito \n", "14 어우동 Lee Jang-ho \n", "15 Frida , naturaleza viva Paul Leduc \n", "16 De Dream Pieter Verhoeff \n", "17 Hustruer - ti år etter Anja Breien \n", "18 La ciudad y los perros Francisco José Lombardi \n", "19 Bayan ko : Kapit sa patalim Lino Brocka \n", "\n", " Result \n", "0 Won Academy Award \n", "1 Not Nominated \n", "2 Not Nominated \n", "3 Not Nominated \n", "4 Not Nominated \n", "5 Not Nominated \n", "6 Nominated \n", "7 Nominated \n", "8 Nominated \n", "9 Not Nominated \n", "10 Not Nominated \n", "11 Not Nominated \n", "12 Not Nominated \n", "13 Not Nominated \n", "14 Not Nominated \n", "15 Not Nominated \n", "16 Not Nominated \n", "17 Not Nominated \n", "18 Not Nominated \n", "19 Not Nominated },\n", " { 'answer': '2014',\n", " 'context': Year Name Transliteration \\\n", "0 2000 The Child And The Soldier Koodak va Sarbaz \n", "1 2001 Under The Moonlight Zir-e Noor-e Maah \n", "2 2002 Here Is A Shining Light Inja Cheraghi Roshan Ast \n", "3 2005 So Close , So Far Kheili Dour , Kheili Nazdik \n", "4 2008 As Simple as That Be Hamin Sadegi \n", "5 2011 A Cube of Sugar Yek Habe Ghand \n", "6 2014 Today Emrooz \n", "7 2016 Daughter Dokhtar \n", "8 2019 Castle of Dreams Ghasr-e Shirin \n", "\n", " Award \n", "0 Nominated Grand Prix des Amériques 
Montréal World Film Festival 2000 Silver ... \n", "1 International Jury Award São Paulo International Film Festival 2001 Best Dir... \n", "2 Best Screenplay Asia-Pacific Film Festival 2002 Crystal Simorgh Best Directo... \n", "3 Crystal Simorgh National Competition - Best Film Fajr International Film Fes... \n", "4 Golden St. George 30th Moscow International Film Festival 2008 Russian Guild... \n", "5 Special Jury Prize Kazan International Festival of Muslim Cinema 2012 Best F... \n", "6 Best Film Award Rabat International Film Festival 2014 Ecumenical Jury Prize... \n", "7 Crystal Simorgh National Competition - Best Original Score Fajr Internationa... \n", "8 Golden Goblet Award for Best Feature Film Shanghai International Film Festiv... },\n", " { 'answer': '2008',\n", " 'context': Year Name Transliteration \\\n", "0 2000 The Child And The Soldier Koodak va Sarbaz \n", "1 2001 Under The Moonlight Zir-e Noor-e Maah \n", "2 2002 Here Is A Shining Light Inja Cheraghi Roshan Ast \n", "3 2005 So Close , So Far Kheili Dour , Kheili Nazdik \n", "4 2008 As Simple as That Be Hamin Sadegi \n", "5 2011 A Cube of Sugar Yek Habe Ghand \n", "6 2014 Today Emrooz \n", "7 2016 Daughter Dokhtar \n", "8 2019 Castle of Dreams Ghasr-e Shirin \n", "\n", " Award \n", "0 Nominated Grand Prix des Amériques Montréal World Film Festival 2000 Silver ... \n", "1 International Jury Award São Paulo International Film Festival 2001 Best Dir... \n", "2 Best Screenplay Asia-Pacific Film Festival 2002 Crystal Simorgh Best Directo... \n", "3 Crystal Simorgh National Competition - Best Film Fajr International Film Fes... \n", "4 Golden St. George 30th Moscow International Film Festival 2008 Russian Guild... \n", "5 Special Jury Prize Kazan International Festival of Muslim Cinema 2012 Best F... \n", "6 Best Film Award Rabat International Film Festival 2014 Ecumenical Jury Prize... \n", "7 Crystal Simorgh National Competition - Best Original Score Fajr Internationa... 
\n", "8 Golden Goblet Award for Best Feature Film Shanghai International Film Festiv... },\n", " { 'answer': 'Daughter',\n", " 'context': Year Name Transliteration \\\n", "0 2000 The Child And The Soldier Koodak va Sarbaz \n", "1 2001 Under The Moonlight Zir-e Noor-e Maah \n", "2 2002 Here Is A Shining Light Inja Cheraghi Roshan Ast \n", "3 2005 So Close , So Far Kheili Dour , Kheili Nazdik \n", "4 2008 As Simple as That Be Hamin Sadegi \n", "5 2011 A Cube of Sugar Yek Habe Ghand \n", "6 2014 Today Emrooz \n", "7 2016 Daughter Dokhtar \n", "8 2019 Castle of Dreams Ghasr-e Shirin \n", "\n", " Award \n", "0 Nominated Grand Prix des Amériques Montréal World Film Festival 2000 Silver ... \n", "1 International Jury Award São Paulo International Film Festival 2001 Best Dir... \n", "2 Best Screenplay Asia-Pacific Film Festival 2002 Crystal Simorgh Best Directo... \n", "3 Crystal Simorgh National Competition - Best Film Fajr International Film Fes... \n", "4 Golden St. George 30th Moscow International Film Festival 2008 Russian Guild... \n", "5 Special Jury Prize Kazan International Festival of Muslim Cinema 2012 Best F... \n", "6 Best Film Award Rabat International Film Festival 2014 Ecumenical Jury Prize... \n", "7 Crystal Simorgh National Competition - Best Original Score Fajr Internationa... \n", "8 Golden Goblet Award for Best Feature Film Shanghai International Film Festiv... 
},\n", " { 'answer': 'Icelandic',\n", " 'context': Submitting country Film title used in nomination Language ( s ) \\\n", "0 Argentina The Official Story Spanish \n", "1 Austria Malambo German \n", "2 Belgium Dust French \n", "3 Canada Jacques and November French \n", "4 Czechoslovakia Scalpel , Please Czech \n", "5 Denmark Twist and Shout Danish \n", "6 France Three Men and a Cradle French \n", "7 West Germany Angry Harvest German \n", "8 Hungary Colonel Redl German \n", "9 Iceland Deep Winter Icelandic \n", "10 India Saagar Hindi \n", "11 Israel When Night Falls Hebrew \n", "12 Italy Macaroni Italian \n", "13 Japan Gray Sunset Japanese \n", "14 South Korea Eoudong Korean \n", "15 Mexico Frida Still Life Spanish \n", "16 Netherlands The Dream Dutch , West Frisian \n", "17 Norway Wives - Ten Years After Norwegian \n", "18 Peru The City and the Dogs Spanish \n", "19 Philippines This Is My Country Tagalog \n", "\n", " Original title Director ( s ) \\\n", "0 La Historia oficial Luis Puenzo \n", "1 Malambo Milan Dor \n", "2 Dust Marion Hänsel \n", "3 Jacques et novembre Jean Beaudry and François Bouvier \n", "4 Skalpel , prosím Jirí Svoboda \n", "5 Tro , håb og kærlighed Bille August \n", "6 Trois hommes et un couffin Coline Serreau \n", "7 Bittere Ernte Agnieszka Holland \n", "8 Oberst Redl István Szabó \n", "9 Skammdegi Þráinn Bertelsson \n", "10 सागर Ramesh Sippy \n", "11 עד סוף הלילה Eitan Green \n", "12 Maccheroni Ettore Scola \n", "13 花いちもんめ Shunya Ito \n", "14 어우동 Lee Jang-ho \n", "15 Frida , naturaleza viva Paul Leduc \n", "16 De Dream Pieter Verhoeff \n", "17 Hustruer - ti år etter Anja Breien \n", "18 La ciudad y los perros Francisco José Lombardi \n", "19 Bayan ko : Kapit sa patalim Lino Brocka \n", "\n", " Result \n", "0 Won Academy Award \n", "1 Not Nominated \n", "2 Not Nominated \n", "3 Not Nominated \n", "4 Not Nominated \n", "5 Not Nominated \n", "6 Nominated \n", "7 Nominated \n", "8 Nominated \n", "9 Not Nominated \n", "10 Not Nominated \n", "11 Not 
Nominated \n", "12 Not Nominated \n", "13 Not Nominated \n", "14 Not Nominated \n", "15 Not Nominated \n", "16 Not Nominated \n", "17 Not Nominated \n", "18 Not Nominated \n", "19 Not Nominated },\n", " { 'answer': 'Icelandic',\n", " 'context': Submitting country Film title used in nomination Language ( s ) \\\n", "0 Argentina The Official Story Spanish \n", "1 Austria Malambo German \n", "2 Belgium Dust French \n", "3 Canada Jacques and November French \n", "4 Czechoslovakia Scalpel , Please Czech \n", "5 Denmark Twist and Shout Danish \n", "6 France Three Men and a Cradle French \n", "7 West Germany Angry Harvest German \n", "8 Hungary Colonel Redl German \n", "9 Iceland Deep Winter Icelandic \n", "10 India Saagar Hindi \n", "11 Israel When Night Falls Hebrew \n", "12 Italy Macaroni Italian \n", "13 Japan Gray Sunset Japanese \n", "14 South Korea Eoudong Korean \n", "15 Mexico Frida Still Life Spanish \n", "16 Netherlands The Dream Dutch , West Frisian \n", "17 Norway Wives - Ten Years After Norwegian \n", "18 Peru The City and the Dogs Spanish \n", "19 Philippines This Is My Country Tagalog \n", "\n", " Original title Director ( s ) \\\n", "0 La Historia oficial Luis Puenzo \n", "1 Malambo Milan Dor \n", "2 Dust Marion Hänsel \n", "3 Jacques et novembre Jean Beaudry and François Bouvier \n", "4 Skalpel , prosím Jirí Svoboda \n", "5 Tro , håb og kærlighed Bille August \n", "6 Trois hommes et un couffin Coline Serreau \n", "7 Bittere Ernte Agnieszka Holland \n", "8 Oberst Redl István Szabó \n", "9 Skammdegi Þráinn Bertelsson \n", "10 सागर Ramesh Sippy \n", "11 עד סוף הלילה Eitan Green \n", "12 Maccheroni Ettore Scola \n", "13 花いちもんめ Shunya Ito \n", "14 어우동 Lee Jang-ho \n", "15 Frida , naturaleza viva Paul Leduc \n", "16 De Dream Pieter Verhoeff \n", "17 Hustruer - ti år etter Anja Breien \n", "18 La ciudad y los perros Francisco José Lombardi \n", "19 Bayan ko : Kapit sa patalim Lino Brocka \n", "\n", " Result \n", "0 Won Academy Award \n", "1 Not Nominated \n", 
"2 Not Nominated \n", "3 Not Nominated \n", "4 Not Nominated \n", "5 Not Nominated \n", "6 Nominated \n", "7 Nominated \n", "8 Nominated \n", "9 Not Nominated \n", "10 Not Nominated \n", "11 Not Nominated \n", "12 Not Nominated \n", "13 Not Nominated \n", "14 Not Nominated \n", "15 Not Nominated \n", "16 Not Nominated \n", "17 Not Nominated \n", "18 Not Nominated \n", "19 Not Nominated },\n", " { 'answer': 'Japanese',\n", " 'context': Submitting country Film title used in nomination Language ( s ) \\\n", "0 Argentina The Official Story Spanish \n", "1 Austria Malambo German \n", "2 Belgium Dust French \n", "3 Canada Jacques and November French \n", "4 Czechoslovakia Scalpel , Please Czech \n", "5 Denmark Twist and Shout Danish \n", "6 France Three Men and a Cradle French \n", "7 West Germany Angry Harvest German \n", "8 Hungary Colonel Redl German \n", "9 Iceland Deep Winter Icelandic \n", "10 India Saagar Hindi \n", "11 Israel When Night Falls Hebrew \n", "12 Italy Macaroni Italian \n", "13 Japan Gray Sunset Japanese \n", "14 South Korea Eoudong Korean \n", "15 Mexico Frida Still Life Spanish \n", "16 Netherlands The Dream Dutch , West Frisian \n", "17 Norway Wives - Ten Years After Norwegian \n", "18 Peru The City and the Dogs Spanish \n", "19 Philippines This Is My Country Tagalog \n", "\n", " Original title Director ( s ) \\\n", "0 La Historia oficial Luis Puenzo \n", "1 Malambo Milan Dor \n", "2 Dust Marion Hänsel \n", "3 Jacques et novembre Jean Beaudry and François Bouvier \n", "4 Skalpel , prosím Jirí Svoboda \n", "5 Tro , håb og kærlighed Bille August \n", "6 Trois hommes et un couffin Coline Serreau \n", "7 Bittere Ernte Agnieszka Holland \n", "8 Oberst Redl István Szabó \n", "9 Skammdegi Þráinn Bertelsson \n", "10 सागर Ramesh Sippy \n", "11 עד סוף הלילה Eitan Green \n", "12 Maccheroni Ettore Scola \n", "13 花いちもんめ Shunya Ito \n", "14 어우동 Lee Jang-ho \n", "15 Frida , naturaleza viva Paul Leduc \n", "16 De Dream Pieter Verhoeff \n", "17 Hustruer - ti år 
etter Anja Breien \n", "18 La ciudad y los perros Francisco José Lombardi \n", "19 Bayan ko : Kapit sa patalim Lino Brocka \n", "\n", " Result \n", "0 Won Academy Award \n", "1 Not Nominated \n", "2 Not Nominated \n", "3 Not Nominated \n", "4 Not Nominated \n", "5 Not Nominated \n", "6 Nominated \n", "7 Nominated \n", "8 Nominated \n", "9 Not Nominated \n", "10 Not Nominated \n", "11 Not Nominated \n", "12 Not Nominated \n", "13 Not Nominated \n", "14 Not Nominated \n", "15 Not Nominated \n", "16 Not Nominated \n", "17 Not Nominated \n", "18 Not Nominated \n", "19 Not Nominated },\n", " { 'answer': 'Japanese',\n", " 'context': Submitting country Film title used in nomination Language ( s ) \\\n", "0 Argentina The Official Story Spanish \n", "1 Austria Malambo German \n", "2 Belgium Dust French \n", "3 Canada Jacques and November French \n", "4 Czechoslovakia Scalpel , Please Czech \n", "5 Denmark Twist and Shout Danish \n", "6 France Three Men and a Cradle French \n", "7 West Germany Angry Harvest German \n", "8 Hungary Colonel Redl German \n", "9 Iceland Deep Winter Icelandic \n", "10 India Saagar Hindi \n", "11 Israel When Night Falls Hebrew \n", "12 Italy Macaroni Italian \n", "13 Japan Gray Sunset Japanese \n", "14 South Korea Eoudong Korean \n", "15 Mexico Frida Still Life Spanish \n", "16 Netherlands The Dream Dutch , West Frisian \n", "17 Norway Wives - Ten Years After Norwegian \n", "18 Peru The City and the Dogs Spanish \n", "19 Philippines This Is My Country Tagalog \n", "\n", " Original title Director ( s ) \\\n", "0 La Historia oficial Luis Puenzo \n", "1 Malambo Milan Dor \n", "2 Dust Marion Hänsel \n", "3 Jacques et novembre Jean Beaudry and François Bouvier \n", "4 Skalpel , prosím Jirí Svoboda \n", "5 Tro , håb og kærlighed Bille August \n", "6 Trois hommes et un couffin Coline Serreau \n", "7 Bittere Ernte Agnieszka Holland \n", "8 Oberst Redl István Szabó \n", "9 Skammdegi Þráinn Bertelsson \n", "10 सागर Ramesh Sippy \n", "11 עד סוף הלילה Eitan 
Green \n", "12 Maccheroni Ettore Scola \n", "13 花いちもんめ Shunya Ito \n", "14 어우동 Lee Jang-ho \n", "15 Frida , naturaleza viva Paul Leduc \n", "16 De Dream Pieter Verhoeff \n", "17 Hustruer - ti år etter Anja Breien \n", "18 La ciudad y los perros Francisco José Lombardi \n", "19 Bayan ko : Kapit sa patalim Lino Brocka \n", "\n", " Result \n", "0 Won Academy Award \n", "1 Not Nominated \n", "2 Not Nominated \n", "3 Not Nominated \n", "4 Not Nominated \n", "5 Not Nominated \n", "6 Nominated \n", "7 Nominated \n", "8 Nominated \n", "9 Not Nominated \n", "10 Not Nominated \n", "11 Not Nominated \n", "12 Not Nominated \n", "13 Not Nominated \n", "14 Not Nominated \n", "15 Not Nominated \n", "16 Not Nominated \n", "17 Not Nominated \n", "18 Not Nominated \n", "19 Not Nominated },\n", " { 'answer': '2012',\n", " 'context': Title Year \\\n", "0 The American Scream 2012 \n", "1 Dead Souls 2012 \n", "2 Ghoul 2012 \n", "3 Beneath 2013 \n", "4 Chilling Visions : 5 Senses of Fear 2013 \n", "5 The Monkey 's Paw 2013 \n", "6 Animal 2014 \n", "7 Deep in the Darkness 2014 \n", "8 The Boy 2015 \n", "9 SiREN 2016 \n", "10 Camera Obscura 2017 \n", "11 Dementia 13 2017 \n", "\n", " Production Co \n", "0 Chiller Films Brainstorm Media \n", "1 Chiller Films Synthetic Productions \n", "2 Chiller Films Modernciné \n", "3 Glass Eye Pix \n", "4 Chiller Films Synthetic Cinema International \n", "5 TMP Films \n", "6 Flower Films Synthetic Cinema International \n", "7 Chiller Films Synthetic Cinema International \n", "8 SpectreVision \n", "9 Studio71 \n", "10 Chiller Films Hood River Entertainment Paper Street Pictures \n", "11 Pipeline Entertainment Haloran LLC }]\n" ] } ], "source": [ "# We can see both text passages and tables as contexts of the predicted answers.\n", "print_answers(predictions, details=\"minimum\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation\n", "To evaluate our pipeline, we can use haystack's evaluation feature. 
We just need to convert our labels into `MultiLabel` objects and the `eval` method will do the rest." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "from haystack import Label, MultiLabel, Answer\n", "\n", "\n", "def read_labels(filename, tables):\n", "    # `tables` can hold table or text documents; labels are keyed by document id\n", "    processed_labels = []\n", "    with open(filename) as labels_file:\n", "        labels = json.load(labels_file)\n", "    for table in tables:\n", "        # Skip documents that have no gold label\n", "        if table.id not in labels:\n", "            continue\n", "        label = labels[table.id]\n", "        label = Label(\n", "            query=label[\"query\"],\n", "            document=table,\n", "            is_correct_answer=True,\n", "            is_correct_document=True,\n", "            answer=Answer(answer=label[\"answer\"]),\n", "            origin=\"gold-label\",\n", "        )\n", "        processed_labels.append(MultiLabel(labels=[label]))\n", "    return processed_labels\n", "\n", "\n", "table_labels = read_labels(f\"{doc_dir}/labels.json\", tables)\n", "passage_labels = read_labels(f\"{doc_dir}/labels.json\", passages)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "eval_results = text_table_qa_pipeline.eval(table_labels + passage_labels, params={\"top_k\": 10})" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'JoinAnswers': {'f1': 0.7135714285714286, 'exact_match': 0.6}}\n" ] } ], "source": [ "# Calculating and printing the evaluation metrics\n", "print(eval_results.calculate_metrics())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding tables from PDFs\n", "It can sometimes be hard to provide your data in the form of a pandas DataFrame. 
For this case, we provide the `ParsrConverter` wrapper, which can convert, for example, a PDF file into documents that you can index.\n", "\n", "**Attention: `parsr` needs a Docker environment to run, but Colab doesn't support Docker.**\n", "**If you have a local Docker environment, you can uncomment and run the following cells.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# import time\n", "\n", "# Start the Parsr container and give it some time to come up\n", "# !docker run -d -p 3001:3001 axarev/parsr\n", "# time.sleep(30)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Download an example PDF that contains a table\n", "# !wget https://www.w3.org/WAI/WCAG21/working-examples/pdf-table/table.pdf" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "# from haystack.nodes import ParsrConverter\n", "\n", "# converter = ParsrConverter()\n", "\n", "# docs = converter.convert(\"table.pdf\")\n", "\n", "# Keep only the documents that were extracted as tables\n", "# tables = [doc for doc in docs if doc.content_type == \"table\"]" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{'content': [['Disability\\nCategory', 'Participants', 'Ballots\\nCompleted', 'Ballots\\nIncomplete/\\nTerminated', 'Results', 'Results'], ['Disability\\nCategory', 'Participants', 'Ballots\\nCompleted', 'Ballots\\nIncomplete/\\nTerminated', 'Accuracy', 'Time to\\ncomplete'], ['Blind', '5', '1', '4', '34.5%, n=1', '1199 sec, n=1'], ['Low Vision', '5', '2', '3', '98.3% n=2\\n(97.7%, n=3)', '1716 sec, n=3\\n(1934 sec, n=2)'], ['Dexterity', '5', '4', '1', '98.3%, n=4', '1672.1 sec, n=4'], ['Mobility', '3', '3', '0', '95.4%, n=3', '1416 sec, n=3']], 'content_type': 'table', 'meta': {'preceding_context': 'Example table\\nThis is an example of a data table.', 'following_context': ''}}]\n" ] } ], "source": [ "# print(tables)" ] }, { "cell_type": "markdown", "metadata": { "id": "RyeK3s28_X1C" }, "source": [ "## About us\n", "\n", 
"This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany\n", "\n", "We bring NLP to the industry via open source! \n", "Our focus: Industry specific language models & large scale QA systems. \n", " \n", "Some of our other work: \n", "- [German BERT](https://deepset.ai/german-bert)\n", "- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)\n", "- [FARM](https://github.com/deepset-ai/FARM)\n", "\n", "Get in touch:\n", "[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)\n", "\n", "By the way: [we're hiring!](https://www.deepset.ai/jobs)\n" ] } ], "metadata": { "accelerator": "GPU", "colab": { "name": "Tutorial15_TableQA.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 0 }