haystack/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb

3029 lines
112 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "bEH-CRbeA6NU"
},
"source": [
Create documentation website (#272) * Skeleton of doc website * Flesh out documentation pages * Split concepts into their own rst files * add tutorial rsts * Consistent level 1 markdown headers in tutorials * Change theme to readthedocs * Turn bullet points into prose * Populate sections * Add more text * Add more sphinx files * Add more retriever documentation * combined all documenations in one structure * rename of src to _src as it was ignored by git * Incorporate MP2's changes * add benchmark bar charts * Adapt docstrings in Readers * Improvements to intro, creation of glossary * Adapt docstrings in Retrievers * Adapt docstrings in Finder * Adapt Docstrings of Finder * Updates to text * Edit text * update doc strings * proof read tutorials * Edit text * Edit text * Add stacked chart * populate graph with data * Switch Documentation to markdown (#386) * add way to generate markdown files to sphinx * changed from rst to markdown and extended sphinx for it * fix spelling * Clean titles * delete file * change spelling * add sections to document store usage * add basic rest api docs * fix readme in setup.py * Update Tutorials * Change section names * add windows note to pip install * update intro * new renderer for markdown files * Fix typos * delete dpr_utils.py * fix windows note in get started * Fix docstrings * deleted rest api docs in api * fixed typo * Fix docstring * revert readme to rst * Fix readme * Update setup.py Co-authored-by: deepset <deepset@Crenolape.localdomain> Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com> Co-authored-by: Bogdan Kostić <bogdankostic@web.de> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-09-18 12:57:32 +02:00
"# Better Retrieval via \"Dense Passage Retrieval\"\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deepset-ai/haystack/blob/master/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb)\n",
"\n",
"### Importance of Retrievers\n",
"\n",
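"The Retriever has a large impact on the performance of the overall search pipeline: it selects the candidate passages, so any relevant passage it misses is not available to later stages.\n",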
"\n",
"\n",
"### Different types of Retrievers\n",
"#### Sparse\n",
"A family of algorithms based on counting word occurrences (bag-of-words), resulting in very sparse vectors whose length equals the vocabulary size.\n",
"\n",
"**Examples**: BM25, TF-IDF\n",
"\n",
"**Pros**: Simple, fast, easy to explain\n",
"\n",
"**Cons**: Relies on exact keyword matches between query and text\n",
" \n",
"\n",
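"The bag-of-words idea can be sketched in a few lines of plain Python. This is a toy term-count scorer (not BM25 or TF-IDF, and not Haystack's implementation); the function name `bag_of_words` and the example documents are made up for illustration.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"def bag_of_words(text, vocab):\n",
"    # One count per vocabulary word; most entries are zero (sparse).\n",
"    counts = Counter(text.lower().split())\n",
"    return [counts[word] for word in vocab]\n",
"\n",
"docs = ['the car is red', 'the automobile is fast']\n",
"vocab = sorted({w for d in docs for w in d.split()})\n",
"query_vec = bag_of_words('car', vocab)\n",
"\n",
"# Score each document by the dot product of the count vectors.\n",
"scores = [\n",
"    sum(q * d for q, d in zip(query_vec, bag_of_words(doc, vocab)))\n",
"    for doc in docs\n",
"]\n",
"print(scores)  # [1, 0]\n",
"```\n",
"\n",
"The query `car` matches only the first document; the second covers a related topic but uses the word `automobile`, so it scores zero.\n",
"\n",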
"#### Dense\n",
"These retrievers use neural network models to create \"dense\" embedding vectors. Within this family there are two different approaches: \n",
"\n",
"a) Single encoder: Use a **single model** to embed both query and passage.  \n",
"b) Dual encoder: Use **two models**, one to embed the query and one to embed the passage.\n",
"\n",
"Recent work suggests that dual encoders perform better, likely because they can handle the differences between queries and passages (length, style, syntax, ...). \n",
"\n",
"**Examples**: REALM, DPR, Sentence-Transformers\n",
"\n",
"**Pros**: Captures semantic similarity instead of \"word matches\" (e.g. synonyms, related topics ...)\n",
"\n",
"**Cons**: Computationally heavier; requires initial training of the model\n",
"\n",
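"The dual-encoder idea can be sketched as follows. The vectors are hypothetical, hand-written stand-ins for the outputs of a query encoder and a passage encoder (no real model is involved), and the helper names `dot` and `retrieve` are made up for illustration. Ranking by dot product mirrors the similarity used in the DPR paper.\n",
"\n",
"```python\n",
"# Hypothetical embeddings: in a real system these come from two trained encoders.\n",
"query_embeddings = {'who wrote hamlet': [0.9, 0.1]}\n",
"passage_embeddings = {\n",
"    'Shakespeare is the author of Hamlet.': [0.8, 0.2],\n",
"    'Paris is the capital of France.': [0.1, 0.9],\n",
"}\n",
"\n",
"def dot(a, b):\n",
"    # Inner product of two equally long vectors.\n",
"    return sum(x * y for x, y in zip(a, b))\n",
"\n",
"def retrieve(query, top_k=1):\n",
"    # Rank all passages by similarity to the query embedding.\n",
"    q = query_embeddings[query]\n",
"    ranked = sorted(\n",
"        passage_embeddings,\n",
"        key=lambda p: dot(q, passage_embeddings[p]),\n",
"        reverse=True,\n",
"    )\n",
"    return ranked[:top_k]\n",
"\n",
"print(retrieve('who wrote hamlet'))\n",
"```\n",
"\n",
"Note that the top passage shares no exact keyword form with the query beyond the title; the match comes from the vectors being close, which is what a trained encoder pair is meant to provide.\n",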
"\n",
"### \"Dense Passage Retrieval\"\n",
"\n",
"In this tutorial, we want to highlight one \"Dense Dual-Encoder\" called Dense Passage Retriever. \n",
"It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906). \n",
"\n",
"Original Abstract: \n",
"\n",
"_\"Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.\"_\n",
"\n",
"Paper: https://arxiv.org/abs/2004.04906 \n",
"Original Code: https://fburl.com/qa-dpr \n",
"\n",
"\n",
"*Use this* [link](https://colab.research.google.com/github/deepset-ai/haystack/blob/master/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb) *to open the notebook in Google Colab.*\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "3K27Y5FbA6NV"
},
"source": [
"### Prepare environment\n",
"\n",
"#### Colab: Enable the GPU runtime\n",
"Make sure you enable the GPU runtime so that this tutorial runs at a reasonable speed. \n",
"**Runtime -> Change Runtime type -> Hardware accelerator -> GPU**\n",
"\n",
2020-12-08 10:28:31 +01:00
"<img src=\"https://raw.githubusercontent.com/deepset-ai/haystack/master/docs/_src/img/colab_gpu_runtime.jpg\">"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 357
},
"colab_type": "code",
"id": "JlZgP8q1A6NW",
"outputId": "c893ac99-b7a0-4d49-a8eb-1a9951d364d9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mon Aug 24 11:56:45 2020 \r\n",
"+-----------------------------------------------------------------------------+\r\n",
"| NVIDIA-SMI 440.100 Driver Version: 440.100 CUDA Version: 10.2 |\r\n",
"|-------------------------------+----------------------+----------------------+\r\n",
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\r\n",
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\r\n",
"|===============================+======================+======================|\r\n",
"| 0 Tesla V100-SXM2... Off | 00000000:00:1E.0 Off | 0 |\r\n",
"| N/A 41C P0 39W / 300W | 0MiB / 16160MiB | 0% Default |\r\n",
"+-------------------------------+----------------------+----------------------+\r\n",
" \r\n",
"+-----------------------------------------------------------------------------+\r\n",
"| Processes: GPU Memory |\r\n",
"| GPU PID Type Process name Usage |\r\n",
"|=============================================================================|\r\n",
"| No running processes found |\r\n",
"+-----------------------------------------------------------------------------+\r\n"
]
}
],
"source": [
"# Make sure you have a GPU running\n",
"!nvidia-smi"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"colab_type": "code",
"id": "NM36kbRFA6Nc",
"outputId": "af1a9d85-9557-4d68-ea87-a01f00c584f9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting git+https://github.com/deepset-ai/haystack.git\n",
" Cloning https://github.com/deepset-ai/haystack.git to /tmp/pip-req-build-fqgbr4x7\n",
" Running command git clone -q https://github.com/deepset-ai/haystack.git /tmp/pip-req-build-fqgbr4x7\n",
"Requirement already satisfied (use --upgrade to upgrade): farm-haystack==0.3.0 from git+https://github.com/deepset-ai/haystack.git in /home/ubuntu/deepset/haystack\n",
"Requirement already satisfied: farm==0.4.6 in /home/ubuntu/deepset/FARM (from farm-haystack==0.3.0) (0.4.6)\n",
"Requirement already satisfied: fastapi in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (0.59.0)\n",
"Requirement already satisfied: uvicorn in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (0.11.6)\n",
"Requirement already satisfied: gunicorn in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (20.0.4)\n",
"Requirement already satisfied: pandas in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (1.0.5)\n",
"Requirement already satisfied: psycopg2-binary in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (2.8.5)\n",
"Requirement already satisfied: sklearn in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (0.0)\n",
"Requirement already satisfied: elasticsearch in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (7.8.0)\n",
"Requirement already satisfied: elastic-apm in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (5.8.1)\n",
"Requirement already satisfied: tox in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (3.17.1)\n",
"Requirement already satisfied: coverage in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (5.2)\n",
"Requirement already satisfied: langdetect in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (1.0.8)\n",
"Requirement already satisfied: wget in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (3.2)\n",
"Requirement already satisfied: python-multipart in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (0.0.5)\n",
"Requirement already satisfied: python-docx in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (0.8.10)\n",
"Requirement already satisfied: sqlalchemy_utils in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (0.36.8)\n",
"Requirement already satisfied: faiss-cpu in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (1.6.3)\n",
"Requirement already satisfied: tika in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm-haystack==0.3.0) (1.24)\n",
"Requirement already satisfied: setuptools in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (49.1.0)\n",
"Requirement already satisfied: wheel in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (0.34.2)\n",
"Requirement already satisfied: torch==1.5.* in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (1.5.1)\n",
"Requirement already satisfied: tqdm in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (4.47.0)\n",
"Requirement already satisfied: boto3 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (1.14.20)\n",
"Requirement already satisfied: requests in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (2.24.0)\n",
"Requirement already satisfied: scipy>=1.3.2 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (1.5.1)\n",
"Requirement already satisfied: seqeval in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (0.0.12)\n",
"Requirement already satisfied: mlflow==1.0.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (1.0.0)\n",
"Requirement already satisfied: transformers==3.0.2 in /home/ubuntu/transformers_3.0.2/transformers/src (from farm==0.4.6->farm-haystack==0.3.0) (3.0.2)\n",
"Requirement already satisfied: dotmap==1.3.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (1.3.0)\n",
"Requirement already satisfied: Werkzeug==0.16.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (0.16.1)\n",
"Requirement already satisfied: flask in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (1.1.2)\n",
"Requirement already satisfied: flask-restplus in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (0.13.0)\n",
"Requirement already satisfied: flask-cors in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (3.0.8)\n",
"Requirement already satisfied: dill in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (0.3.2)\n",
"Requirement already satisfied: psutil in /home/ubuntu/py3_6/lib/python3.6/site-packages (from farm==0.4.6->farm-haystack==0.3.0) (5.7.0)\n",
"Requirement already satisfied: starlette==0.13.4 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from fastapi->farm-haystack==0.3.0) (0.13.4)\n",
"Requirement already satisfied: pydantic<2.0.0,>=0.32.2 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from fastapi->farm-haystack==0.3.0) (1.6.1)\n",
"Requirement already satisfied: click==7.* in /home/ubuntu/py3_6/lib/python3.6/site-packages (from uvicorn->farm-haystack==0.3.0) (7.1.2)\n",
"Requirement already satisfied: websockets==8.* in /home/ubuntu/py3_6/lib/python3.6/site-packages (from uvicorn->farm-haystack==0.3.0) (8.1)\n",
"Requirement already satisfied: httptools==0.1.*; sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\" in /home/ubuntu/py3_6/lib/python3.6/site-packages (from uvicorn->farm-haystack==0.3.0) (0.1.1)\n",
"Requirement already satisfied: h11<0.10,>=0.8 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from uvicorn->farm-haystack==0.3.0) (0.9.0)\n",
"Requirement already satisfied: uvloop>=0.14.0; sys_platform != \"win32\" and sys_platform != \"cygwin\" and platform_python_implementation != \"PyPy\" in /home/ubuntu/py3_6/lib/python3.6/site-packages (from uvicorn->farm-haystack==0.3.0) (0.14.0)\n",
"Requirement already satisfied: numpy>=1.13.3 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from pandas->farm-haystack==0.3.0) (1.19.0)\n",
"Requirement already satisfied: python-dateutil>=2.6.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from pandas->farm-haystack==0.3.0) (2.8.1)\n",
"Requirement already satisfied: pytz>=2017.2 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from pandas->farm-haystack==0.3.0) (2020.1)\n",
"Requirement already satisfied: scikit-learn in /home/ubuntu/py3_6/lib/python3.6/site-packages (from sklearn->farm-haystack==0.3.0) (0.23.1)\n",
"Requirement already satisfied: certifi in /home/ubuntu/py3_6/lib/python3.6/site-packages (from elasticsearch->farm-haystack==0.3.0) (2020.6.20)\n",
"Requirement already satisfied: urllib3>=1.21.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from elasticsearch->farm-haystack==0.3.0) (1.25.9)\n",
"Requirement already satisfied: virtualenv!=20.0.0,!=20.0.1,!=20.0.2,!=20.0.3,!=20.0.4,!=20.0.5,!=20.0.6,!=20.0.7,>=16.0.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (20.0.27)\n",
"Requirement already satisfied: filelock>=3.0.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (3.0.12)\n",
"Requirement already satisfied: toml>=0.9.4 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (0.10.1)\n",
"Requirement already satisfied: packaging>=14 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (20.4)\n",
"Requirement already satisfied: pluggy>=0.12.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (0.13.1)\n",
"Requirement already satisfied: importlib-metadata<2,>=0.12; python_version < \"3.8\" in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (1.7.0)\n",
"Requirement already satisfied: six>=1.14.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (1.15.0)\n",
"Requirement already satisfied: py>=1.4.17 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from tox->farm-haystack==0.3.0) (1.9.0)\n",
"Requirement already satisfied: lxml>=2.3.2 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from python-docx->farm-haystack==0.3.0) (4.5.2)\n",
"Requirement already satisfied: SQLAlchemy>=1.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from sqlalchemy_utils->farm-haystack==0.3.0) (1.3.18)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: future in /home/ubuntu/py3_6/lib/python3.6/site-packages (from torch==1.5.*->farm==0.4.6->farm-haystack==0.3.0) (0.18.2)\n",
"Requirement already satisfied: botocore<1.18.0,>=1.17.20 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from boto3->farm==0.4.6->farm-haystack==0.3.0) (1.17.20)\n",
"Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from boto3->farm==0.4.6->farm-haystack==0.3.0) (0.10.0)\n",
"Requirement already satisfied: s3transfer<0.4.0,>=0.3.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from boto3->farm==0.4.6->farm-haystack==0.3.0) (0.3.3)\n",
"Requirement already satisfied: chardet<4,>=3.0.2 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from requests->farm==0.4.6->farm-haystack==0.3.0) (3.0.4)\n",
"Requirement already satisfied: idna<3,>=2.5 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from requests->farm==0.4.6->farm-haystack==0.3.0) (2.10)\n",
"Requirement already satisfied: Keras>=2.2.4 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from seqeval->farm==0.4.6->farm-haystack==0.3.0) (2.4.3)\n",
"Requirement already satisfied: simplejson in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (3.17.0)\n",
"Requirement already satisfied: docker>=3.6.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (4.2.2)\n",
"Requirement already satisfied: alembic in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (1.4.2)\n",
"Requirement already satisfied: entrypoints in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (0.3)\n",
"Requirement already satisfied: gitpython>=2.1.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (3.1.7)\n",
"Requirement already satisfied: pyyaml in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (5.3.1)\n",
"Requirement already satisfied: databricks-cli>=0.8.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (0.11.0)\n",
"Requirement already satisfied: querystring-parser in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (1.2.4)\n",
"Requirement already satisfied: cloudpickle in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (1.5.0)\n",
"Requirement already satisfied: protobuf>=3.6.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (3.12.2)\n",
"Requirement already satisfied: sqlparse in /home/ubuntu/py3_6/lib/python3.6/site-packages (from mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (0.3.1)\n",
"Requirement already satisfied: tokenizers==0.8.1.rc2 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from transformers==3.0.2->farm==0.4.6->farm-haystack==0.3.0) (0.8.1rc2)\n",
"Requirement already satisfied: regex!=2019.12.17 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from transformers==3.0.2->farm==0.4.6->farm-haystack==0.3.0) (2020.6.8)\n",
"Requirement already satisfied: sentencepiece!=0.1.92 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from transformers==3.0.2->farm==0.4.6->farm-haystack==0.3.0) (0.1.91)\n",
"Requirement already satisfied: sacremoses in /home/ubuntu/py3_6/lib/python3.6/site-packages (from transformers==3.0.2->farm==0.4.6->farm-haystack==0.3.0) (0.0.43)\n",
"Requirement already satisfied: dataclasses in /home/ubuntu/py3_6/lib/python3.6/site-packages (from transformers==3.0.2->farm==0.4.6->farm-haystack==0.3.0) (0.7)\n",
"Requirement already satisfied: itsdangerous>=0.24 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from flask->farm==0.4.6->farm-haystack==0.3.0) (1.1.0)\n",
"Requirement already satisfied: Jinja2>=2.10.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from flask->farm==0.4.6->farm-haystack==0.3.0) (2.11.2)\n",
"Requirement already satisfied: jsonschema in /home/ubuntu/py3_6/lib/python3.6/site-packages (from flask-restplus->farm==0.4.6->farm-haystack==0.3.0) (3.2.0)\n",
"Requirement already satisfied: aniso8601>=0.82 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from flask-restplus->farm==0.4.6->farm-haystack==0.3.0) (8.0.0)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from scikit-learn->sklearn->farm-haystack==0.3.0) (2.1.0)\n",
"Requirement already satisfied: joblib>=0.11 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from scikit-learn->sklearn->farm-haystack==0.3.0) (0.16.0)\n",
"Requirement already satisfied: distlib<1,>=0.3.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from virtualenv!=20.0.0,!=20.0.1,!=20.0.2,!=20.0.3,!=20.0.4,!=20.0.5,!=20.0.6,!=20.0.7,>=16.0.0->tox->farm-haystack==0.3.0) (0.3.1)\n",
"Requirement already satisfied: appdirs<2,>=1.4.3 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from virtualenv!=20.0.0,!=20.0.1,!=20.0.2,!=20.0.3,!=20.0.4,!=20.0.5,!=20.0.6,!=20.0.7,>=16.0.0->tox->farm-haystack==0.3.0) (1.4.4)\n",
"Requirement already satisfied: importlib-resources>=1.0; python_version < \"3.7\" in /home/ubuntu/py3_6/lib/python3.6/site-packages (from virtualenv!=20.0.0,!=20.0.1,!=20.0.2,!=20.0.3,!=20.0.4,!=20.0.5,!=20.0.6,!=20.0.7,>=16.0.0->tox->farm-haystack==0.3.0) (3.0.0)\n",
"Requirement already satisfied: pyparsing>=2.0.2 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from packaging>=14->tox->farm-haystack==0.3.0) (2.4.7)\n",
"Requirement already satisfied: zipp>=0.5 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from importlib-metadata<2,>=0.12; python_version < \"3.8\"->tox->farm-haystack==0.3.0) (3.1.0)\n",
"Requirement already satisfied: docutils<0.16,>=0.10 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from botocore<1.18.0,>=1.17.20->boto3->farm==0.4.6->farm-haystack==0.3.0) (0.15.2)\n",
"Requirement already satisfied: h5py in /home/ubuntu/py3_6/lib/python3.6/site-packages (from Keras>=2.2.4->seqeval->farm==0.4.6->farm-haystack==0.3.0) (2.10.0)\n",
"Requirement already satisfied: websocket-client>=0.32.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from docker>=3.6.0->mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (0.57.0)\n",
"Requirement already satisfied: Mako in /home/ubuntu/py3_6/lib/python3.6/site-packages (from alembic->mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (1.1.3)\n",
"Requirement already satisfied: python-editor>=0.3 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from alembic->mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (1.0.4)\n",
"Requirement already satisfied: gitdb<5,>=4.0.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from gitpython>=2.1.0->mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (4.0.5)\n",
"Requirement already satisfied: tabulate>=0.7.7 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from databricks-cli>=0.8.0->mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (0.8.7)\n",
"Requirement already satisfied: MarkupSafe>=0.23 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from Jinja2>=2.10.1->flask->farm==0.4.6->farm-haystack==0.3.0) (1.1.1)\n",
"Requirement already satisfied: attrs>=17.4.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from jsonschema->flask-restplus->farm==0.4.6->farm-haystack==0.3.0) (19.3.0)\n",
"Requirement already satisfied: pyrsistent>=0.14.0 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from jsonschema->flask-restplus->farm==0.4.6->farm-haystack==0.3.0) (0.16.0)\n",
"Requirement already satisfied: smmap<4,>=3.0.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from gitdb<5,>=4.0.1->gitpython>=2.1.0->mlflow==1.0.0->farm==0.4.6->farm-haystack==0.3.0) (3.0.4)\n",
"Building wheels for collected packages: farm-haystack\n",
" Building wheel for farm-haystack (setup.py) ... \u001B[?25ldone\n",
"\u001B[?25h Created wheel for farm-haystack: filename=farm_haystack-0.3.0-py3-none-any.whl size=99007 sha256=c46bad086db77ddc557d67d6a47b0e8ead6a76c20451e21bd7e56e7b3adf5434\n",
" Stored in directory: /tmp/pip-ephem-wheel-cache-s2p1ltpe/wheels/5b/d7/60/7a15bd24f2905dfa70aa762413b9570b9d37add064b151aaf0\n",
"Successfully built farm-haystack\n",
"\u001B[33mWARNING: You are using pip version 20.1.1; however, version 20.2.2 is available.\n",
"You should consider upgrading via the '/home/ubuntu/py3_6/bin/python3.6 -m pip install --upgrade pip' command.\u001B[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Looking in links: https://download.pytorch.org/whl/torch_stable.html\n",
"Collecting torch==1.5.1+cu101\n",
" Downloading https://download.pytorch.org/whl/cu101/torch-1.5.1%2Bcu101-cp36-cp36m-linux_x86_64.whl (704.4 MB)\n",
"\u001B[K |████████████████████████████████| 704.4 MB 9.3 kB/s eta 0:00:011\n",
"\u001B[?25hCollecting torchvision==0.6.1+cu101\n",
" Downloading https://download.pytorch.org/whl/cu101/torchvision-0.6.1%2Bcu101-cp36-cp36m-linux_x86_64.whl (6.6 MB)\n",
"\u001B[K |████████████████████████████████| 6.6 MB 881 kB/s eta 0:00:01\n",
"\u001B[?25hRequirement already satisfied: numpy in /home/ubuntu/py3_6/lib/python3.6/site-packages (from torch==1.5.1+cu101) (1.19.0)\n",
"Requirement already satisfied: future in /home/ubuntu/py3_6/lib/python3.6/site-packages (from torch==1.5.1+cu101) (0.18.2)\n",
"Requirement already satisfied: pillow>=4.1.1 in /home/ubuntu/py3_6/lib/python3.6/site-packages (from torchvision==0.6.1+cu101) (7.2.0)\n",
"Installing collected packages: torch, torchvision\n",
" Attempting uninstall: torch\n",
" Found existing installation: torch 1.5.1\n",
" Uninstalling torch-1.5.1:\n",
" Successfully uninstalled torch-1.5.1\n",
"Successfully installed torch-1.5.1+cu101 torchvision-0.6.1+cu101\n",
"\u001B[33mWARNING: You are using pip version 20.1.1; however, version 20.2.2 is available.\n",
"You should consider upgrading via the '/home/ubuntu/py3_6/bin/python3.6 -m pip install --upgrade pip' command.\u001B[0m\n"
]
}
],
"source": [
"# Install the latest release of Haystack in your own environment \n",
"#! pip install farm-haystack\n",
"\n",
"# Install the latest master of Haystack\n",
"!pip install grpcio-tools==1.34.1\n",
"!pip install git+https://github.com/deepset-ai/haystack.git\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "xmRuhTQ7A6Nh"
},
"outputs": [],
"source": [
"from haystack import Finder\n",
"from haystack.preprocessor.cleaning import clean_wiki_text\n",
"from haystack.preprocessor.utils import convert_files_to_dicts, fetch_archive_from_http\n",
"from haystack.reader.farm import FARMReader\n",
"from haystack.reader.transformers import TransformersReader\n",
"from haystack.utils import print_answers"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "q3dSo7ZtA6Nl"
},
"source": [
"### Document Store\n",
"\n",
"#### Option 1: FAISS\n",
"\n",
"FAISS is a library for efficient similarity search and clustering of dense vectors.\n",
"The `FAISSDocumentStore` uses a SQL database (in-memory SQLite by default) under the hood\n",
"to store the document text and other metadata. The vector embeddings of the text are\n",
"indexed in a FAISS index that is later queried to find answers.\n",
"The default flavour of `FAISSDocumentStore` is \"Flat\", but it can also be set to \"HNSW\" for\n",
"faster search at the expense of some accuracy. Just set the `faiss_index_factory_str` argument in the constructor.\n",
"For more info on which index suits your use case, see: https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
},
"colab_type": "code",
"id": "1cYgDJmrA6Nv",
"outputId": "a8aa6da1-9acf-43b1-fa3c-200123e9bdce",
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"08/25/2020 08:27:51 - INFO - faiss - Loading faiss with AVX2 support.\n",
"08/25/2020 08:27:51 - INFO - faiss - Loading faiss.\n"
]
}
],
"source": [
"from haystack.document_store import FAISSDocumentStore\n",
"\n",
"document_store = FAISSDocumentStore(faiss_index_factory_str=\"Flat\")"
]
},
{
"cell_type": "markdown",
"source": [
"#### Option 2: Milvus\n",
"\n",
"Milvus is an open-source database that, like FAISS, is optimized for vector similarity search.\n",
"Like FAISS, it has both a \"Flat\" and an \"HNSW\" mode, but it outperforms FAISS when it comes to dynamic data management.\n",
"It does require a little more setup, however, as it runs in Docker and needs some config files.\n",
"See [their docs](https://milvus.io/docs/v1.0.0/milvus_docker-cpu.md) for more details."
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from haystack.utils import launch_milvus\n",
"from haystack.document_store import MilvusDocumentStore\n",
"\n",
"launch_milvus()\n",
"document_store = MilvusDocumentStore()"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "06LatTJBA6N0",
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"### Cleaning & indexing documents\n",
"\n",
"As in the previous tutorials, we download, convert, and index some Game of Thrones articles into our DocumentStore."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 156
},
"colab_type": "code",
"id": "iqKnu6wxA6N1",
"outputId": "bb5dcc7b-b65f-49ed-db0b-842981af213b",
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"08/25/2020 08:27:53 - INFO - haystack.indexing.utils - Found data stored in `data/article_txt_got`. Delete this first if you really want to fetch new data.\n"
]
}
],
"source": [
"# Let's first get some files that we want to use\n",
"doc_dir = \"data/article_txt_got\"\n",
"s3_url = \"https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt.zip\"\n",
"fetch_archive_from_http(url=s3_url, output_dir=doc_dir)\n",
"\n",
"# Convert files to dicts\n",
"dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, split_paragraphs=True)\n",
"\n",
"# Now, let's write the dicts containing documents to our DB.\n",
"document_store.write_documents(dicts)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "wgjedxx_A6N6"
},
"source": [
"### Initialize Retriever, Reader, & Finder\n",
"\n",
"#### Retriever\n",
"\n",
"**Here:** We use a `DensePassageRetriever`\n",
"\n",
"**Alternatives:**\n",
"\n",
"- The `ElasticsearchRetriever` with custom queries (e.g. boosting) and filters\n",
"- Use `EmbeddingRetriever` to find candidate documents based on the similarity of embeddings (e.g. created via Sentence-BERT)\n",
"- Use `TfidfRetriever` in combination with a SQL or InMemory Document store for simple prototyping and debugging"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000,
"referenced_widgets": [
"20affb86c4574e3a9829136fdfe40470",
"7f8c2c86bbb74a18ac8bd24046d99d34",
"84311c037c6e44b5b621237f59f027a0",
"05d793fc179746e9b74cbcbc1a3389eb",
"ad2ce6a8b4f844ac93b425f1261c131f",
"bb45d5e4c9944fcd87b408e2fbfea440",
"248d02e01dea4a63a3296e28e4537eaf",
"74a9c43eb61a43aa973194b0b70e18f5",
"58fc3339f13644aea1d4c6d8e1d43a65",
"460bef2bfa7d4aa480639095555577ac",
"8553a48fb3144739b99fa04adf8b407c",
"babe35bb292f4010b64104b2b5bc92af",
"887412c45ce744efbcc875b563770c29",
"b4b950d899df4e3fbed9255b281e988a",
"89535c589aa64648b82a9794a2888e78",
"f35430501bb14fba8dbd5fb797c2e509",
"eb5d93a8416a437e9cb039650756ac74",
"5b8d5975d2674e7e9ada64e77c463c0a",
"4afa2be1c2c5483f932a42ea4a7897af",
"0e7186eeb5fa47d89c8c111ebe43c5af",
"fa946133dfcc4a6ebc6fef2ef9dd92f7",
"518b6a993e42490297289f2328d0270a",
"cea074a636d34a75b311569fc3f0b3ab",
"2630fd2fa91d498796af6d7d8d73aba4"
]
},
"colab_type": "code",
"id": "kFwiPP60A6N7",
"outputId": "07249856-3222-4898-9246-68e9ecbf5a1b",
"pycharm": {
"is_executing": true
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"08/25/2020 08:28:12 - INFO - haystack.database.faiss - Updating embeddings for 2497 docs ...\n",
"/pytorch/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of nonzero is deprecated:\n",
"\tnonzero(Tensor input, *, Tensor out)\n",
"Consider using one of the following signatures instead:\n",
"\tnonzero(Tensor input, *, bool as_tuple)\n",
"08/25/2020 08:28:13 - INFO - haystack.retriever.dense - Embedded 80 / 2497 texts\n",
"08/25/2020 08:28:14 - INFO - haystack.retriever.dense - Embedded 160 / 2497 texts\n",
"08/25/2020 08:28:14 - INFO - haystack.retriever.dense - Embedded 240 / 2497 texts\n",
"08/25/2020 08:28:15 - INFO - haystack.retriever.dense - Embedded 320 / 2497 texts\n",
"08/25/2020 08:28:16 - INFO - haystack.retriever.dense - Embedded 400 / 2497 texts\n",
"08/25/2020 08:28:17 - INFO - haystack.retriever.dense - Embedded 480 / 2497 texts\n",
"08/25/2020 08:28:17 - INFO - haystack.retriever.dense - Embedded 560 / 2497 texts\n",
"08/25/2020 08:28:18 - INFO - haystack.retriever.dense - Embedded 640 / 2497 texts\n",
"08/25/2020 08:28:19 - INFO - haystack.retriever.dense - Embedded 720 / 2497 texts\n",
"08/25/2020 08:28:19 - INFO - haystack.retriever.dense - Embedded 800 / 2497 texts\n",
"08/25/2020 08:28:20 - INFO - haystack.retriever.dense - Embedded 880 / 2497 texts\n",
"08/25/2020 08:28:20 - INFO - haystack.retriever.dense - Embedded 960 / 2497 texts\n",
"08/25/2020 08:28:21 - INFO - haystack.retriever.dense - Embedded 1040 / 2497 texts\n",
"08/25/2020 08:28:22 - INFO - haystack.retriever.dense - Embedded 1120 / 2497 texts\n",
"08/25/2020 08:28:22 - INFO - haystack.retriever.dense - Embedded 1200 / 2497 texts\n",
"08/25/2020 08:28:23 - INFO - haystack.retriever.dense - Embedded 1280 / 2497 texts\n",
"08/25/2020 08:28:24 - INFO - haystack.retriever.dense - Embedded 1360 / 2497 texts\n",
"08/25/2020 08:28:24 - INFO - haystack.retriever.dense - Embedded 1440 / 2497 texts\n",
"08/25/2020 08:28:25 - INFO - haystack.retriever.dense - Embedded 1520 / 2497 texts\n",
"08/25/2020 08:28:25 - INFO - haystack.retriever.dense - Embedded 1600 / 2497 texts\n",
"08/25/2020 08:28:26 - INFO - haystack.retriever.dense - Embedded 1680 / 2497 texts\n",
"08/25/2020 08:28:27 - INFO - haystack.retriever.dense - Embedded 1760 / 2497 texts\n",
"08/25/2020 08:28:27 - INFO - haystack.retriever.dense - Embedded 1840 / 2497 texts\n",
"08/25/2020 08:28:28 - INFO - haystack.retriever.dense - Embedded 1920 / 2497 texts\n",
"08/25/2020 08:28:29 - INFO - haystack.retriever.dense - Embedded 2000 / 2497 texts\n",
"08/25/2020 08:28:29 - INFO - haystack.retriever.dense - Embedded 2080 / 2497 texts\n",
"08/25/2020 08:28:30 - INFO - haystack.retriever.dense - Embedded 2160 / 2497 texts\n",
"08/25/2020 08:28:30 - INFO - haystack.retriever.dense - Embedded 2240 / 2497 texts\n",
"08/25/2020 08:28:31 - INFO - haystack.retriever.dense - Embedded 2320 / 2497 texts\n",
"08/25/2020 08:28:32 - INFO - haystack.retriever.dense - Embedded 2400 / 2497 texts\n",
"08/25/2020 08:28:32 - INFO - haystack.retriever.dense - Embedded 2480 / 2497 texts\n"
]
}
],
"source": [
"from haystack.retriever.dense import DensePassageRetriever\n",
"retriever = DensePassageRetriever(document_store=document_store,\n",
" query_embedding_model=\"facebook/dpr-question_encoder-single-nq-base\",\n",
" passage_embedding_model=\"facebook/dpr-ctx_encoder-single-nq-base\",\n",
" max_seq_len_query=64,\n",
" max_seq_len_passage=256,\n",
" batch_size=16,\n",
" use_gpu=True,\n",
" embed_title=True,\n",
" use_fast_tokenizers=True)\n",
"# Important: \n",
"# Now that we have the DPR initialized, we need to call update_embeddings() to iterate over all\n",
"# previously indexed documents and update their embedding representation. \n",
"# While this can be a time-consuming operation (depending on corpus size), it only needs to be done once. \n",
"# At query time, we only need to embed the query and compare it to the existing doc embeddings, which is very fast.\n",
"document_store.update_embeddings(retriever)"
]
},
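{
"cell_type": "markdown",
"metadata": {},
"source": [
"The query-time comparison mentioned in the comments above can be sketched with a toy example.\n",
"This is only a minimal illustration, assuming dot-product similarity between the query embedding and the passage embeddings.\n",
"The three-dimensional vectors and the `toy_*` variable names are made up for this sketch; they are not real DPR output and not part of the Haystack API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical stand-ins for passage embeddings (one row per passage).\n",
"toy_passage_embeddings = np.array([\n",
"    [0.9, 0.1, 0.0],\n",
"    [0.1, 0.8, 0.1],\n",
"    [0.0, 0.3, 0.9],\n",
"])\n",
"\n",
"# Hypothetical stand-in for the embedded query.\n",
"toy_query_embedding = np.array([0.1, 0.9, 0.0])\n",
"\n",
"# Score every passage by its dot product with the query (assumed similarity measure).\n",
"toy_scores = toy_passage_embeddings @ toy_query_embedding\n",
"\n",
"# Sort passage indices by descending score and keep the best two.\n",
"toy_top_k = 2\n",
"toy_top_ids = np.argsort(-toy_scores)[:toy_top_k]\n",
"print(toy_top_ids, toy_scores[toy_top_ids])"
]
},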
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "rnVR28OXA6OA"
},
"source": [
"#### Reader\n",
"\n",
"As in the previous tutorials, we now initialize our reader.\n",
"\n",
"Here we use a FARMReader with the *deepset/roberta-base-squad2* model (see: https://huggingface.co/deepset/roberta-base-squad2)\n",
"\n",
"\n",
"\n",
"##### FARMReader"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 739,
"referenced_widgets": [
"3d273d2d3b25435ba4eb4ffd8e812b6f",
"5104b7cddf6d4d0f92d3dd142b9f4c42",
"e0510255a31d448497af3ca0f4915cb4",
"670270fd06274932adad4d42c8a1912e",
"6ca292cd3f46417ea296684e48863af9",
"75578e0466cd4b84ba7dfee1028ae4cd",
"cbe09b984b804402b1fe82739cbc375c",
"4fd0caca56bd415b8c31860ba542145a",
"9960be4cc1c64905917b5fd7ea6bb294",
"2f3d901b3acb4841a4b03b2c5cd4393b",
"04644b74bb2a45a7a6fcf86151b5bf8c",
"5efa895c53284b72adec629a6fc59fa9",
"182e5db14fac427b90380b5213f57825",
"243600e420f449089c1b5ed0d2715339",
"466222c8b2e1403ca69c8130423f0a8b",
"a458be4cc49240e4b9bc1c95c05551e8",
"d9ee08fa621d4b558bd1a415e3ee6f62",
"1b905c5551b940ed9bc5320e1e5a9213",
"64fc7775a84e425c8082a545f7c2a0c1",
"66cd72dae82d434a87b638236784fd4b",
"36b1b48aea02494a8bc94020a15d7417",
"5934bc4db2a94c20b5c55f1c017024ab",
"f9289caeac404087ad4973a646e3a117",
"7e121f0fdb1746c094bff218a4f623ab",
"98781635b86244aca5d22be4280c32de",
"e148b28d946549a9b5eb09294ebe124e",
"4b8b29c1b1a243808de4cc1cae3f6bd6",
"bbef597f804e4ca580aee665399a3bc1",
"345f49b2b42c40278478d30e8a691768",
"e3724385769d443cb4ea39b92e0b2abd",
"d05fbb94014840cab4584c4781a590c1",
"b8d52b604dad43c18ba00c935b961422",
"e625a32fc81b42fb9e0fff7ce766fcdc",
"885390f24e08495db6a1febd661531e0",
"c2a614f48e974fb8b13a3c5d7cafaed6",
"ada8fa1c88954ef8b839f29090de9e79",
"427b07b356e44c68b47178b277aaa16f",
"1b4166bda5ae48aa8539e0fa5521007a",
"fd30d43909874239b2183c5fb61241fe",
"09a647660cf94131a1c140d06eb293ab",
"3e482e9ef4d34d93b4ba4f7f07b0e44f",
"66450cab654d40ae8ed1c32fa733397a",
"aa4becf2e33d4f1e9fdac70236d48f6e",
"78d087ed952e429b97eb3d8fcdc7c8ec",
"5020846874ae473bbfa7038fe98de474",
"08c736f4ad424330a82df1b5dc047b2c",
"9169ca606bf64d41aa08fb42876bd2ab",
"c8f1f7e8462d4d14a507816f67953eae"
]
},
"colab_type": "code",
"id": "fyIuWVwhA6OB",
"outputId": "33113253-8b95-4604-f9e5-1aa28ee66a91"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"08/25/2020 08:28:54 - INFO - farm.utils - device: cuda n_gpu: 1, distributed training: False, automatic mixed precision training: None\n",
"08/25/2020 08:28:54 - INFO - farm.infer - Could not find `deepset/roberta-base-squad2` locally. Try to download from model hub ...\n",
"08/25/2020 08:28:59 - WARNING - farm.modeling.language_model - Could not automatically detect from language model name what language it is. \n",
"\t We guess it's an *ENGLISH* model ... \n",
"\t If not: Init the language model by supplying the 'language' param.\n",
"08/25/2020 08:29:06 - WARNING - farm.modeling.prediction_head - Some unused parameters are passed to the QuestionAnsweringHead. Might not be a problem. Params: {\"loss_ignore_index\": -1}\n",
"08/25/2020 08:29:09 - INFO - farm.utils - device: cuda n_gpu: 1, distributed training: False, automatic mixed precision training: None\n",
"08/25/2020 08:29:10 - INFO - farm.infer - Got ya 7 parallel workers to do inference ...\n",
"08/25/2020 08:29:10 - INFO - farm.infer - 0 0 0 0 0 0 0 \n",
"08/25/2020 08:29:10 - INFO - farm.infer - /w\\ /w\\ /w\\ /w\\ /w\\ /w\\ /w\\\n",
"08/25/2020 08:29:10 - INFO - farm.infer - /'\\ / \\ /'\\ /'\\ / \\ / \\ /'\\\n",
"08/25/2020 08:29:10 - INFO - farm.infer - \n"
]
}
],
"source": [
"# Load a local model or any of the QA models on\n",
"# Hugging Face's model hub (https://huggingface.co/models)\n",
"\n",
"reader = FARMReader(model_name_or_path=\"deepset/roberta-base-squad2\", use_gpu=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "unhLD18yA6OF"
},
"source": [
"### Pipeline\n",
"\n",
"With a Haystack `Pipeline`, you can combine your building blocks into a search pipeline.\n",
"Under the hood, `Pipelines` are Directed Acyclic Graphs (DAGs) that you can easily customize for your own use cases.\n",
"To speed things up, Haystack also comes with a few predefined Pipelines. One of them is the `ExtractiveQAPipeline` that combines a retriever and a reader to answer our questions.\n",
"You can learn more about `Pipelines` in the [docs](https://haystack.deepset.ai/docs/latest/pipelinesmd)."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "TssPQyzWA6OG"
},
"outputs": [],
"source": [
"from haystack.pipeline import ExtractiveQAPipeline\n",
"pipe = ExtractiveQAPipeline(reader, retriever)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "bXlBBxKXA6OL"
},
"source": [
"## Voilà! Ask a question!"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 275
},
"colab_type": "code",
"id": "Zi97Hif2A6OM",
"outputId": "5eb9363d-ba92-45d5-c4d0-63ada3073f02"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"08/25/2020 08:30:28 - INFO - haystack.finder - Reader is looking for detailed answer in 9168 chars ...\n",
"Inferencing Samples: 0%| | 0/1 [00:00<?, ? Batches/s]/home/ubuntu/deepset/FARM/farm/modeling/prediction_head.py:1073: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.\n",
" start_logits_normalized = nn.functional.softmax(start_logits)\n",
"/home/ubuntu/deepset/FARM/farm/modeling/prediction_head.py:1076: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.\n",
" end_logits_normalized = nn.functional.softmax(end_logits)\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 3.56 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 38.79 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 39.61 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 53.05 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 37.39 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 67.21 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 67.10 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 66.66 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 47.91 Batches/s]\n",
"Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 33.05 Batches/s]\n"
]
}
],
"source": [
"# You can configure how many candidates the retriever and the reader return.\n",
"# A higher top_k_retriever tends to give better answers, but also makes the query slower.\n",
"prediction = pipe.run(query=\"Who created the Dothraki vocabulary?\", top_k_retriever=10, top_k_reader=5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print_answers(prediction, details=\"minimal\")"
]
},
{
"cell_type": "markdown",
"source": [
"## About us\n",
"\n",
"This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany\n",
"\n",
"We bring NLP to the industry via open source! \n",
"Our focus: Industry specific language models & large scale QA systems. \n",
" \n",
"Some of our other work: \n",
"- [German BERT](https://deepset.ai/german-bert)\n",
"- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)\n",
"- [FARM](https://github.com/deepset-ai/FARM)\n",
"\n",
"Get in touch:\n",
"[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)\n",
"\n",
"By the way: [we're hiring!](https://apply.workable.com/deepset/) "
],
"metadata": {
"collapsed": false
}
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "Tutorial6_Better_Retrieval_via_DPR.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"04644b74bb2a45a7a6fcf86151b5bf8c": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_243600e420f449089c1b5ed0d2715339",
"max": 498637366,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_182e5db14fac427b90380b5213f57825",
"value": 498637366
}
},
"05d793fc179746e9b74cbcbc1a3389eb": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_74a9c43eb61a43aa973194b0b70e18f5",
"placeholder": "",
"style": "IPY_MODEL_248d02e01dea4a63a3296e28e4537eaf",
"value": " 232k/232k [00:00&lt;00:00, 628kB/s]"
}
},
"08c736f4ad424330a82df1b5dc047b2c": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"09a647660cf94131a1c140d06eb293ab": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"0e7186eeb5fa47d89c8c111ebe43c5af": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_2630fd2fa91d498796af6d7d8d73aba4",
"placeholder": "",
"style": "IPY_MODEL_cea074a636d34a75b311569fc3f0b3ab",
"value": " 438M/438M [00:13&lt;00:00, 31.7MB/s]"
}
},
"182e5db14fac427b90380b5213f57825": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"1b4166bda5ae48aa8539e0fa5521007a": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1b905c5551b940ed9bc5320e1e5a9213": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"20affb86c4574e3a9829136fdfe40470": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_84311c037c6e44b5b621237f59f027a0",
"IPY_MODEL_05d793fc179746e9b74cbcbc1a3389eb"
],
"layout": "IPY_MODEL_7f8c2c86bbb74a18ac8bd24046d99d34"
}
},
"243600e420f449089c1b5ed0d2715339": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"248d02e01dea4a63a3296e28e4537eaf": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"2630fd2fa91d498796af6d7d8d73aba4": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"2f3d901b3acb4841a4b03b2c5cd4393b": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"345f49b2b42c40278478d30e8a691768": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"36b1b48aea02494a8bc94020a15d7417": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"3d273d2d3b25435ba4eb4ffd8e812b6f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_e0510255a31d448497af3ca0f4915cb4",
"IPY_MODEL_670270fd06274932adad4d42c8a1912e"
],
"layout": "IPY_MODEL_5104b7cddf6d4d0f92d3dd142b9f4c42"
}
},
"3e482e9ef4d34d93b4ba4f7f07b0e44f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_aa4becf2e33d4f1e9fdac70236d48f6e",
"IPY_MODEL_78d087ed952e429b97eb3d8fcdc7c8ec"
],
"layout": "IPY_MODEL_66450cab654d40ae8ed1c32fa733397a"
}
},
"427b07b356e44c68b47178b277aaa16f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"460bef2bfa7d4aa480639095555577ac": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"466222c8b2e1403ca69c8130423f0a8b": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"4afa2be1c2c5483f932a42ea4a7897af": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_518b6a993e42490297289f2328d0270a",
"max": 437983985,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_fa946133dfcc4a6ebc6fef2ef9dd92f7",
"value": 437983985
}
},
"4b8b29c1b1a243808de4cc1cae3f6bd6": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_e3724385769d443cb4ea39b92e0b2abd",
"max": 456318,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_345f49b2b42c40278478d30e8a691768",
"value": 456318
}
},
"4fd0caca56bd415b8c31860ba542145a": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5020846874ae473bbfa7038fe98de474": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"5104b7cddf6d4d0f92d3dd142b9f4c42": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"518b6a993e42490297289f2328d0270a": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"58fc3339f13644aea1d4c6d8e1d43a65": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_8553a48fb3144739b99fa04adf8b407c",
"IPY_MODEL_babe35bb292f4010b64104b2b5bc92af"
],
"layout": "IPY_MODEL_460bef2bfa7d4aa480639095555577ac"
}
},
"5934bc4db2a94c20b5c55f1c017024ab": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5b8d5975d2674e7e9ada64e77c463c0a": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5efa895c53284b72adec629a6fc59fa9": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_a458be4cc49240e4b9bc1c95c05551e8",
"placeholder": "",
"style": "IPY_MODEL_466222c8b2e1403ca69c8130423f0a8b",
"value": " 499M/499M [00:23&lt;00:00, 21.1MB/s]"
}
},
"64fc7775a84e425c8082a545f7c2a0c1": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_5934bc4db2a94c20b5c55f1c017024ab",
"max": 898822,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_36b1b48aea02494a8bc94020a15d7417",
"value": 898822
}
},
"66450cab654d40ae8ed1c32fa733397a": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"66cd72dae82d434a87b638236784fd4b": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_7e121f0fdb1746c094bff218a4f623ab",
"placeholder": "",
"style": "IPY_MODEL_f9289caeac404087ad4973a646e3a117",
"value": " 899k/899k [00:01&lt;00:00, 684kB/s]"
}
},
"670270fd06274932adad4d42c8a1912e": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_4fd0caca56bd415b8c31860ba542145a",
"placeholder": "",
"style": "IPY_MODEL_cbe09b984b804402b1fe82739cbc375c",
"value": " 559/559 [00:00&lt;00:00, 2.78kB/s]"
}
},
"6ca292cd3f46417ea296684e48863af9": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"74a9c43eb61a43aa973194b0b70e18f5": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"75578e0466cd4b84ba7dfee1028ae4cd": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"78d087ed952e429b97eb3d8fcdc7c8ec": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_c8f1f7e8462d4d14a507816f67953eae",
"placeholder": "",
"style": "IPY_MODEL_9169ca606bf64d41aa08fb42876bd2ab",
"value": " 189/189 [00:00&lt;00:00, 409B/s]"
}
},
"7e121f0fdb1746c094bff218a4f623ab": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"7f8c2c86bbb74a18ac8bd24046d99d34": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"84311c037c6e44b5b621237f59f027a0": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_bb45d5e4c9944fcd87b408e2fbfea440",
"max": 231508,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_ad2ce6a8b4f844ac93b425f1261c131f",
"value": 231508
}
},
"8553a48fb3144739b99fa04adf8b407c": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_b4b950d899df4e3fbed9255b281e988a",
"max": 437986065,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_887412c45ce744efbcc875b563770c29",
"value": 437986065
}
},
"885390f24e08495db6a1febd661531e0": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"887412c45ce744efbcc875b563770c29": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"89535c589aa64648b82a9794a2888e78": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"9169ca606bf64d41aa08fb42876bd2ab": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"98781635b86244aca5d22be4280c32de": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_4b8b29c1b1a243808de4cc1cae3f6bd6",
"IPY_MODEL_bbef597f804e4ca580aee665399a3bc1"
],
"layout": "IPY_MODEL_e148b28d946549a9b5eb09294ebe124e"
}
},
"9960be4cc1c64905917b5fd7ea6bb294": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_04644b74bb2a45a7a6fcf86151b5bf8c",
"IPY_MODEL_5efa895c53284b72adec629a6fc59fa9"
],
"layout": "IPY_MODEL_2f3d901b3acb4841a4b03b2c5cd4393b"
}
},
"a458be4cc49240e4b9bc1c95c05551e8": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"aa4becf2e33d4f1e9fdac70236d48f6e": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_08c736f4ad424330a82df1b5dc047b2c",
"max": 189,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_5020846874ae473bbfa7038fe98de474",
"value": 189
}
},
"ad2ce6a8b4f844ac93b425f1261c131f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"ada8fa1c88954ef8b839f29090de9e79": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_09a647660cf94131a1c140d06eb293ab",
"placeholder": "",
"style": "IPY_MODEL_fd30d43909874239b2183c5fb61241fe",
"value": " 150/150 [00:01&lt;00:00, 119B/s]"
}
},
"b4b950d899df4e3fbed9255b281e988a": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b8d52b604dad43c18ba00c935b961422": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"babe35bb292f4010b64104b2b5bc92af": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_f35430501bb14fba8dbd5fb797c2e509",
"placeholder": "",
"style": "IPY_MODEL_89535c589aa64648b82a9794a2888e78",
"value": " 438M/438M [00:13&lt;00:00, 32.3MB/s]"
}
},
"bb45d5e4c9944fcd87b408e2fbfea440": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"bbef597f804e4ca580aee665399a3bc1": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_b8d52b604dad43c18ba00c935b961422",
"placeholder": "",
"style": "IPY_MODEL_d05fbb94014840cab4584c4781a590c1",
"value": " 456k/456k [00:02&lt;00:00, 166kB/s]"
}
},
"c2a614f48e974fb8b13a3c5d7cafaed6": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_1b4166bda5ae48aa8539e0fa5521007a",
"max": 150,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_427b07b356e44c68b47178b277aaa16f",
"value": 150
}
},
"c8f1f7e8462d4d14a507816f67953eae": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"cbe09b984b804402b1fe82739cbc375c": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"cea074a636d34a75b311569fc3f0b3ab": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"d05fbb94014840cab4584c4781a590c1": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"d9ee08fa621d4b558bd1a415e3ee6f62": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_64fc7775a84e425c8082a545f7c2a0c1",
"IPY_MODEL_66cd72dae82d434a87b638236784fd4b"
],
"layout": "IPY_MODEL_1b905c5551b940ed9bc5320e1e5a9213"
}
},
"e0510255a31d448497af3ca0f4915cb4": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "Downloading: 100%",
"description_tooltip": null,
"layout": "IPY_MODEL_75578e0466cd4b84ba7dfee1028ae4cd",
"max": 559,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_6ca292cd3f46417ea296684e48863af9",
"value": 559
}
},
"e148b28d946549a9b5eb09294ebe124e": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e3724385769d443cb4ea39b92e0b2abd": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e625a32fc81b42fb9e0fff7ce766fcdc": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_c2a614f48e974fb8b13a3c5d7cafaed6",
"IPY_MODEL_ada8fa1c88954ef8b839f29090de9e79"
],
"layout": "IPY_MODEL_885390f24e08495db6a1febd661531e0"
}
},
"eb5d93a8416a437e9cb039650756ac74": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_4afa2be1c2c5483f932a42ea4a7897af",
"IPY_MODEL_0e7186eeb5fa47d89c8c111ebe43c5af"
],
"layout": "IPY_MODEL_5b8d5975d2674e7e9ada64e77c463c0a"
}
},
"f35430501bb14fba8dbd5fb797c2e509": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"f9289caeac404087ad4973a646e3a117": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"fa946133dfcc4a6ebc6fef2ef9dd92f7": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": "initial"
}
},
"fd30d43909874239b2183c5fb61241fe": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
}
}
}
},
"nbformat": 4,
"nbformat_minor": 1
}