haystack/tutorials/Tutorial17_Audio.ipynb

581 lines
21 KiB
Plaintext
Raw Normal View History

`AnswerToSpeech` (#2584) * Add new audio answer primitives * Add AnswerToSpeech * Add dependency group * Update Documentation & Code Style * Extract TextToSpeech in a helper class, create DocumentToSpeech and primitives * Add tests * Update Documentation & Code Style * Add ability to compress audio and more tests * Add audio group to test, all and all-gpu * fix pylint * Update Documentation & Code Style * Accidental git tag * Try pleasing mypy * Update Documentation & Code Style * fix pylint * Add warning for missing OS library and support in CI * Try fixing mypy * Update Documentation & Code Style * Add docs, simplify args for audio nodes and add tutorials * Fix mypy * Fix run_batch * Feedback on tutorials * fix mypy and pylint * Fix mypy again * Fix mypy yet again * Fix the ci * Fix dicts merge and install ffmpeg on CI * Make the audio nodes import safe * Trying to increase tolerance in audio test * Fix import paths * fix linter * Update Documentation & Code Style * Add audio libs in unit tests * Update _text_to_speech.py * Update answer_to_speech.py * Use dedicated dataset & update telemetry * Remove and use distilled roberta * Revert special primitives so that the nodes run in indexing * Improve tutorials and fix smaller bugs * Update Documentation & Code Style * Fix serialization issue * Update Documentation & Code Style * Improve tutorial * Update Documentation & Code Style * Update _text_to_speech.py * Minor lg updates * Minor lg updates to tutorial * Making indexing work in tutorials * Update Documentation & Code Style * Improve docstrings * Try to use GPU when available * Update Documentation & Code Style * Fixi mypy and pylint * Try to pass the device correctly * Update Documentation & Code Style * Use type of device * use .cpu() * Improve .ipynb * update apt index to be able to download libsndfile1 * Fix SpeechDocument.from_dict() * Change pip URL Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-06-15 10:13:18 +02:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "Dne2XSNzB3SK"
},
"source": [
"# Make Your QA Pipelines Talk!\n",
"\n",
"<img style=\"float: right;\" src=\"https://upload.wikimedia.org/wikipedia/en/d/d8/Game_of_Thrones_title_card.jpg\">\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deepset-ai/haystack/blob/master/tutorials/Tutorial17_Audio.ipynb)\n",
"\n",
"Question answering works primarily on text, but Haystack provides some features for audio files that contain speech as well.\n",
"\n",
"In this tutorial, we're going to see how to use `AnswerToSpeech` to convert answers into audio files."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"id": "4UBjfz4LB3SS"
},
"source": [
"### Prepare environment\n",
"\n",
"#### Colab: Enable the GPU runtime\n",
"Make sure you enable the GPU runtime to experience decent speed in this tutorial.\n",
"**Runtime -> Change Runtime type -> Hardware accelerator -> GPU**\n",
"\n",
"<img src=\"https://raw.githubusercontent.com/deepset-ai/haystack/master/docs/img/colab_gpu_runtime.jpg\">"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "uDHmaD2gB3SX",
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"# Make sure you have a GPU running\n",
"!nvidia-smi"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "QsY0HC8JB3Sc"
},
"outputs": [],
"source": [
"# Install the latest release of Haystack in your own environment\n",
"#! pip install farm-haystack\n",
"\n",
"# Install the latest master of Haystack\n",
"!pip install --upgrade pip\n",
"!pip install git+https://github.com/deepset-ai/haystack.git#egg=farm-haystack[colab,audio]"
]
},
{
"cell_type": "markdown",
"source": [
"## Logging\n",
"\n",
"We configure how logging messages should be displayed and which log level should be used before importing Haystack.\n",
"Example log message:\n",
"INFO - haystack.utils.preprocessing - Converting data/tutorial1/218_Olenna_Tyrell.txt\n",
"Default log level in basicConfig is WARNING so the explicit parameter is not necessary but can be changed easily:"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import logging\n",
"\n",
"logging.basicConfig(format=\"%(levelname)s - %(name)s - %(message)s\", level=logging.WARNING)\n",
"logging.getLogger(\"haystack\").setLevel(logging.INFO)"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZCemC4_XB3Se"
},
"source": [
"### Setup Elasticsearch\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "BtEN_VgSB3Sg"
},
"outputs": [],
"source": [
"# Recommended: Start Elasticsearch using Docker via the Haystack utility function\n",
"from haystack.utils import launch_es\n",
"\n",
"launch_es()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "r-oqc2g1B3Si"
},
"outputs": [],
"source": [
"# In Colab / No Docker environments: Start Elasticsearch from source\n",
"! wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-linux-x86_64.tar.gz -q\n",
"! tar -xzf elasticsearch-7.9.2-linux-x86_64.tar.gz\n",
"! chown -R daemon:daemon elasticsearch-7.9.2\n",
"\n",
"import os\n",
"from subprocess import Popen, PIPE, STDOUT\n",
"\n",
"es_server = Popen(\n",
" [\"elasticsearch-7.9.2/bin/elasticsearch\"], stdout=PIPE, stderr=STDOUT, preexec_fn=lambda: os.setuid(1) # as daemon\n",
")\n",
"# wait until ES has started\n",
"! sleep 30"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pbGu92rAB3Sl"
},
"source": [
"### Populate the document store with `SpeechDocuments`\n",
"\n",
"First of all, we will populate the document store with a simple indexing pipeline. See [Tutorial 1](https://colab.research.google.com/github/deepset-ai/haystack/blob/master/tutorials/Tutorial1_Basic_QA_Pipeline.ipynb) for more details about these steps.\n",
"\n",
"To the basic version, we can add here a DocumentToSpeech node that also generates an audio file for each of the indexed documents. This will make possible, during querying, to access the audio version of the documents the answers were extracted from without having to generate it on the fly.\n",
"\n",
"**Note**: this additional step can slow down your indexing quite a lot if you are not running on GPU. Experiment with very small corpora to start."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "eWYnP3nWB3So",
"pycharm": {
"name": "#%%\n"
`AnswerToSpeech` (#2584) * Add new audio answer primitives * Add AnswerToSpeech * Add dependency group * Update Documentation & Code Style * Extract TextToSpeech in a helper class, create DocumentToSpeech and primitives * Add tests * Update Documentation & Code Style * Add ability to compress audio and more tests * Add audio group to test, all and all-gpu * fix pylint * Update Documentation & Code Style * Accidental git tag * Try pleasing mypy * Update Documentation & Code Style * fix pylint * Add warning for missing OS library and support in CI * Try fixing mypy * Update Documentation & Code Style * Add docs, simplify args for audio nodes and add tutorials * Fix mypy * Fix run_batch * Feedback on tutorials * fix mypy and pylint * Fix mypy again * Fix mypy yet again * Fix the ci * Fix dicts merge and install ffmpeg on CI * Make the audio nodes import safe * Trying to increase tolerance in audio test * Fix import paths * fix linter * Update Documentation & Code Style * Add audio libs in unit tests * Update _text_to_speech.py * Update answer_to_speech.py * Use dedicated dataset & update telemetry * Remove and use distilled roberta * Revert special primitives so that the nodes run in indexing * Improve tutorials and fix smaller bugs * Update Documentation & Code Style * Fix serialization issue * Update Documentation & Code Style * Improve tutorial * Update Documentation & Code Style * Update _text_to_speech.py * Minor lg updates * Minor lg updates to tutorial * Making indexing work in tutorials * Update Documentation & Code Style * Improve docstrings * Try to use GPU when available * Update Documentation & Code Style * Fixi mypy and pylint * Try to pass the device correctly * Update Documentation & Code Style * Use type of device * use .cpu() * Improve .ipynb * update apt index to be able to download libsndfile1 * Fix SpeechDocument.from_dict() * Change pip URL Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-06-15 10:13:18 +02:00
}
},
"outputs": [],
"source": [
"from haystack.document_stores import ElasticsearchDocumentStore\n",
"from haystack.utils import fetch_archive_from_http, launch_es\n",
"from pathlib import Path\n",
"from haystack import Pipeline\n",
"from haystack.nodes import FileTypeClassifier, TextConverter, PreProcessor, DocumentToSpeech\n",
"\n",
"document_store = ElasticsearchDocumentStore(host=\"localhost\", username=\"\", password=\"\", index=\"document\")\n",
"\n",
"# Get the documents\n",
"documents_path = \"data/tutorial17\"\n",
"s3_url = \"https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt17.zip\"\n",
"fetch_archive_from_http(url=s3_url, output_dir=documents_path)\n",
"\n",
"# List all the paths\n",
"file_paths = [p for p in Path(documents_path).glob(\"**/*\")]\n",
"\n",
"# NOTE: In this example we're going to use only one text file from the wiki, as the DocumentToSpeech node is quite slow\n",
"# on CPU machines. Comment out this line to use all documents from the dataset if you machine is powerful enough.\n",
"file_paths = [p for p in file_paths if \"Stormborn\" in p.name]\n",
"\n",
"# Prepare some basic metadata for the files\n",
"files_metadata = [{\"name\": path.name} for path in file_paths]\n",
"\n",
"# Here we create a basic indexing pipeline\n",
"indexing_pipeline = Pipeline()\n",
"\n",
"# - Makes sure the file is a TXT file (FileTypeClassifier node)\n",
"classifier = FileTypeClassifier()\n",
"indexing_pipeline.add_node(classifier, name=\"classifier\", inputs=[\"File\"])\n",
"\n",
"# - Converts a file into text and performs basic cleaning (TextConverter node)\n",
"text_converter = TextConverter(remove_numeric_tables=True)\n",
"indexing_pipeline.add_node(text_converter, name=\"text_converter\", inputs=[\"classifier.output_1\"])\n",
"\n",
"# - Pre-processes the text by performing splits and adding metadata to the text (Preprocessor node)\n",
"preprocessor = PreProcessor(\n",
" clean_whitespace=True,\n",
" clean_empty_lines=True,\n",
" split_length=100,\n",
" split_overlap=50,\n",
" split_respect_sentence_boundary=True,\n",
")\n",
"indexing_pipeline.add_node(preprocessor, name=\"preprocessor\", inputs=[\"text_converter\"])\n",
"\n",
"#\n",
"# DocumentToSpeech\n",
"#\n",
"# Here is where we convert all documents to be indexed into SpeechDocuments, that will hold not only\n",
"# the text content, but also their audio version.\n",
"#\n",
"# Note that DocumentToSpeech implements a light caching, so if a document's audio have already\n",
"# been generated in a previous pass in the same folder, it will reuse the existing file instead\n",
"# of generating it again.\n",
"doc2speech = DocumentToSpeech(\n",
" model_name_or_path=\"espnet/kan-bayashi_ljspeech_vits\", generated_audio_dir=Path(\"./generated_audio_documents\")\n",
")\n",
"indexing_pipeline.add_node(doc2speech, name=\"doc2speech\", inputs=[\"preprocessor\"])\n",
"\n",
"# - Writes the resulting documents into the document store (ElasticsearchDocumentStore node from the previous cell)\n",
"indexing_pipeline.add_node(document_store, name=\"document_store\", inputs=[\"doc2speech\"])\n",
"\n",
"# Then we run it with the documents and their metadata as input\n",
"output = indexing_pipeline.run(file_paths=file_paths, meta=files_metadata)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pprint import pprint\n",
"\n",
"# You can now check the document store and verify that documents have been enriched with a path\n",
"# to the generated audio file\n",
"document = next(document_store.get_all_documents_generator())\n",
"pprint(document)\n",
"\n",
"# Sample output:\n",
"#\n",
"# <Document: {\n",
"# 'content': \"'Stormborn' received praise from critics, who considered Euron Greyjoy's raid on Yara's Iron Fleet,\n",
"# the assembly of Daenerys' allies at Dragonstone, and Arya's reunion with her direwolf Nymeria as\n",
"# highlights of the episode. In the United States, it achieved a viewership of 9.27 million in its\n",
"# initial broadcast.\",\n",
"# 'content_type': 'audio',\n",
"# 'score': None,\n",
"# 'meta': {\n",
"# 'content_audio': './generated_audio_documents/f218707624d9c4f9487f508e4603bf5b.wav',\n",
"# '__initialised__': True,\n",
"# 'type': 'generative',\n",
"# '_split_id': 0,\n",
"# 'audio_format': 'wav',\n",
"# 'sample_rate': 22050,\n",
"# 'name': '2_Stormborn.txt'},\n",
"# 'embedding': None,\n",
"# 'id': '2733e698301f8f94eb70430b874177fd'\n",
"# }>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zW5qaqn1B3St"
},
"source": [
"### Querying\n",
" \n",
"Now we will create a pipeline very similar to the basic `ExtractiveQAPipeline` of Tutorial 1,\n",
"with the addition of a node that converts our answers into audio files! Once the answer is retrieved, we can also listen to the audio version of the document where the answer came from."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "m_oecui1B3Sw"
},
"outputs": [],
"source": [
"from pathlib import Path\n",
"from haystack import Pipeline\n",
"from haystack.nodes import BM25Retriever, FARMReader, AnswerToSpeech\n",
"\n",
"retriever = BM25Retriever(document_store=document_store)\n",
"reader = FARMReader(model_name_or_path=\"deepset/roberta-base-squad2-distilled\", use_gpu=True)\n",
"answer2speech = AnswerToSpeech(\n",
" model_name_or_path=\"espnet/kan-bayashi_ljspeech_vits\", generated_audio_dir=Path(\"./audio_answers\")\n",
")\n",
"\n",
"audio_pipeline = Pipeline()\n",
"audio_pipeline.add_node(retriever, name=\"Retriever\", inputs=[\"Query\"])\n",
"audio_pipeline.add_node(reader, name=\"Reader\", inputs=[\"Retriever\"])\n",
"audio_pipeline.add_node(answer2speech, name=\"AnswerToSpeech\", inputs=[\"Reader\"])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oV1KHzXGB3Sy"
},
"source": [
"## Ask a question!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "S-ZMUBzpB3Sz",
"pycharm": {
"is_executing": false
`AnswerToSpeech` (#2584) * Add new audio answer primitives * Add AnswerToSpeech * Add dependency group * Update Documentation & Code Style * Extract TextToSpeech in a helper class, create DocumentToSpeech and primitives * Add tests * Update Documentation & Code Style * Add ability to compress audio and more tests * Add audio group to test, all and all-gpu * fix pylint * Update Documentation & Code Style * Accidental git tag * Try pleasing mypy * Update Documentation & Code Style * fix pylint * Add warning for missing OS library and support in CI * Try fixing mypy * Update Documentation & Code Style * Add docs, simplify args for audio nodes and add tutorials * Fix mypy * Fix run_batch * Feedback on tutorials * fix mypy and pylint * Fix mypy again * Fix mypy yet again * Fix the ci * Fix dicts merge and install ffmpeg on CI * Make the audio nodes import safe * Trying to increase tolerance in audio test * Fix import paths * fix linter * Update Documentation & Code Style * Add audio libs in unit tests * Update _text_to_speech.py * Update answer_to_speech.py * Use dedicated dataset & update telemetry * Remove and use distilled roberta * Revert special primitives so that the nodes run in indexing * Improve tutorials and fix smaller bugs * Update Documentation & Code Style * Fix serialization issue * Update Documentation & Code Style * Improve tutorial * Update Documentation & Code Style * Update _text_to_speech.py * Minor lg updates * Minor lg updates to tutorial * Making indexing work in tutorials * Update Documentation & Code Style * Improve docstrings * Try to use GPU when available * Update Documentation & Code Style * Fixi mypy and pylint * Try to pass the device correctly * Update Documentation & Code Style * Use type of device * use .cpu() * Improve .ipynb * update apt index to be able to download libsndfile1 * Fix SpeechDocument.from_dict() * Change pip URL Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-06-15 10:13:18 +02:00
}
},
"outputs": [],
"source": [
"# You can configure how many candidates the Reader and Retriever shall return\n",
"# The higher top_k_retriever, the better (but also the slower) your answers.\n",
"prediction = audio_pipeline.run(\n",
" query=\"Who is the father of Arya Stark?\", params={\"Retriever\": {\"top_k\": 10}, \"Reader\": {\"top_k\": 5}}\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "vpFSxtNNB3S1"
},
"outputs": [],
"source": [
"# Now you can either print the object directly...\n",
"from pprint import pprint\n",
"\n",
"pprint(prediction)\n",
"\n",
"# Sample output:\n",
"# {\n",
"# 'answers': [ <SpeechAnswer:\n",
"# answer_audio=PosixPath('generated_audio_answers/fc704210136643b833515ba628eb4b2a.wav'),\n",
"# answer=\"Daenerys Targaryen\",\n",
"# context_audio=PosixPath('generated_audio_answers/8c562ebd7e7f41e1f9208384957df173.wav'),\n",
"# context='...'\n",
"# type='extractive', score=0.9919578731060028,\n",
"# offsets_in_document=[{'start': 608, 'end': 615}], offsets_in_context=[{'start': 72, 'end': 79}],\n",
"# document_id='cc75f739897ecbf8c14657b13dda890e', meta={'name': '43_Arya_Stark.txt'}} >,\n",
"# <SpeechAnswer:\n",
"# answer_audio=PosixPath('generated_audio_answers/07d6265486b22356362387c5a098ba7d.wav'),\n",
"# answer=\"Daenerys\",\n",
"# context_audio=PosixPath('generated_audio_answers/3f1ca228d6c4cfb633e55f89e97de7ac.wav'),\n",
"# context='...'\n",
"# type='extractive', score=0.9767240881919861,\n",
"# offsets_in_document=[{'start': 3687, 'end': 3801}], offsets_in_context=[{'start': 18, 'end': 132}],\n",
"# document_id='9acf17ec9083c4022f69eb4a37187080', meta={'name': '43_Arya_Stark.txt'}}>,\n",
"# ...\n",
"# ]\n",
"# 'documents': [ <SpeechDocument:\n",
"# content_type='text', score=0.8034909798951382, meta={'name': '43_Arya_Stark.txt'}, embedding=None, id=d1f36ec7170e4c46cde65787fe125dfe',\n",
"# content_audio=PosixPath('generated_audio_documents/07d6265486b22356362387c5a098ba7d.wav'),\n",
"# content='The title of the episode refers to both Daenerys Targaryen, who was born during a ...'>,\n",
"# <SpeechDocument:\n",
"# content_type='text', score=0.8002150354529785, meta={'name': '191_Gendry.txt'}, embedding=None, id='dd4e070a22896afa81748d6510006d2',\n",
"# content_audio=PosixPath('generated_audio_documents/07d6265486b22356362387c5a098ba7d.wav'),\n",
"# content='\"Stormborn\" received praise from critics, who considered Euron Greyjoy's raid on ...'>,\n",
"# ...\n",
"# ],\n",
"# 'no_ans_gap': 11.688868522644043,\n",
"# 'node_id': 'Reader',\n",
"# 'params': {'Reader': {'top_k': 5}, 'Retriever': {'top_k': 5}},\n",
"# 'query': 'Who was born during a storm?',\n",
"# 'root_node': 'Query'\n",
"# }"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YFCfiP97B3S3",
"pycharm": {
"is_executing": false,
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from haystack.utils import print_answers\n",
"\n",
"# ...or use a util to simplify the output\n",
"# Change `minimum` to `medium` or `all` to raise the level of detail\n",
"print_answers(prediction, details=\"minimum\")\n",
"\n",
"# Sample output:\n",
"#\n",
"# Query: Who was born during a storm\n",
"# Answers:\n",
"# [ { 'answer_audio': PosixPath('generated_audio_answers/07d6265486b22356362387c5a098ba7d.wav'),\n",
"# 'answer': 'Daenerys Targaryen',\n",
"# 'context_transcript': PosixPath('generated_audio_answers/3f1ca228d6c4cfb633e55f89e97de7ac.wav'),\n",
"# 'context': ' refers to both Daenerys Targaryen, who was born during a terrible storm, and '},\n",
"# { 'answer_audio': PosixPath('generated_audio_answers/83c3a02141cac4caffe0718cfd6c405c.wav'),\n",
"# 'answer': 'Daenerys',\n",
"# 'context_audio': PosixPath('generated_audio_answers/8c562ebd7e7f41e1f9208384957df173.wav'),\n",
"# 'context': 'The title of the episode refers to both Daenerys Targaryen, who was born during a terrible storm'},\n",
"# ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# The document the first answer was extracted from\n",
"original_document = [doc for doc in prediction[\"documents\"] if doc.id == prediction[\"answers\"][0].document_id][0]\n",
"pprint(original_document)\n",
"\n",
"# Sample output\n",
"#\n",
"# <Document: {\n",
"# 'content': '\"'''Stormborn'''\" is the second episode of the seventh season of HBO's fantasy television\n",
"# series ''Game of Thrones'', and the 62nd overall. The episode was written by Bryan Cogman,\n",
"# and directed by Mark Mylod. The title of the episode refers to both Daenerys Targaryen,\n",
"# who was born during a terrible storm, and Euron Greyjoy, who declares himself to be \"the storm\".',\n",
"# 'content_type': 'audio',\n",
"# 'score': 0.6269117688771539,\n",
"# 'embedding': None,\n",
"# 'id': '9352f650b36f93ab99684fd4746af5c1'\n",
"# 'meta': {\n",
"# 'content_audio': '/home/sara/work/haystack/generated_audio_documents/2c9223d47801b0918f2db2ad778c3d5a.wav',\n",
"# 'type': 'generative',\n",
"# '_split_id': 19,\n",
"# 'audio_format': 'wav',\n",
"# 'sample_rate': 22050,\n",
"# 'name': '2_Stormborn.txt'}\n",
"# }>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FXf-kTn4B3S6"
},
"source": [
"### Hear them out!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cJJVpT7dB3S7"
},
"outputs": [],
"source": [
"from IPython.display import display, Audio\n",
"import soundfile as sf"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "usGVf1N6B3S8"
},
"outputs": [],
"source": [
"# The first answer in isolation\n",
"\n",
"print(\"Answer: \", prediction[\"answers\"][0].answer)\n",
"\n",
"speech, _ = sf.read(prediction[\"answers\"][0].answer_audio)\n",
"display(Audio(speech, rate=24000))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "yTFwNJqtB3S9"
},
"outputs": [],
"source": [
"# The context of the first answer\n",
"\n",
"print(\"Context: \", prediction[\"answers\"][0].context)\n",
"\n",
"speech, _ = sf.read(prediction[\"answers\"][0].context_audio)\n",
"display(Audio(speech, rate=24000))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xAj7Xm0EB3S-"
},
"outputs": [],
"source": [
"# The document the first answer was extracted from\n",
"\n",
"document = [doc for doc in prediction[\"documents\"] if doc.id == prediction[\"answers\"][0].document_id][0]\n",
"\n",
"print(\"Document: \", document.content)\n",
"\n",
"speech, _ = sf.read(document.meta[\"content_audio\"])\n",
"display(Audio(speech, rate=24000))"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"id": "wJpoQQNdB3S-",
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## About us\n",
"\n",
"This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany\n",
"\n",
"We bring NLP to the industry via open source! \n",
"Our focus: Industry specific language models & large scale QA systems. \n",
" \n",
"Some of our other work: \n",
"- [German BERT](https://deepset.ai/german-bert)\n",
"- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)\n",
"- [FARM](https://github.com/deepset-ai/FARM)\n",
"\n",
"Get in touch:\n",
"[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)\n",
"\n",
"By the way: [we're hiring!](https://www.deepset.ai/jobs)\n"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"name": "Tutorial17_Audio.ipynb",
"provenance": []
},
"interpreter": {
"hash": "608574092bbd30ec12f87341bba285fb17e1c9fb49d850a21d7829c65ef2f8c3"
},
"kernelspec": {
"display_name": "Python 3.9.7 ('venv': venv)",
"language": "python",
"name": "python3"
`AnswerToSpeech` (#2584) * Add new audio answer primitives * Add AnswerToSpeech * Add dependency group * Update Documentation & Code Style * Extract TextToSpeech in a helper class, create DocumentToSpeech and primitives * Add tests * Update Documentation & Code Style * Add ability to compress audio and more tests * Add audio group to test, all and all-gpu * fix pylint * Update Documentation & Code Style * Accidental git tag * Try pleasing mypy * Update Documentation & Code Style * fix pylint * Add warning for missing OS library and support in CI * Try fixing mypy * Update Documentation & Code Style * Add docs, simplify args for audio nodes and add tutorials * Fix mypy * Fix run_batch * Feedback on tutorials * fix mypy and pylint * Fix mypy again * Fix mypy yet again * Fix the ci * Fix dicts merge and install ffmpeg on CI * Make the audio nodes import safe * Trying to increase tolerance in audio test * Fix import paths * fix linter * Update Documentation & Code Style * Add audio libs in unit tests * Update _text_to_speech.py * Update answer_to_speech.py * Use dedicated dataset & update telemetry * Remove and use distilled roberta * Revert special primitives so that the nodes run in indexing * Improve tutorials and fix smaller bugs * Update Documentation & Code Style * Fix serialization issue * Update Documentation & Code Style * Improve tutorial * Update Documentation & Code Style * Update _text_to_speech.py * Minor lg updates * Minor lg updates to tutorial * Making indexing work in tutorials * Update Documentation & Code Style * Improve docstrings * Try to use GPU when available * Update Documentation & Code Style * Fixi mypy and pylint * Try to pass the device correctly * Update Documentation & Code Style * Use type of device * use .cpu() * Improve .ipynb * update apt index to be able to download libsndfile1 * Fix SpeechDocument.from_dict() * Change pip URL Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-06-15 10:13:18 +02:00
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 0
`AnswerToSpeech` (#2584) * Add new audio answer primitives * Add AnswerToSpeech * Add dependency group * Update Documentation & Code Style * Extract TextToSpeech in a helper class, create DocumentToSpeech and primitives * Add tests * Update Documentation & Code Style * Add ability to compress audio and more tests * Add audio group to test, all and all-gpu * fix pylint * Update Documentation & Code Style * Accidental git tag * Try pleasing mypy * Update Documentation & Code Style * fix pylint * Add warning for missing OS library and support in CI * Try fixing mypy * Update Documentation & Code Style * Add docs, simplify args for audio nodes and add tutorials * Fix mypy * Fix run_batch * Feedback on tutorials * fix mypy and pylint * Fix mypy again * Fix mypy yet again * Fix the ci * Fix dicts merge and install ffmpeg on CI * Make the audio nodes import safe * Trying to increase tolerance in audio test * Fix import paths * fix linter * Update Documentation & Code Style * Add audio libs in unit tests * Update _text_to_speech.py * Update answer_to_speech.py * Use dedicated dataset & update telemetry * Remove and use distilled roberta * Revert special primitives so that the nodes run in indexing * Improve tutorials and fix smaller bugs * Update Documentation & Code Style * Fix serialization issue * Update Documentation & Code Style * Improve tutorial * Update Documentation & Code Style * Update _text_to_speech.py * Minor lg updates * Minor lg updates to tutorial * Making indexing work in tutorials * Update Documentation & Code Style * Improve docstrings * Try to use GPU when available * Update Documentation & Code Style * Fixi mypy and pylint * Try to pass the device correctly * Update Documentation & Code Style * Use type of device * use .cpu() * Improve .ipynb * update apt index to be able to download libsndfile1 * Fix SpeechDocument.from_dict() * Change pip URL Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-06-15 10:13:18 +02:00
}