diff --git a/docs/_src/api/api/pipelines.md b/docs/_src/api/api/pipelines.md index da0b500bc..64dc013b3 100644 --- a/docs/_src/api/api/pipelines.md +++ b/docs/_src/api/api/pipelines.md @@ -425,7 +425,7 @@ E.g. you can call execute_eval_run() multiple times with different retrievers in - `index_pipeline`: The indexing pipeline to use. - `query_pipeline`: The query pipeline to evaluate. -- `evaluation_set_labels`: The labels to evaluate on forming an evalution set. +- `evaluation_set_labels`: The labels to evaluate on forming an evaluation set. - `corpus_file_paths`: The files to be indexed and searched during evaluation forming a corpus. - `experiment_name`: The name of the experiment - `experiment_run_name`: The name of the experiment run diff --git a/docs/_src/tutorials/tutorials/1.md b/docs/_src/tutorials/tutorials/1.md index 64c107788..c6a5b1382 100644 --- a/docs/_src/tutorials/tutorials/1.md +++ b/docs/_src/tutorials/tutorials/1.md @@ -139,7 +139,7 @@ print(docs[:3]) document_store.write_documents(docs) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/_src/tutorials/tutorials/12.md b/docs/_src/tutorials/tutorials/12.md index 3acf19f7f..93156caac 100644 --- a/docs/_src/tutorials/tutorials/12.md +++ b/docs/_src/tutorials/tutorials/12.md @@ -76,7 +76,7 @@ docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split document_store.write_documents(docs) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/_src/tutorials/tutorials/15.md b/docs/_src/tutorials/tutorials/15.md index 136ca41ae..e585380fc 100644 --- a/docs/_src/tutorials/tutorials/15.md +++ b/docs/_src/tutorials/tutorials/15.md @@ -130,7 +130,7 @@ print(tables[0].content) print(tables[0].meta) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/_src/tutorials/tutorials/3.md b/docs/_src/tutorials/tutorials/3.md index d6a5bfb30..8626fda67 100644 --- a/docs/_src/tutorials/tutorials/3.md +++ b/docs/_src/tutorials/tutorials/3.md @@ -98,7 +98,7 @@ print(docs[:3]) document_store.write_documents(docs) ``` -## Initalize Retriever, Reader & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/_src/tutorials/tutorials/5.md b/docs/_src/tutorials/tutorials/5.md index 2e9407b0c..fbe1ac04d 100644 --- a/docs/_src/tutorials/tutorials/5.md +++ b/docs/_src/tutorials/tutorials/5.md @@ -180,7 +180,7 @@ Here we evaluate retriever and reader in open domain fashion on the full corpus correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the predicted answer string, regardless of which document this came from and the position of the extracted span. -The generation of predictions is seperated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. +The generation of predictions is separated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. 
diff --git a/docs/_src/tutorials/tutorials/6.md b/docs/_src/tutorials/tutorials/6.md index 3ba561d2a..0f4d2aff6 100644 --- a/docs/_src/tutorials/tutorials/6.md +++ b/docs/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -147,7 +147,7 @@ docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split document_store.write_documents(docs) ``` -### Initalize Retriever, Reader & Pipeline +### Initialize Retriever, Reader & Pipeline #### Retriever diff --git a/docs/_src/tutorials/tutorials/8.md b/docs/_src/tutorials/tutorials/8.md index fdd648bfd..7d8d09170 100644 --- a/docs/_src/tutorials/tutorials/8.md +++ b/docs/_src/tutorials/tutorials/8.md @@ -66,7 +66,7 @@ fetch_archive_from_http(url=s3_url, output_dir=doc_dir) Haystack's converter classes are designed to help you turn files on your computer into the documents that can be processed by the Haystack pipeline. There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika. -The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected. +The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected. ```python diff --git a/docs/v0.10.0/_src/tutorials/tutorials/1.md b/docs/v0.10.0/_src/tutorials/tutorials/1.md index 1017de4c3..a2bc86b73 100644 --- a/docs/v0.10.0/_src/tutorials/tutorials/1.md +++ b/docs/v0.10.0/_src/tutorials/tutorials/1.md @@ -142,7 +142,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v0.10.0/_src/tutorials/tutorials/12.md b/docs/v0.10.0/_src/tutorials/tutorials/12.md index 4dca92d3e..240b06962 100644 --- a/docs/v0.10.0/_src/tutorials/tutorials/12.md +++ b/docs/v0.10.0/_src/tutorials/tutorials/12.md @@ -73,7 +73,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/v0.10.0/_src/tutorials/tutorials/3.md b/docs/v0.10.0/_src/tutorials/tutorials/3.md index c8d78e121..e5a38e6be 100644 --- a/docs/v0.10.0/_src/tutorials/tutorials/3.md +++ b/docs/v0.10.0/_src/tutorials/tutorials/3.md @@ -101,7 +101,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v0.10.0/_src/tutorials/tutorials/6.md b/docs/v0.10.0/_src/tutorials/tutorials/6.md index ca52ee2b4..3fc77aa7a 100644 --- a/docs/v0.10.0/_src/tutorials/tutorials/6.md +++ b/docs/v0.10.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. 
(2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -145,7 +145,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader & Pipeline +### Initialize Retriever, Reader & Pipeline #### Retriever diff --git a/docs/v0.4.0/_src/tutorials/tutorials/1.md b/docs/v0.4.0/_src/tutorials/tutorials/1.md index ce1d7aed0..4a5a61707 100644 --- a/docs/v0.4.0/_src/tutorials/tutorials/1.md +++ b/docs/v0.4.0/_src/tutorials/tutorials/1.md @@ -127,7 +127,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.4.0/_src/tutorials/tutorials/3.md b/docs/v0.4.0/_src/tutorials/tutorials/3.md index d177bd1e0..c5a84b956 100644 --- a/docs/v0.4.0/_src/tutorials/tutorials/3.md +++ b/docs/v0.4.0/_src/tutorials/tutorials/3.md @@ -87,7 +87,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.4.0/_src/tutorials/tutorials/6.md b/docs/v0.4.0/_src/tutorials/tutorials/6.md index ec426daad..777e7a772 100644 --- a/docs/v0.4.0/_src/tutorials/tutorials/6.md +++ b/docs/v0.4.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -124,7 +124,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader, & Finder +### Initialize Retriever, Reader & Finder #### Retriever diff --git a/docs/v0.5.0/_src/tutorials/tutorials/1.md b/docs/v0.5.0/_src/tutorials/tutorials/1.md index 92faee639..ea01c1ece 100644 --- a/docs/v0.5.0/_src/tutorials/tutorials/1.md +++ b/docs/v0.5.0/_src/tutorials/tutorials/1.md @@ -127,7 +127,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.5.0/_src/tutorials/tutorials/3.md b/docs/v0.5.0/_src/tutorials/tutorials/3.md index 2a42ae103..76eb56da8 100644 --- a/docs/v0.5.0/_src/tutorials/tutorials/3.md +++ b/docs/v0.5.0/_src/tutorials/tutorials/3.md @@ -87,7 +87,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.5.0/_src/tutorials/tutorials/6.md b/docs/v0.5.0/_src/tutorials/tutorials/6.md index ec426daad..777e7a772 100644 --- a/docs/v0.5.0/_src/tutorials/tutorials/6.md +++ b/docs/v0.5.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. 
Original Abstract: @@ -124,7 +124,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader, & Finder +### Initialize Retriever, Reader & Finder #### Retriever diff --git a/docs/v0.6.0/_src/tutorials/tutorials/1.md b/docs/v0.6.0/_src/tutorials/tutorials/1.md index d7ffdb814..9c2fa7599 100644 --- a/docs/v0.6.0/_src/tutorials/tutorials/1.md +++ b/docs/v0.6.0/_src/tutorials/tutorials/1.md @@ -129,7 +129,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.6.0/_src/tutorials/tutorials/3.md b/docs/v0.6.0/_src/tutorials/tutorials/3.md index ebd353bae..15f520182 100644 --- a/docs/v0.6.0/_src/tutorials/tutorials/3.md +++ b/docs/v0.6.0/_src/tutorials/tutorials/3.md @@ -89,7 +89,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.6.0/_src/tutorials/tutorials/6.md b/docs/v0.6.0/_src/tutorials/tutorials/6.md index 1b5ecb3b7..b37d1990a 100644 --- a/docs/v0.6.0/_src/tutorials/tutorials/6.md +++ b/docs/v0.6.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -129,7 +129,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader, & Finder +### Initialize Retriever, Reader & Finder #### Retriever diff --git a/docs/v0.7.0/_src/tutorials/tutorials/1.md b/docs/v0.7.0/_src/tutorials/tutorials/1.md index 03dc3438e..81374f64c 100644 --- a/docs/v0.7.0/_src/tutorials/tutorials/1.md +++ b/docs/v0.7.0/_src/tutorials/tutorials/1.md @@ -142,7 +142,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.7.0/_src/tutorials/tutorials/3.md b/docs/v0.7.0/_src/tutorials/tutorials/3.md index aa1f10f4f..1f4066230 100644 --- a/docs/v0.7.0/_src/tutorials/tutorials/3.md +++ b/docs/v0.7.0/_src/tutorials/tutorials/3.md @@ -103,7 +103,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.7.0/_src/tutorials/tutorials/6.md b/docs/v0.7.0/_src/tutorials/tutorials/6.md index 76adde7c2..c07a2a5f4 100644 --- a/docs/v0.7.0/_src/tutorials/tutorials/6.md +++ b/docs/v0.7.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. 
Original Abstract: @@ -129,7 +129,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader, & Finder +### Initialize Retriever, Reader & Finder #### Retriever diff --git a/docs/v0.8.0/_src/tutorials/tutorials/1.md b/docs/v0.8.0/_src/tutorials/tutorials/1.md index 98ad3bac4..e1d4d815a 100644 --- a/docs/v0.8.0/_src/tutorials/tutorials/1.md +++ b/docs/v0.8.0/_src/tutorials/tutorials/1.md @@ -140,7 +140,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.8.0/_src/tutorials/tutorials/3.md b/docs/v0.8.0/_src/tutorials/tutorials/3.md index b49c16645..f1d3c5b0a 100644 --- a/docs/v0.8.0/_src/tutorials/tutorials/3.md +++ b/docs/v0.8.0/_src/tutorials/tutorials/3.md @@ -101,7 +101,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.8.0/_src/tutorials/tutorials/6.md b/docs/v0.8.0/_src/tutorials/tutorials/6.md index b4e18c209..02418f3d2 100644 --- a/docs/v0.8.0/_src/tutorials/tutorials/6.md +++ b/docs/v0.8.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -127,7 +127,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader, & Finder +### Initialize Retriever, Reader & Finder #### Retriever diff --git a/docs/v0.9.0/_src/tutorials/tutorials/1.md b/docs/v0.9.0/_src/tutorials/tutorials/1.md index c567da277..a1fb220cd 100644 --- a/docs/v0.9.0/_src/tutorials/tutorials/1.md +++ b/docs/v0.9.0/_src/tutorials/tutorials/1.md @@ -142,7 +142,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.9.0/_src/tutorials/tutorials/12.md b/docs/v0.9.0/_src/tutorials/tutorials/12.md index ec692c907..17aade676 100644 --- a/docs/v0.9.0/_src/tutorials/tutorials/12.md +++ b/docs/v0.9.0/_src/tutorials/tutorials/12.md @@ -73,7 +73,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/v0.9.0/_src/tutorials/tutorials/3.md b/docs/v0.9.0/_src/tutorials/tutorials/3.md index 455e4640e..ce3289383 100644 --- a/docs/v0.9.0/_src/tutorials/tutorials/3.md +++ b/docs/v0.9.0/_src/tutorials/tutorials/3.md @@ -102,7 +102,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Finder +## Initialize Retriever, Reader & Finder ### Retriever diff --git a/docs/v0.9.0/_src/tutorials/tutorials/6.md b/docs/v0.9.0/_src/tutorials/tutorials/6.md index 55df9c87e..df4878066 100644 --- a/docs/v0.9.0/_src/tutorials/tutorials/6.md +++ b/docs/v0.9.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely 
because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -146,7 +146,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader, & Finder +### Initialize Retriever, Reader & Finder #### Retriever diff --git a/docs/v1.0.0/_src/api/api/primitives.md b/docs/v1.0.0/_src/api/api/primitives.md index a1c51e899..3ae170694 100644 --- a/docs/v1.0.0/_src/api/api/primitives.md +++ b/docs/v1.0.0/_src/api/api/primitives.md @@ -280,10 +280,10 @@ The DataFrames have the following schema: - context (answers only): the surrounding context of the answer within the document - offsets_in_document (answers only): the position or offsets within the document the answer was found - gold_answers (answers only): the answers to be given -- gold_offsets_in_documents (answers only): the positon or offsets of the gold answer within the document +- gold_offsets_in_documents (answers only): the position or offsets of the gold answer within the document - exact_match (answers only): metric depicting if the answer exactly matches the gold label - f1 (answers only): metric depicting how well the answer overlaps with the gold label on token basis -- sas (answers only, optional): metric depciting how well the answer matches the gold label on a semantic basis +- sas (answers only, optional): metric depicting how well the answer matches the gold label on a semantic basis **Arguments**: diff --git a/docs/v1.0.0/_src/tutorials/tutorials/1.md b/docs/v1.0.0/_src/tutorials/tutorials/1.md index c81f9b780..c24ea5d2d 100644 --- a/docs/v1.0.0/_src/tutorials/tutorials/1.md +++ b/docs/v1.0.0/_src/tutorials/tutorials/1.md @@ -141,7 +141,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.0.0/_src/tutorials/tutorials/12.md b/docs/v1.0.0/_src/tutorials/tutorials/12.md index 8f7744dea..a27f77914 100644 --- a/docs/v1.0.0/_src/tutorials/tutorials/12.md +++ b/docs/v1.0.0/_src/tutorials/tutorials/12.md @@ -75,7 +75,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/v1.0.0/_src/tutorials/tutorials/15.md b/docs/v1.0.0/_src/tutorials/tutorials/15.md index e9d8a6fe2..444dbad6b 100644 --- a/docs/v1.0.0/_src/tutorials/tutorials/15.md +++ b/docs/v1.0.0/_src/tutorials/tutorials/15.md @@ -136,7 +136,7 @@ print(tables[0].content) print(tables[0].meta) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.0.0/_src/tutorials/tutorials/3.md b/docs/v1.0.0/_src/tutorials/tutorials/3.md index bc4c97a0c..9d2b5a4bc 100644 --- a/docs/v1.0.0/_src/tutorials/tutorials/3.md +++ b/docs/v1.0.0/_src/tutorials/tutorials/3.md @@ -100,7 +100,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.0.0/_src/tutorials/tutorials/5.md b/docs/v1.0.0/_src/tutorials/tutorials/5.md index 
3a0d78903..71ce98c88 100644 --- a/docs/v1.0.0/_src/tutorials/tutorials/5.md +++ b/docs/v1.0.0/_src/tutorials/tutorials/5.md @@ -176,7 +176,7 @@ Here we evaluate retriever and reader in open domain fashion on the full corpus correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the predicted answer string, regardless of which document this came from and the position of the extracted span. -The generation of predictions is seperated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. +The generation of predictions is separated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. diff --git a/docs/v1.0.0/_src/tutorials/tutorials/6.md b/docs/v1.0.0/_src/tutorials/tutorials/6.md index c174bc9b3..e21936bdd 100644 --- a/docs/v1.0.0/_src/tutorials/tutorials/6.md +++ b/docs/v1.0.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -145,7 +145,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader & Pipeline +### Initialize Retriever, Reader & Pipeline #### Retriever diff --git a/docs/v1.0.0/_src/tutorials/tutorials/8.md b/docs/v1.0.0/_src/tutorials/tutorials/8.md index 430e05d84..d93201fd8 100644 --- a/docs/v1.0.0/_src/tutorials/tutorials/8.md +++ b/docs/v1.0.0/_src/tutorials/tutorials/8.md @@ -65,7 +65,7 @@ fetch_archive_from_http(url=s3_url, output_dir=doc_dir) Haystack's converter classes are designed to help you turn files on your computer into the documents that can be processed by the Haystack pipeline. There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika. -The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected. +The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected. For converting PDFs, try changing the encoding to UTF-8 if the conversion isn't great. 
diff --git a/docs/v1.1.0/_src/api/api/primitives.md b/docs/v1.1.0/_src/api/api/primitives.md index 457112423..e06358364 100644 --- a/docs/v1.1.0/_src/api/api/primitives.md +++ b/docs/v1.1.0/_src/api/api/primitives.md @@ -272,7 +272,7 @@ The DataFrames have the following schema: - context (answers only): the surrounding context of the answer within the document - exact_match (answers only): metric depicting if the answer exactly matches the gold label - f1 (answers only): metric depicting how well the answer overlaps with the gold label on token basis -- sas (answers only, optional): metric depciting how well the answer matches the gold label on a semantic basis +- sas (answers only, optional): metric depicting how well the answer matches the gold label on a semantic basis - gold_document_contents (documents only): the contents of the gold documents - content (documents only): the content of the document - gold_id_match (documents only): metric depicting whether one of the gold document ids matches the document @@ -282,7 +282,7 @@ The DataFrames have the following schema: - document_id: the id of the document that has been retrieved or that contained the answer - gold_document_ids: the documents to be retrieved - offsets_in_document (answers only): the position or offsets within the document the answer was found -- gold_offsets_in_documents (answers only): the positon or offsets of the gold answer within the document +- gold_offsets_in_documents (answers only): the position or offsets of the gold answer within the document - type: 'answer' or 'document' - node: the node name - eval_mode: evaluation mode depicting whether the evaluation was executed in integrated or isolated mode. diff --git a/docs/v1.1.0/_src/tutorials/tutorials/1.md b/docs/v1.1.0/_src/tutorials/tutorials/1.md index c81f9b780..c24ea5d2d 100644 --- a/docs/v1.1.0/_src/tutorials/tutorials/1.md +++ b/docs/v1.1.0/_src/tutorials/tutorials/1.md @@ -141,7 +141,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.1.0/_src/tutorials/tutorials/12.md b/docs/v1.1.0/_src/tutorials/tutorials/12.md index f5283e70a..80cc99387 100644 --- a/docs/v1.1.0/_src/tutorials/tutorials/12.md +++ b/docs/v1.1.0/_src/tutorials/tutorials/12.md @@ -75,7 +75,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/v1.1.0/_src/tutorials/tutorials/15.md b/docs/v1.1.0/_src/tutorials/tutorials/15.md index af340b277..15a8feee6 100644 --- a/docs/v1.1.0/_src/tutorials/tutorials/15.md +++ b/docs/v1.1.0/_src/tutorials/tutorials/15.md @@ -137,7 +137,7 @@ print(tables[0].content) print(tables[0].meta) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.1.0/_src/tutorials/tutorials/3.md b/docs/v1.1.0/_src/tutorials/tutorials/3.md index bc4c97a0c..9d2b5a4bc 100644 --- a/docs/v1.1.0/_src/tutorials/tutorials/3.md +++ b/docs/v1.1.0/_src/tutorials/tutorials/3.md @@ -100,7 +100,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.1.0/_src/tutorials/tutorials/5.md b/docs/v1.1.0/_src/tutorials/tutorials/5.md index 174e7e7b7..7f8583f4c 100644 --- 
a/docs/v1.1.0/_src/tutorials/tutorials/5.md +++ b/docs/v1.1.0/_src/tutorials/tutorials/5.md @@ -176,7 +176,7 @@ Here we evaluate retriever and reader in open domain fashion on the full corpus correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the predicted answer string, regardless of which document this came from and the position of the extracted span. -The generation of predictions is seperated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. +The generation of predictions is separated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. diff --git a/docs/v1.1.0/_src/tutorials/tutorials/6.md b/docs/v1.1.0/_src/tutorials/tutorials/6.md index 4d05426eb..8835ebe79 100644 --- a/docs/v1.1.0/_src/tutorials/tutorials/6.md +++ b/docs/v1.1.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -145,7 +145,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader & Pipeline +### Initialize Retriever, Reader & Pipeline #### Retriever diff --git a/docs/v1.1.0/_src/tutorials/tutorials/8.md b/docs/v1.1.0/_src/tutorials/tutorials/8.md index 430e05d84..d93201fd8 100644 --- a/docs/v1.1.0/_src/tutorials/tutorials/8.md +++ b/docs/v1.1.0/_src/tutorials/tutorials/8.md @@ -65,7 +65,7 @@ fetch_archive_from_http(url=s3_url, output_dir=doc_dir) Haystack's converter classes are designed to help you turn files on your computer into the documents that can be processed by the Haystack pipeline. There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika. -The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected. +The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected. For converting PDFs, try changing the encoding to UTF-8 if the conversion isn't great. 
diff --git a/docs/v1.2.0/_src/api/api/primitives.md b/docs/v1.2.0/_src/api/api/primitives.md index 0a4c02efd..0270fa175 100644 --- a/docs/v1.2.0/_src/api/api/primitives.md +++ b/docs/v1.2.0/_src/api/api/primitives.md @@ -301,7 +301,7 @@ The DataFrames have the following schema: - context (answers only): the surrounding context of the answer within the document - exact_match (answers only): metric depicting if the answer exactly matches the gold label - f1 (answers only): metric depicting how well the answer overlaps with the gold label on token basis -- sas (answers only, optional): metric depciting how well the answer matches the gold label on a semantic basis +- sas (answers only, optional): metric depicting how well the answer matches the gold label on a semantic basis - gold_document_contents (documents only): the contents of the gold documents - content (documents only): the content of the document - gold_id_match (documents only): metric depicting whether one of the gold document ids matches the document @@ -311,7 +311,7 @@ The DataFrames have the following schema: - document_id: the id of the document that has been retrieved or that contained the answer - gold_document_ids: the documents to be retrieved - offsets_in_document (answers only): the position or offsets within the document the answer was found -- gold_offsets_in_documents (answers only): the positon or offsets of the gold answer within the document +- gold_offsets_in_documents (answers only): the position or offsets of the gold answer within the document - type: 'answer' or 'document' - node: the node name - eval_mode: evaluation mode depicting whether the evaluation was executed in integrated or isolated mode. diff --git a/docs/v1.2.0/_src/tutorials/tutorials/1.md b/docs/v1.2.0/_src/tutorials/tutorials/1.md index 75233a7d4..c93f07bd4 100644 --- a/docs/v1.2.0/_src/tutorials/tutorials/1.md +++ b/docs/v1.2.0/_src/tutorials/tutorials/1.md @@ -139,7 +139,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.2.0/_src/tutorials/tutorials/12.md b/docs/v1.2.0/_src/tutorials/tutorials/12.md index 1961289b7..6948b42d7 100644 --- a/docs/v1.2.0/_src/tutorials/tutorials/12.md +++ b/docs/v1.2.0/_src/tutorials/tutorials/12.md @@ -76,7 +76,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/v1.2.0/_src/tutorials/tutorials/15.md b/docs/v1.2.0/_src/tutorials/tutorials/15.md index c79957ecd..c8365ef48 100644 --- a/docs/v1.2.0/_src/tutorials/tutorials/15.md +++ b/docs/v1.2.0/_src/tutorials/tutorials/15.md @@ -136,7 +136,7 @@ print(tables[0].content) print(tables[0].meta) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.2.0/_src/tutorials/tutorials/3.md b/docs/v1.2.0/_src/tutorials/tutorials/3.md index dfed9b162..09256bc75 100644 --- a/docs/v1.2.0/_src/tutorials/tutorials/3.md +++ b/docs/v1.2.0/_src/tutorials/tutorials/3.md @@ -98,7 +98,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.2.0/_src/tutorials/tutorials/5.md b/docs/v1.2.0/_src/tutorials/tutorials/5.md index c840d3aa2..3e4aff9c7 100644 --- 
a/docs/v1.2.0/_src/tutorials/tutorials/5.md +++ b/docs/v1.2.0/_src/tutorials/tutorials/5.md @@ -180,7 +180,7 @@ Here we evaluate retriever and reader in open domain fashion on the full corpus correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the predicted answer string, regardless of which document this came from and the position of the extracted span. -The generation of predictions is seperated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. +The generation of predictions is separated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. diff --git a/docs/v1.2.0/_src/tutorials/tutorials/6.md b/docs/v1.2.0/_src/tutorials/tutorials/6.md index 16317e001..42d057034 100644 --- a/docs/v1.2.0/_src/tutorials/tutorials/6.md +++ b/docs/v1.2.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -143,7 +143,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader & Pipeline +### Initialize Retriever, Reader & Pipeline #### Retriever diff --git a/docs/v1.2.0/_src/tutorials/tutorials/8.md b/docs/v1.2.0/_src/tutorials/tutorials/8.md index 88797e34e..43210c6de 100644 --- a/docs/v1.2.0/_src/tutorials/tutorials/8.md +++ b/docs/v1.2.0/_src/tutorials/tutorials/8.md @@ -61,7 +61,7 @@ fetch_archive_from_http(url=s3_url, output_dir=doc_dir) Haystack's converter classes are designed to help you turn files on your computer into the documents that can be processed by the Haystack pipeline. There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika. -The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected. +The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected. For converting PDFs, try changing the encoding to UTF-8 if the conversion isn't great. 
diff --git a/docs/v1.3.0/_src/api/api/primitives.md b/docs/v1.3.0/_src/api/api/primitives.md index 0a4c02efd..0270fa175 100644 --- a/docs/v1.3.0/_src/api/api/primitives.md +++ b/docs/v1.3.0/_src/api/api/primitives.md @@ -301,7 +301,7 @@ The DataFrames have the following schema: - context (answers only): the surrounding context of the answer within the document - exact_match (answers only): metric depicting if the answer exactly matches the gold label - f1 (answers only): metric depicting how well the answer overlaps with the gold label on token basis -- sas (answers only, optional): metric depciting how well the answer matches the gold label on a semantic basis +- sas (answers only, optional): metric depicting how well the answer matches the gold label on a semantic basis - gold_document_contents (documents only): the contents of the gold documents - content (documents only): the content of the document - gold_id_match (documents only): metric depicting whether one of the gold document ids matches the document @@ -311,7 +311,7 @@ The DataFrames have the following schema: - document_id: the id of the document that has been retrieved or that contained the answer - gold_document_ids: the documents to be retrieved - offsets_in_document (answers only): the position or offsets within the document the answer was found -- gold_offsets_in_documents (answers only): the positon or offsets of the gold answer within the document +- gold_offsets_in_documents (answers only): the position or offsets of the gold answer within the document - type: 'answer' or 'document' - node: the node name - eval_mode: evaluation mode depicting whether the evaluation was executed in integrated or isolated mode. diff --git a/docs/v1.3.0/_src/tutorials/tutorials/1.md b/docs/v1.3.0/_src/tutorials/tutorials/1.md index 2f544bb81..0e94bf2e3 100644 --- a/docs/v1.3.0/_src/tutorials/tutorials/1.md +++ b/docs/v1.3.0/_src/tutorials/tutorials/1.md @@ -139,7 +139,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.3.0/_src/tutorials/tutorials/12.md b/docs/v1.3.0/_src/tutorials/tutorials/12.md index 0ccb72fdc..cb7f0bbd9 100644 --- a/docs/v1.3.0/_src/tutorials/tutorials/12.md +++ b/docs/v1.3.0/_src/tutorials/tutorials/12.md @@ -76,7 +76,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/v1.3.0/_src/tutorials/tutorials/15.md b/docs/v1.3.0/_src/tutorials/tutorials/15.md index 1cacbb136..0a780cbc8 100644 --- a/docs/v1.3.0/_src/tutorials/tutorials/15.md +++ b/docs/v1.3.0/_src/tutorials/tutorials/15.md @@ -137,7 +137,7 @@ print(tables[0].content) print(tables[0].meta) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.3.0/_src/tutorials/tutorials/3.md b/docs/v1.3.0/_src/tutorials/tutorials/3.md index f785e9c77..9c1b1d85a 100644 --- a/docs/v1.3.0/_src/tutorials/tutorials/3.md +++ b/docs/v1.3.0/_src/tutorials/tutorials/3.md @@ -98,7 +98,7 @@ print(dicts[:3]) document_store.write_documents(dicts) ``` -## Initalize Retriever, Reader & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.3.0/_src/tutorials/tutorials/5.md b/docs/v1.3.0/_src/tutorials/tutorials/5.md index 9c038dcf6..f4122f9a2 100644 --- 
a/docs/v1.3.0/_src/tutorials/tutorials/5.md +++ b/docs/v1.3.0/_src/tutorials/tutorials/5.md @@ -177,7 +177,7 @@ Here we evaluate retriever and reader in open domain fashion on the full corpus correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the predicted answer string, regardless of which document this came from and the position of the extracted span. -The generation of predictions is seperated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. +The generation of predictions is separated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. diff --git a/docs/v1.3.0/_src/tutorials/tutorials/6.md b/docs/v1.3.0/_src/tutorials/tutorials/6.md index 7aaeea279..033401fc7 100644 --- a/docs/v1.3.0/_src/tutorials/tutorials/6.md +++ b/docs/v1.3.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. Original Abstract: @@ -147,7 +147,7 @@ dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, spl document_store.write_documents(dicts) ``` -### Initalize Retriever, Reader & Pipeline +### Initialize Retriever, Reader & Pipeline #### Retriever diff --git a/docs/v1.3.0/_src/tutorials/tutorials/8.md b/docs/v1.3.0/_src/tutorials/tutorials/8.md index ffbd75de3..9a9139d93 100644 --- a/docs/v1.3.0/_src/tutorials/tutorials/8.md +++ b/docs/v1.3.0/_src/tutorials/tutorials/8.md @@ -63,7 +63,7 @@ fetch_archive_from_http(url=s3_url, output_dir=doc_dir) Haystack's converter classes are designed to help you turn files on your computer into the documents that can be processed by the Haystack pipeline. There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika. -The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected. +The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected. For converting PDFs, try changing the encoding to UTF-8 if the conversion isn't great. diff --git a/docs/v1.4.0/_src/api/api/pipelines.md b/docs/v1.4.0/_src/api/api/pipelines.md index 6b60f4b59..272bafdf8 100644 --- a/docs/v1.4.0/_src/api/api/pipelines.md +++ b/docs/v1.4.0/_src/api/api/pipelines.md @@ -373,7 +373,7 @@ E.g. you can call execute_eval_run() multiple times with different retrievers in - `index_pipeline`: The indexing pipeline to use. - `query_pipeline`: The query pipeline to evaluate. -- `evaluation_set_labels`: The labels to evaluate on forming an evalution set. +- `evaluation_set_labels`: The labels to evaluate on forming an evaluation set. - `corpus_file_paths`: The files to be indexed and searched during evaluation forming a corpus. 
- `experiment_name`: The name of the experiment - `experiment_run_name`: The name of the experiment run diff --git a/docs/v1.4.0/_src/api/api/primitives.md b/docs/v1.4.0/_src/api/api/primitives.md index 1beac3c0e..5b1a42b91 100644 --- a/docs/v1.4.0/_src/api/api/primitives.md +++ b/docs/v1.4.0/_src/api/api/primitives.md @@ -301,7 +301,7 @@ The DataFrames have the following schema: - context (answers only): the surrounding context of the answer within the document - exact_match (answers only): metric depicting if the answer exactly matches the gold label - f1 (answers only): metric depicting how well the answer overlaps with the gold label on token basis -- sas (answers only, optional): metric depciting how well the answer matches the gold label on a semantic basis +- sas (answers only, optional): metric depicting how well the answer matches the gold label on a semantic basis - gold_document_contents (documents only): the contents of the gold documents - content (documents only): the content of the document - gold_id_match (documents only): metric depicting whether one of the gold document ids matches the document @@ -311,7 +311,7 @@ The DataFrames have the following schema: - document_id: the id of the document that has been retrieved or that contained the answer - gold_document_ids: the documents to be retrieved - offsets_in_document (answers only): the position or offsets within the document the answer was found -- gold_offsets_in_documents (answers only): the positon or offsets of the gold answer within the document +- gold_offsets_in_documents (answers only): the position or offsets of the gold answer within the document - type: 'answer' or 'document' - node: the node name - eval_mode: evaluation mode depicting whether the evaluation was executed in integrated or isolated mode. 
diff --git a/docs/v1.4.0/_src/tutorials/tutorials/1.md b/docs/v1.4.0/_src/tutorials/tutorials/1.md index 8a16402f9..7b5aec205 100644 --- a/docs/v1.4.0/_src/tutorials/tutorials/1.md +++ b/docs/v1.4.0/_src/tutorials/tutorials/1.md @@ -139,7 +139,7 @@ print(docs[:3]) document_store.write_documents(docs) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.4.0/_src/tutorials/tutorials/12.md b/docs/v1.4.0/_src/tutorials/tutorials/12.md index a141e1c72..5553479f7 100644 --- a/docs/v1.4.0/_src/tutorials/tutorials/12.md +++ b/docs/v1.4.0/_src/tutorials/tutorials/12.md @@ -76,7 +76,7 @@ docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split document_store.write_documents(docs) ``` -### Initalize Retriever and Reader/Generator +### Initialize Retriever and Reader/Generator #### Retriever diff --git a/docs/v1.4.0/_src/tutorials/tutorials/15.md b/docs/v1.4.0/_src/tutorials/tutorials/15.md index 136ca41ae..e585380fc 100644 --- a/docs/v1.4.0/_src/tutorials/tutorials/15.md +++ b/docs/v1.4.0/_src/tutorials/tutorials/15.md @@ -130,7 +130,7 @@ print(tables[0].content) print(tables[0].meta) ``` -## Initalize Retriever, Reader, & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.4.0/_src/tutorials/tutorials/3.md b/docs/v1.4.0/_src/tutorials/tutorials/3.md index af2425f1f..837488860 100644 --- a/docs/v1.4.0/_src/tutorials/tutorials/3.md +++ b/docs/v1.4.0/_src/tutorials/tutorials/3.md @@ -98,7 +98,7 @@ print(docs[:3]) document_store.write_documents(docs) ``` -## Initalize Retriever, Reader & Pipeline +## Initialize Retriever, Reader & Pipeline ### Retriever diff --git a/docs/v1.4.0/_src/tutorials/tutorials/5.md b/docs/v1.4.0/_src/tutorials/tutorials/5.md index 4b4306c35..5660817f8 100644 --- a/docs/v1.4.0/_src/tutorials/tutorials/5.md +++ b/docs/v1.4.0/_src/tutorials/tutorials/5.md @@ -180,7 +180,7 @@ Here we evaluate retriever and reader in open domain fashion on the full corpus correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the predicted answer string, regardless of which document this came from and the position of the extracted span. -The generation of predictions is seperated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. +The generation of predictions is separated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. diff --git a/docs/v1.4.0/_src/tutorials/tutorials/6.md b/docs/v1.4.0/_src/tutorials/tutorials/6.md index 0b69a80ae..4163e0499 100644 --- a/docs/v1.4.0/_src/tutorials/tutorials/6.md +++ b/docs/v1.4.0/_src/tutorials/tutorials/6.md @@ -45,7 +45,7 @@ Recent work suggests that dual encoders work better, likely because they can dea ### "Dense Passage Retrieval" In this Tutorial, we want to highlight one "Dense Dual-Encoder" called Dense Passage Retriever. -It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. +It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. 
Original Abstract: @@ -147,7 +147,7 @@ docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split document_store.write_documents(docs) ``` -### Initalize Retriever, Reader & Pipeline +### Initialize Retriever, Reader & Pipeline #### Retriever diff --git a/docs/v1.4.0/_src/tutorials/tutorials/8.md b/docs/v1.4.0/_src/tutorials/tutorials/8.md index 7204c0b77..e4cda13a0 100644 --- a/docs/v1.4.0/_src/tutorials/tutorials/8.md +++ b/docs/v1.4.0/_src/tutorials/tutorials/8.md @@ -66,7 +66,7 @@ fetch_archive_from_http(url=s3_url, output_dir=doc_dir) Haystack's converter classes are designed to help you turn files on your computer into the documents that can be processed by the Haystack pipeline. There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika. -The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected. +The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected. ```python diff --git a/haystack/pipelines/base.py b/haystack/pipelines/base.py index 38d2548c9..5a2620d56 100644 --- a/haystack/pipelines/base.py +++ b/haystack/pipelines/base.py @@ -850,7 +850,7 @@ class Pipeline: :param index_pipeline: The indexing pipeline to use. :param query_pipeline: The query pipeline to evaluate. - :param evaluation_set_labels: The labels to evaluate on forming an evalution set. + :param evaluation_set_labels: The labels to evaluate on forming an evaluation set. :param corpus_file_paths: The files to be indexed and searched during evaluation forming a corpus. :param experiment_name: The name of the experiment :param experiment_run_name: The name of the experiment run diff --git a/test/document_stores/test_document_store.py b/test/document_stores/test_document_store.py index 04442a712..05ed1dc17 100644 --- a/test/document_stores/test_document_store.py +++ b/test/document_stores/test_document_store.py @@ -1858,7 +1858,7 @@ def test_DeepsetCloudDocumentStore_fetches_labels_for_evaluation_set(deepset_clo @responses.activate -def test_DeepsetCloudDocumentStore_fetches_lables_for_evaluation_set_raises_deepsetclouderror_when_nothing_found( +def test_DeepsetCloudDocumentStore_fetches_labels_for_evaluation_set_raises_deepsetclouderror_when_nothing_found( deepset_cloud_document_store, ): if MOCK_DC: diff --git a/tutorials/Tutorial12_LFQA.ipynb b/tutorials/Tutorial12_LFQA.ipynb index 0809393c4..ceca8bf61 100644 --- a/tutorials/Tutorial12_LFQA.ipynb +++ b/tutorials/Tutorial12_LFQA.ipynb @@ -142,7 +142,7 @@ "id": "wgjedxx_A6N6" }, "source": [ - "### Initalize Retriever and Reader/Generator\n", + "### Initialize Retriever and Reader/Generator\n", "\n", "#### Retriever\n", "\n", diff --git a/tutorials/Tutorial12_LFQA.py b/tutorials/Tutorial12_LFQA.py index 71b43e160..1a9724bc4 100644 --- a/tutorials/Tutorial12_LFQA.py +++ b/tutorials/Tutorial12_LFQA.py @@ -36,7 +36,7 @@ def tutorial12_lfqa(): document_store.write_documents(docs) """ - Initalize Retriever and Reader/Generator: + Initialize Retriever and Reader/Generator: We use a `DensePassageRetriever` and we invoke `update_embeddings` to index the embeddings of documents in the `FAISSDocumentStore` """ diff --git a/tutorials/Tutorial15_TableQA.ipynb b/tutorials/Tutorial15_TableQA.ipynb index 2782125a4..f80267b9a 100644 --- a/tutorials/Tutorial15_TableQA.ipynb +++ b/tutorials/Tutorial15_TableQA.ipynb @@ -231,7 +231,7 @@ "id": "hmQC1sDmw3d7" }, "source": [ - "## 
Initalize Retriever, Reader, & Pipeline\n", + "## Initialize Retriever, Reader & Pipeline\n", "\n", "### Retriever\n", "\n", diff --git a/tutorials/Tutorial1_Basic_QA_Pipeline.ipynb b/tutorials/Tutorial1_Basic_QA_Pipeline.ipynb index 50e9b4dc0..2ea98e10b 100644 --- a/tutorials/Tutorial1_Basic_QA_Pipeline.ipynb +++ b/tutorials/Tutorial1_Basic_QA_Pipeline.ipynb @@ -202,7 +202,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Initalize Retriever, Reader, & Pipeline\n", + "## Initialize Retriever, Reader & Pipeline\n", "\n", "### Retriever\n", "\n", diff --git a/tutorials/Tutorial1_Basic_QA_Pipeline.py b/tutorials/Tutorial1_Basic_QA_Pipeline.py index c260748ae..e66692173 100755 --- a/tutorials/Tutorial1_Basic_QA_Pipeline.py +++ b/tutorials/Tutorial1_Basic_QA_Pipeline.py @@ -65,7 +65,7 @@ def tutorial1_basic_qa_pipeline(): # Now, let's write the docs to our DB. document_store.write_documents(docs) - # ## Initalize Retriever & Reader + # ## Initialize Retriever & Reader # # ### Retriever # diff --git a/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.ipynb b/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.ipynb index 17171ce98..316d587e2 100644 --- a/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.ipynb +++ b/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.ipynb @@ -166,7 +166,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Initalize Retriever, Reader & Pipeline\n", + "## Initialize Retriever, Reader & Pipeline\n", "\n", "### Retriever\n", "\n", diff --git a/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.py b/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.py index 4a07a8200..d37fc130e 100644 --- a/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.py +++ b/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.py @@ -42,7 +42,7 @@ def tutorial3_basic_qa_pipeline_without_elasticsearch(): # Now, let's write the docs to our DB. document_store.write_documents(docs) - # ## Initalize Retriever, Reader & Pipeline + # ## Initialize Retriever, Reader & Pipeline # # ### Retriever # diff --git a/tutorials/Tutorial5_Evaluation.ipynb b/tutorials/Tutorial5_Evaluation.ipynb index c2f7ca9f7..28a97dbd3 100644 --- a/tutorials/Tutorial5_Evaluation.ipynb +++ b/tutorials/Tutorial5_Evaluation.ipynb @@ -389,7 +389,7 @@ "correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the\n", "predicted answer string, regardless of which document this came from and the position of the extracted span.\n", "\n", - "The generation of predictions is seperated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate.\n" + "The generation of predictions is separated from the calculation of metrics. This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate.\n" ] }, { diff --git a/tutorials/Tutorial5_Evaluation.py b/tutorials/Tutorial5_Evaluation.py index 6eed2caca..e429d549a 100644 --- a/tutorials/Tutorial5_Evaluation.py +++ b/tutorials/Tutorial5_Evaluation.py @@ -95,7 +95,7 @@ def tutorial5_evaluation(): # i.e. a document is considered # correctly retrieved if it contains the gold answer string within it. The reader is evaluated based purely on the # predicted answer string, regardless of which document this came from and the position of the extracted span. 
- # The generation of predictions is seperated from the calculation of metrics. + # The generation of predictions is separated from the calculation of metrics. # This allows you to run the computation-heavy model predictions only once and then iterate flexibly on the metrics or reports you want to generate. pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever) diff --git a/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb b/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb index acf25e346..2989bfc8c 100644 --- a/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb +++ b/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb @@ -45,7 +45,7 @@ "### \"Dense Passage Retrieval\"\n", "\n", "In this Tutorial, we want to highlight one \"Dense Dual-Encoder\" called Dense Passage Retriever. \n", - "It was introdoced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. \n", + "It was introduced by Karpukhin et al. (2020, https://arxiv.org/abs/2004.04906. \n", "\n", "Original Abstract: \n", "\n", @@ -262,7 +262,7 @@ "id": "wgjedxx_A6N6" }, "source": [ - "### Initalize Retriever, Reader & Pipeline\n", + "### Initialize Retriever, Reader & Pipeline\n", "\n", "#### Retriever\n", "\n", diff --git a/tutorials/Tutorial8_Preprocessing.ipynb b/tutorials/Tutorial8_Preprocessing.ipynb index 3f002f76a..0b73814d4 100644 --- a/tutorials/Tutorial8_Preprocessing.ipynb +++ b/tutorials/Tutorial8_Preprocessing.ipynb @@ -150,7 +150,7 @@ "Haystack's converter classes are designed to help you turn files on your computer into the documents\n", "that can be processed by the Haystack pipeline.\n", "There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika.\n", - "The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected." + "The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected." ] }, { diff --git a/tutorials/Tutorial8_Preprocessing.py b/tutorials/Tutorial8_Preprocessing.py index be342c5d5..f7a3a8a4f 100644 --- a/tutorials/Tutorial8_Preprocessing.py +++ b/tutorials/Tutorial8_Preprocessing.py @@ -37,7 +37,7 @@ def tutorial8_preprocessing(): Haystack's converter classes are designed to help you turn files on your computer into the documents that can be processed by the Haystack pipeline. There are file converters for txt, pdf, docx files as well as a converter that is powered by Apache Tika. - The parameter `valid_langugages` does not convert files to the target language, but checks if the conversion worked as expected. + The parameter `valid_languages` does not convert files to the target language, but checks if the conversion worked as expected. """ # Here are some examples of how you would use file converters diff --git a/ui/README.md b/ui/README.md index 660907307..9eb160370 100644 --- a/ui/README.md +++ b/ui/README.md @@ -34,9 +34,9 @@ The evaluation mode leverages the feedback REST API endpoint of haystack. The us In order to use the UI in evaluation mode, you need an ElasticSearch instance with pre-indexed files and the Haystack REST API. You can set the environment up via docker images. For ElasticSearch, you can check out our [documentation](https://haystack.deepset.ai/usage/document-store#initialisation) and for setting up the REST API this [link](https://github.com/deepset-ai/haystack/blob/master/README.md#7-rest-api). -To enter the evaluation mode, select the checkbox "Evaluation mode" in the sidebar. 
The UI will load the predefined questions from the file [`eval_lables_examles`](https://raw.githubusercontent.com/deepset-ai/haystack/master/ui/eval_labels_example.csv). The file needs to be prefilled with your data. This way, the user will get a random question from the set and can give his feedback with the buttons below the questions. To load a new question, click the button "Get random question". +To enter the evaluation mode, select the checkbox "Evaluation mode" in the sidebar. The UI will load the predefined questions from the file [`eval_labels_examples`](https://raw.githubusercontent.com/deepset-ai/haystack/master/ui/eval_labels_example.csv). The file needs to be prefilled with your data. This way, the user will get a random question from the set and can give their feedback with the buttons below the questions. To load a new question, click the button "Get random question". -The file just needs to have two columns separated by semicolon. You can add more columns but the UI will ignore them. Every line represents a questions answer pair. The columns with the questions needs to be named “Question Text” and the answer column “Answer” so that they can be loaded correctly. Currently, the easiest way to create the file is manully by adding question answer pairs. +The file just needs to have two columns separated by a semicolon. You can add more columns, but the UI will ignore them. Every line represents a question-answer pair. The column with the questions needs to be named “Question Text” and the answer column “Answer” so that they can be loaded correctly. Currently, the easiest way to create the file is manually by adding question-answer pairs. The feedback can be exported with the API endpoint `export-doc-qa-feedback`. To learn more about finetuning a model with user feedback, please check out our [docs](https://haystack.deepset.ai/usage/domain-adaptation#user-feedback).