Add boxes for recommendations (#629)

* add boxes for recommendations

* add more recommendation boxes

Co-authored-by: brandenchan <brandenchan@icloud.com>
Markus Paff 2020-11-27 16:00:20 +01:00 committed by GitHub
parent 58bc9aa7f0
commit 88d0ee2c98
6 changed files with 40 additions and 4 deletions

View File

@ -197,6 +197,8 @@ The Document Stores have different characteristics. You should choose one depend
</div>
<div class="recommendation">
#### Our Recommendations
**Restricted environment:** Use the `InMemoryDocumentStore`, if you are just giving Haystack a quick try on a small sample and are working in a restricted environment that complicates running Elasticsearch or other databases
@ -204,3 +206,5 @@ The Document Stores have different characteristics. You should choose one depend
**Allrounder:** Use the `ElasticSearchDocumentStore`, if you want to evaluate the performance of different retrieval options (dense vs. sparse) and are aiming for a smooth transition from PoC to production
**Vector Specialist:** Use the `FAISSDocumentStore`, if you want to focus on dense retrieval and possibly deal with larger datasets
</div>

View File

@ -17,11 +17,18 @@ Though SQuAD is composed entirely of Wikipedia articles, these models are flexib
Before trying to adapt these models to your domain, we’d recommend trying one of the off the shelf models.
We’ve found that these models are often flexible enough for a wide range of use cases.
**Intuition**: Most people probably don’t know what an HP Valve is.
<div class="recommendation">
**Intuition**
Most people probably don’t know what an HP Valve is.
But you don’t always need to know what a HP Valve is to answer “What is connected to a HP Valve?”
The answer might be there in plain language.
In the same way, many QA models have a good enough grasp of language to answer questions about concepts in an unseen domain.
</div>
## Finetuning
Any model that can be loaded into Haystack can also be finetuned within Haystack.
@ -35,10 +42,17 @@ reader.train(data_dir=train_data,
```
At the end of training, the finetuned model will be saved in the specified `save_dir` and can be loaded as a `Reader`.
<div class="recommendation">
**Recommendation**
See Tutorial 2 for a runnable example of this process.
If you’re interested in measuring how much your model has improved,
please also check out Tutorial 5 which walks through the steps needed to perform evaluation.
</div>
## Generating Labels
Using our [Haystack Annotate tool](https://annotate.deepset.ai/login) (Beta),

View File

@ -9,8 +9,14 @@ id: "generatormd"
# Generator
<div class="recommendation">
**Example**
See [Tutorial 7](/docs/latest/tutorial7md) for a guide on how to build your own generative QA system.
</div>
While extractive QA highlights the span of text that answers a query,
generative QA can return a novel text answer that it has composed.
The best current approaches, such as [Retriever-Augmented Generation](https://arxiv.org/abs/2005.11401),

View File

@ -99,9 +99,13 @@ When we talk about Documents in Haystack, we are referring specifically to the i
You might want to use all the text in one file as a Document, or split it into multiple Documents.
This splitting can have a big impact on speed and performance.
**General Guide**: If Haystack is running very slowly, you might want to try splitting your text into smaller Documents.
<div class="recommendation">
**Tip:** If Haystack is running very slowly, you might want to try splitting your text into smaller Documents.
If you want an improvement to performance, you might want to try concatenating text to make larger Documents.
</div>
## Running Queries
**Querying** involves searching for an answer to a given question within the full document store.
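
The tip above about splitting text into smaller Documents can be sketched in plain Python. This is a minimal illustration and not Haystack's own preprocessing code: the function name `split_into_documents` and the `max_words` parameter are hypothetical, and splitting on whitespace is an assumption made to keep the example short.

```python
def split_into_documents(text, max_words=100):
    """Split one long text into chunks of at most `max_words` words.

    Hypothetical helper for illustration only; it is not part of Haystack.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    # Step through the word list in strides of `max_words`.
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


# A 250-word text becomes three chunks of 100, 100 and 50 words.
long_text = "word " * 250
chunks = split_into_documents(long_text, max_words=100)
```

A smaller `max_words` yields more, shorter Documents, which is the direction the tip suggests when queries are slow. A larger value moves toward the longer Documents it suggests when better answers matter more.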

View File

@ -204,6 +204,10 @@ reader = TransformersReader("ahotrod/albert_xxlargev1_squad2_512")
</div>
<div class="recommendation">
**Recommendations:**
**All-rounder**: In the class of base sized models trained on SQuAD, **RoBERTa** has shown better performance than BERT
and can be capably handled by any machine equipped with a single NVidia V100 GPU.
We recommend this as the starting point for anyone wanting to create a performant and computationally reasonable instance of Haystack.
@ -220,6 +224,8 @@ you might like to try ALBERT XXL which has set SoTA performance on SQuAD 2.0.
<!-- _comment: !! How good is it? How much computation resource do you need to run it? !! -->
</div>
<!-- farm-vs-trans: -->
## Deeper Dive: FARM vs Transformers

View File

@ -12,14 +12,16 @@ id: "retrievermd"
The Retriever is a lightweight filter that can quickly go through the full document store and pass a set of candidate documents to the Reader.
It is a tool for sifting out the obvious negative cases, saving the Reader from doing more work than it needs to and speeding up the querying process.
Recommendations:
<div class="recommendation">
** Recommendations**
* BM25 (sparse)
* Dense Passage Retrieval (dense)
</div>
<!-- _comment: !! Example speedup from slides !! -->
<!-- _comment: !! Benchmarks !! -->
Note that not all Retrievers can be paired with every DocumentStore.
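
To make the idea of a sparse Retriever acting as a lightweight filter more concrete, here is a small, self-contained BM25 sketch. It is not Haystack's or Elasticsearch's implementation: the function names, the whitespace tokenizer, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, for brevity.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document (illustrative sketch)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            n_t = doc_freq[term]
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


def retrieve(query, docs, top_k=2):
    """Return the `top_k` highest-scoring documents as candidates."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:top_k]]


docs = [
    "the cat sat on the mat",
    "dogs chase cats",
    "the stock market fell today",
]
candidates = retrieve("stock market", docs, top_k=1)
```

Only the documents returned by `retrieve` would be handed to the more expensive Reader, which is where the speedup described above comes from.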