Mirror of https://github.com/deepset-ai/haystack.git (synced 2025-08-31 11:56:35 +00:00)

Commit 88d0ee2c98 (parent 58bc9aa7f0): Add boxes for recommendations (#629)

* add boxes for recommendations
* add more recommendation boxes

Co-authored-by: brandenchan <brandenchan@icloud.com>
@@ -197,6 +197,8 @@ The Document Stores have different characteristics. You should choose one depend

</div>

<div class="recommendation">

#### Our Recommendations

**Restricted environment:** Use the `InMemoryDocumentStore` if you are just giving Haystack a quick try on a small sample and are working in a restricted environment that complicates running Elasticsearch or other databases.

@@ -204,3 +206,5 @@ The Document Stores have different characteristics. You should choose one depend

**All-rounder:** Use the `ElasticsearchDocumentStore` if you want to evaluate the performance of different retrieval options (dense vs. sparse) and are aiming for a smooth transition from PoC to production.

**Vector Specialist:** Use the `FAISSDocumentStore` if you want to focus on dense retrieval and possibly deal with larger datasets.

</div>
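The restricted-environment recommendation above works because an in-memory store is little more than a Python list. A toy sketch of the idea (this is not Haystack's actual `InMemoryDocumentStore` implementation; the class name and fields here are illustrative):

```python
# Toy stand-in for an in-memory document store; illustrative only,
# not Haystack's actual implementation.
class ToyInMemoryStore:
    def __init__(self):
        self.docs = []

    def write_documents(self, documents):
        # Each document is a dict with at least a "text" field.
        self.docs.extend(documents)

    def get_all_documents(self):
        return list(self.docs)


store = ToyInMemoryStore()
store.write_documents([{"text": "Haystack is a QA framework."}])
```

Because nothing leaves the Python process, there is no database to install or run, which is exactly what makes this option attractive in locked-down environments.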
@@ -17,11 +17,18 @@ Though SQuAD is composed entirely of Wikipedia articles, these models are flexib

Before trying to adapt these models to your domain, we’d recommend trying one of the off-the-shelf models.
We’ve found that these models are often flexible enough for a wide range of use cases.

<div class="recommendation">

**Intuition**

Most people probably don’t know what an HP Valve is.
But you don’t always need to know what an HP Valve is to answer “What is connected to an HP Valve?”
The answer might be there in plain language.
In the same way, many QA models have a good enough grasp of language to answer questions about concepts in an unseen domain.

</div>

## Finetuning

Any model that can be loaded into Haystack can also be finetuned within Haystack.
@@ -35,10 +42,17 @@ reader.train(data_dir=train_data,
```

At the end of training, the finetuned model will be saved in the specified `save_dir` and can be loaded as a `Reader`.

<div class="recommendation">

**Recommendation**

See Tutorial 2 for a runnable example of this process.
If you’re interested in measuring how much your model has improved,
please also check out Tutorial 5, which walks through the steps needed to perform evaluation.

</div>

## Generating Labels

Using our [Haystack Annotate tool](https://annotate.deepset.ai/login) (Beta),
@@ -9,8 +9,14 @@ id: "generatormd"

# Generator

<div class="recommendation">

**Example**

See [Tutorial 7](/docs/latest/tutorial7md) for a guide on how to build your own generative QA system.

</div>

While extractive QA highlights the span of text that answers a query,
generative QA can return a novel text answer that it has composed.
The best current approaches, such as [Retrieval-Augmented Generation](https://arxiv.org/abs/2005.11401),
@@ -99,9 +99,13 @@ When we talk about Documents in Haystack, we are referring specifically to the i

You might want to use all the text in one file as a Document, or split it into multiple Documents.
This splitting can have a big impact on speed and performance.

<div class="recommendation">

**Tip:** If Haystack is running very slowly, you might want to try splitting your text into smaller Documents.
If you want to improve performance, you might want to try concatenating text to make larger Documents.

</div>
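The splitting described above can be pictured in a few lines. A minimal, hypothetical `split_into_documents` helper (not part of Haystack; real preprocessing would typically also respect sentence boundaries):

```python
def split_into_documents(text, words_per_doc=100):
    """Split one long text into smaller document dicts by word count.

    A simple illustration of Document splitting; the helper name and
    the dict format are assumptions for this sketch, not Haystack's API.
    """
    words = text.split()
    return [
        {"text": " ".join(words[i:i + words_per_doc])}
        for i in range(0, len(words), words_per_doc)
    ]


docs = split_into_documents("word " * 250, words_per_doc=100)
# 250 words at 100 words per Document -> 3 Documents (100, 100, 50 words)
```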

## Running Queries

**Querying** involves searching for an answer to a given question within the full document store.
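The retrieve-then-read flow behind querying can be illustrated with a deliberately naive stand-in: word-overlap ranking instead of a real Retriever and Reader. This is purely illustrative; a Haystack pipeline would use BM25 or dense retrieval plus a neural Reader.

```python
import re


def tokenize(text):
    # Lowercase and strip punctuation so "France?" matches "France."
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def naive_query(question, documents, top_k=2):
    """Rank documents by word overlap with the question (toy retriever)."""
    q_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & tokenize(d["text"])),
        reverse=True,
    )
    return ranked[:top_k]


docs = [
    {"text": "Paris is the capital of France."},
    {"text": "Berlin is the capital of Germany."},
]
hits = naive_query("What is the capital of France?", docs, top_k=1)
```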
@@ -204,6 +204,10 @@ reader = TransformersReader("ahotrod/albert_xxlargev1_squad2_512")

</div>

<div class="recommendation">

**Recommendations:**

**All-rounder**: In the class of base-sized models trained on SQuAD, **RoBERTa** has shown better performance than BERT
and can be capably handled by any machine equipped with a single NVidia V100 GPU.
We recommend this as the starting point for anyone wanting to create a performant and computationally reasonable instance of Haystack.

@@ -220,6 +224,8 @@ you might like to try ALBERT XXL which has set SoTA performance on SQuAD 2.0.

<!-- _comment: !! How good is it? How much computation resource do you need to run it? !! -->

</div>

<!-- farm-vs-trans: -->
## Deeper Dive: FARM vs Transformers
@@ -12,14 +12,16 @@ id: "retrievermd"

The Retriever is a lightweight filter that can quickly go through the full document store and pass a set of candidate documents to the Reader.
It is a tool for sifting out the obvious negative cases, saving the Reader from doing more work than it needs to and speeding up the querying process.

<div class="recommendation">

**Recommendations**

* BM25 (sparse)

* Dense Passage Retrieval (dense)

</div>
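For intuition about the sparse option, here is a minimal BM25 scorer in plain Python. It is a sketch of the ranking formula only, not Haystack's Elasticsearch-backed implementation; the `k1` and `b` defaults are conventional choices.

```python
import math


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document against the query terms with a basic BM25.

    Illustrative sketch; production systems use an inverted index
    rather than scanning every document like this.
    """
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: how many documents contain each query term.
    df = {t: sum(1 for doc in tokenized if t in doc) for t in query_terms}
    scores = []
    for doc in tokenized:
        score = 0.0
        for t in query_terms:
            tf = doc.count(t)
            if tf == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            # Term frequency saturates via k1; b normalizes for doc length.
            score += idf * tf * (k1 + 1) / (
                tf + k1 * (1 - b + b * len(doc) / avgdl)
            )
        scores.append(score)
    return scores


docs = ["haystack does question answering", "the cat sat on the mat"]
scores = bm25_scores(["question", "answering"], docs)
# The first document contains both query terms, so it scores higher.
```

Dense Passage Retrieval, by contrast, embeds queries and passages into vectors and ranks by similarity, which is why it pairs naturally with the `FAISSDocumentStore`.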

<!-- _comment: !! Example speedup from slides !! -->
<!-- _comment: !! Benchmarks !! -->
Note that not all Retrievers can be paired with every DocumentStore.