Mirror of https://github.com/deepset-ai/haystack.git (synced 2025-09-03 21:33:40 +00:00)

commit 88d0ee2c98 (parent 58bc9aa7f0)

Add boxes for recommendations (#629)

* add boxes for recommendations
* add more recommendation boxes

Co-authored-by: brandenchan <brandenchan@icloud.com>
@@ -197,6 +197,8 @@ The Document Stores have different characteristics. You should choose one depend

</div>

<div class="recommendation">

#### Our Recommendations

**Restricted environment:** Use the `InMemoryDocumentStore` if you are just giving Haystack a quick try on a small sample and are working in a restricted environment that complicates running Elasticsearch or other databases.
@@ -204,3 +206,5 @@ The Document Stores have different characteristics. You should choose one depend

**All-rounder:** Use the `ElasticSearchDocumentStore` if you want to evaluate the performance of different retrieval options (dense vs. sparse) and are aiming for a smooth transition from PoC to production.

**Vector Specialist:** Use the `FAISSDocumentStore` if you want to focus on dense retrieval and possibly deal with larger datasets.

</div>
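To make the in-memory trade-off concrete, here is a toy sketch in plain Python. `ToyInMemoryStore`, `write_documents`, and `query` are illustrative names, not Haystack's actual API: the point is that an in-memory store keeps documents in ordinary Python structures, so there is no external service to run, but nothing persists beyond the process.

```python
# Toy sketch (not Haystack's real InMemoryDocumentStore): documents live in
# a plain Python list, so no database is needed -- ideal for quick trials,
# but everything is lost when the process exits.

class ToyInMemoryStore:
    def __init__(self):
        self.documents = []

    def write_documents(self, docs):
        # each document is a dict with at least a "text" field
        self.documents.extend(docs)

    def query(self, keyword):
        # naive keyword filter standing in for a real Retriever
        return [d for d in self.documents if keyword.lower() in d["text"].lower()]

store = ToyInMemoryStore()
store.write_documents([
    {"text": "Haystack supports Elasticsearch, FAISS and in-memory stores."},
    {"text": "FAISS is optimised for dense vector similarity search."},
])
```

A real store adds persistence, filtering, and vector indexes on top of this shape; the in-memory variant trades all of that for zero setup.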
@@ -17,11 +17,18 @@ Though SQuAD is composed entirely of Wikipedia articles, these models are flexib

Before trying to adapt these models to your domain, we’d recommend trying one of the off-the-shelf models.
We’ve found that these models are often flexible enough for a wide range of use cases.

<div class="recommendation">

**Intuition**

Most people probably don’t know what an HP Valve is.
But you don’t always need to know what an HP Valve is to answer “What is connected to an HP Valve?”
The answer might be there in plain language.
In the same way, many QA models have a good enough grasp of language to answer questions about concepts in an unseen domain.

</div>

## Finetuning

Any model that can be loaded into Haystack can also be finetuned within Haystack.
@@ -35,10 +42,17 @@ reader.train(data_dir=train_data,

```

At the end of training, the finetuned model will be saved in the specified `save_dir` and can be loaded as a `Reader`.

<div class="recommendation">

**Recommendation**

See Tutorial 2 for a runnable example of this process.
If you’re interested in measuring how much your model has improved,
please also check out Tutorial 5, which walks through the steps needed to perform evaluation.

</div>
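For orientation, reader training data of the kind passed via `data_dir` typically follows the public SQuAD v2 JSON layout. The example below is hypothetical (titles, ids, and text are made up, and the exact requirements may vary by Haystack version), but the field names come from the SQuAD schema:

```python
import json

# Hypothetical minimal training file in the SQuAD v2 JSON layout
# (field names from the public SQuAD schema; the content is made up).
train_data = {
    "version": "v2.0",
    "data": [{
        "title": "HP Valves",
        "paragraphs": [{
            "context": "The HP valve is connected to the main steam line.",
            "qas": [{
                "id": "q1",
                "question": "What is connected to an HP Valve?",
                "is_impossible": False,
                "answers": [
                    # answer_start is the character offset of the answer span
                    {"text": "the main steam line", "answer_start": 29},
                ],
            }],
        }],
    }],
}

# Sanity-check that the labelled span really occurs at the stated offset
para = train_data["data"][0]["paragraphs"][0]
ans = para["qas"][0]["answers"][0]
assert para["context"][ans["answer_start"]:].startswith(ans["text"])

squad_json = json.dumps(train_data)  # serialize, then save inside your data_dir
```

Checking that each `answer_start` offset actually points at the labelled text, as above, catches the most common annotation error before training starts.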
## Generating Labels

Using our [Haystack Annotate tool](https://annotate.deepset.ai/login) (Beta),
@@ -9,8 +9,14 @@ id: "generatormd"

# Generator

<div class="recommendation">

**Example**

See [Tutorial 7](/docs/latest/tutorial7md) for a guide on how to build your own generative QA system.

</div>

While extractive QA highlights the span of text that answers a query,
generative QA can return a novel text answer that it has composed.
The best current approaches, such as [Retrieval-Augmented Generation](https://arxiv.org/abs/2005.11401),
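The retrieve-then-generate shape behind such approaches can be sketched in a few lines of toy Python. This is illustrative only: `retrieve` ranks by crude word overlap and `generate` merely templates the evidence, whereas a real generator model conditions on the retrieved passages and composes novel text.

```python
import re

# Toy retrieve-then-generate pipeline (conceptual sketch, not a real model).

def _terms(text):
    # crude tokenization: lowercase word sets
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents, top_k=2):
    # stand-in retriever: rank documents by word overlap with the query
    q = _terms(query)
    return sorted(documents, key=lambda d: len(q & _terms(d)), reverse=True)[:top_k]

def generate(query, passages):
    # stand-in generator: a real model would compose a fresh answer from this
    return f"Q: {query} | evidence: {' '.join(passages)}"

docs = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Berlin is the capital of Germany.",
]
answer = generate("What is the capital of France?",
                  retrieve("What is the capital of France?", docs))
```

The two-stage structure is the important part: the retriever narrows the corpus down to a handful of passages, and only those passages are fed to the (much more expensive) generator.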
@@ -99,9 +99,13 @@ When we talk about Documents in Haystack, we are referring specifically to the i

You might want to use all the text in one file as a Document, or split it into multiple Documents.
This splitting can have a big impact on speed and performance.

<div class="recommendation">

**Tip:** If Haystack is running very slowly, you might want to try splitting your text into smaller Documents.
If you want to improve performance, you might want to try concatenating text to make larger Documents.

</div>
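As a concrete illustration of the splitting step, here is one simple way to chunk a long text into smaller Documents by word count. `split_text` is a hypothetical helper, not Haystack's own preprocessing API; the overlap between consecutive chunks is there so an answer spanning a chunk boundary is not cut in half.

```python
# Hypothetical word-count splitter (not Haystack's own preprocessing API).

def split_text(text, words_per_doc=100, overlap=20):
    """Split `text` into chunks of at most `words_per_doc` words.

    Consecutive chunks share `overlap` words so that an answer spanning
    a chunk boundary survives in at least one chunk.
    """
    words = text.split()
    step = words_per_doc - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + words_per_doc]))
        if start + words_per_doc >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

docs = split_text("word " * 250, words_per_doc=100, overlap=20)
```

Tuning `words_per_doc` is exactly the speed/performance trade-off described above: smaller chunks are faster to read but give the model less context per Document.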
## Running Queries

**Querying** involves searching for an answer to a given question within the full document store.
@@ -204,6 +204,10 @@ reader = TransformersReader("ahotrod/albert_xxlargev1_squad2_512")

</div>

<div class="recommendation">

**Recommendations:**

**All-rounder**: In the class of base-sized models trained on SQuAD, **RoBERTa** has shown better performance than BERT
and can be capably handled by any machine equipped with a single NVidia V100 GPU.
We recommend this as the starting point for anyone wanting to create a performant and computationally reasonable instance of Haystack.
@@ -220,6 +224,8 @@ you might like to try ALBERT XXL which has set SoTA performance on SQuAD 2.0.

<!-- _comment: !! How good is it? How much computation resource do you need to run it? !! -->

</div>

<!-- farm-vs-trans: -->
## Deeper Dive: FARM vs Transformers
@@ -12,14 +12,16 @@ id: "retrievermd"

The Retriever is a lightweight filter that can quickly go through the full document store and pass a set of candidate documents to the Reader.
It is a tool for sifting out the obvious negative cases, saving the Reader from doing more work than it needs to and speeding up the querying process.

<div class="recommendation">

**Recommendations**

* BM25 (sparse)
* Dense Passage Retrieval (dense)

</div>
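For intuition about the sparse option, here is a compact, self-contained sketch of BM25 scoring. It is illustrative only: Elasticsearch and other production stores implement tuned variants of this formula, and `bm25_scores` is not a Haystack function.

```python
import math
from collections import Counter

# Compact BM25 sketch (illustrative; production stores use tuned variants).

def bm25_scores(query, docs, k1=1.5, b=0.75):
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs  # average document length
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in tokenized if term in d)  # document frequency
            if df == 0:
                continue  # term appears nowhere in the corpus
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            freq = tf[term]
            # term-frequency saturation (k1) and length normalization (b)
            score += idf * freq * (k1 + 1) / (
                freq + k1 * (1 - b + b * len(doc) / avgdl)
            )
        scores.append(score)
    return scores

docs = [
    "haystack builds search pipelines",
    "bm25 is a sparse retrieval method",
    "dense retrieval uses embeddings",
]
scores = bm25_scores("sparse retrieval", docs)
```

Because BM25 only needs term counts, it runs over an inverted index without any neural network, which is what makes sparse retrieval so cheap compared to dense methods like Dense Passage Retrieval.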
<!-- _comment: !! Example speedup from slides !! -->
<!-- _comment: !! Benchmarks !! -->

Note that not all Retrievers can be paired with every DocumentStore.