kolk
f2b6cc761b
Refactor DPR from FB to Transformers codebase ( #308 )
* change_HFBertEncoder to transformers DPREncoder
* Removed BertTensorizer
* model download relative path
* Refactor model load
* Tutorial5 DPR updated
* fix print_eval_results typo
* copy transformers DPR modules in dpr_utils and test
* transformer v3.0.2 import errors fixed
* remove dependency of DPRConfig on attribute use_return_tuple
* Adjust transformers 302 locally to work with dpr
* projection layer removed from DPR encoders
* fixed mypy errors
* transformers DPR compatible code added
* transformers DPR compatibility added
* bug fix in tutorial 6 notebook
* Docstring update and variable naming issues fix
* tutorial modified to reflect DPR variable naming change
* title addition to passage use-cases handled
* modified handling untitled batch
* resolved mypy errors
* typos in docstrings and comments fixed
* cleaned DPR code and added new test cases
* warnings added for non-bert model [SEP] token removal
* changed warning to logger warning
* title mask creation refactored
* bug fix on cuda issues
* tutorial 6 instantiates modified DPR
* tutorial 5 modified
* tutorial 5 ipython notebook modified: DPR instantiation
* batch_size added to DPR instantiation
* tutorial 5 jupyter notebook typos fixed
* improved docstrings, fixed typos
* Update docstring
Co-authored-by: Timo Moeller <timo.moeller@deepset.ai>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-08-25 20:16:00 +05:30
Branden Chan
a54d6a5bd7
Make Tutorials Work on Colab GPUs ( #322 )
* Add pip install torch+cu
2020-08-19 14:52:50 +02:00
bogdankostic
72b1013560
Restructure update embeddings ( #304 )
* Restructure update embeddings
* Adapt FAISSDocStore
* Adapt test and tutorial
Co-authored-by: Timo Moeller <timo.moeller@deepset.ai>
2020-08-18 14:04:31 +02:00
brandenchan
8a3eca05c3
Change to retriever eval top_k to match notebook
2020-08-18 11:39:49 +02:00
Tanay Soni
200bb4bafd
Refactor the DPR tutorial to use FAISS ( #317 )
2020-08-17 13:30:02 +02:00
Timo Moeller
72e6867278
Aggregate label objects for same questions ( #292 )
* Add aggregate labels obj, use in retriever eval function
* Change launch ES param
* Move aggregation from ES document store to base class
* Fix type annotations
2020-08-07 11:24:41 +02:00
Malte Pietsch
29a15c0d59
Add eval for Dense Passage Retriever & Refactor handling of labels/feedback ( #243 )
2020-07-31 11:34:06 +02:00
Malte Pietsch
5b1be233d0
Update Tutorial 4
2020-07-17 19:31:00 +02:00
Malte Pietsch
6bed2f509f
Refactor DPR for latest transformers version & change init arg gpu -> use_gpu for DPR and EmbeddingRetriever ( #239 )
* fix tokenizer warning in latest transformers
* change dpr arg from gpu to use_gpu
* change gpu arg for EmbeddingRetriever
2020-07-16 10:45:01 +02:00
Malte Pietsch
c9d3146fae
Fix multi-gpu training via DataParallel ( #234 )
2020-07-15 18:34:55 +02:00
Branden Chan
36867dabac
change from top_n_recall to accuracy
2020-07-15 17:05:08 +02:00
Branden Chan
64721d3196
One more update
2020-07-15 16:24:10 +02:00
Branden Chan
c55477e0ce
update eval dataset
2020-07-15 16:14:52 +02:00
Malte Pietsch
99a6a34047
Upgrade to new FARM / Transformers / PyTorch versions ( #212 )
2020-07-14 18:53:15 +02:00
Tanay Soni
4c21556a79
Fix embedding method for Retriever ( #220 )
2020-07-13 12:38:01 +02:00
Malte Pietsch
fe33a481ad
Update tutorials ( #200 )
* fix link in readme. update installation in tutorials
* update haystack version to latest master
* add basic documentation for input to write_documents()
* add docstring for sqldocumentstore
* comment out docker in notebook
2020-07-07 14:59:01 +02:00
Malte Pietsch
c36f8c991e
Update Tutorial 6
2020-07-03 16:06:46 +02:00
Malte Pietsch
8a9f97fad3
Tutorial for Dense Passage Retriever ( #186 )
2020-07-03 15:53:58 +02:00
Malte Pietsch
07ecfb60b9
Dense Passage Retriever (Inference) ( #167 )
2020-06-30 19:05:45 +02:00
Timo Moeller
c53aaddb78
Fix document id missing in farm inference output ( #174 )
2020-06-26 11:01:10 +02:00
Tanay Soni
44f89c94ab
Upgrade FARM version ( #172 )
2020-06-24 15:14:09 +02:00
Yaser Martinez Palenzuela
97bbb4280c
Correct field in evaluation tutorial ( #139 )
2020-06-08 16:38:09 +02:00
Tanay Soni
71e15a5a11
Update Haystack version in tutorials ( #136 )
2020-06-08 11:31:12 +02:00
Tanay Soni
ef9e4f4467
Add PDF text extraction ( #109 )
2020-06-08 11:07:19 +02:00
bogdankostic
479fcb1ace
Fix evaluation ( #132 )
* Fix bugs in Tutorial 5
* Adapt tutorials to new metrics
2020-06-05 18:33:50 +02:00
bogdankostic
bbfccf5cf6
Add Evaluation of Reader, Retriever and Finder ( #92 )
2020-05-29 15:57:07 +02:00
Branden Chan
5c68a5d755
Move save_dir from FARMReader() to reader.train()
2020-05-26 12:14:35 +02:00
Branden Chan
cbe62044b1
Update colab link
2020-05-26 11:56:24 +02:00
Malte Pietsch
c468200a19
Split docs into passages in Tutorial
2020-05-21 13:01:48 +02:00
Malte Pietsch
d5443b36ec
Split docs into passages in Tutorial
2020-05-21 13:01:04 +02:00
Malte Pietsch
a431a94b04
Add basic tutorial for FAQ-based QA & batch comp. of embeddings ( #98 )
* Add basic tutorial for FAQ-based QA and switch to batch computation of embeddings
* update readme & haystack version in tutorial
2020-05-07 10:19:26 +02:00
Malte Pietsch
f58f58fc86
Make saving more explicit in tutorial 2 ( #95 )
2020-05-06 12:13:49 +02:00
Malte Pietsch
d595886630
split docs into passages in tutorials
2020-04-30 19:27:15 +02:00
Malte Pietsch
7b01fb3fbc
Merge branch 'master' of github.com:deepset-ai/haystack
2020-04-30 19:03:44 +02:00
Malte Pietsch
7972038afc
update tutorials
2020-04-30 19:00:41 +02:00
Malte Pietsch
438543a18a
pin haystack version in tutorials until release ( #87 )
2020-04-30 18:44:44 +02:00
Tanay Soni
887bdcc376
Update tutorials to use Elasticsearch, new Retrievers ( #79 )
2020-04-29 14:01:05 +02:00
Branden Chan
420e11695b
Remove use_gpu param
2020-03-24 17:47:00 +01:00
bogdankostic
0048ee9c5c
Added Jupyter notebooks of Tutorials ( #43 )
Add Jupyter and Colab notebooks of tutorials
2020-03-17 19:58:53 +01:00
timoeller
f681026a56
Simplify no ans handling, disable no ans + sorting in private function
2020-02-24 16:15:06 +01:00
timoeller
ef9b99c3cc
Add no answer handling and sort no answer into positive predictions
2020-02-21 18:27:53 +01:00
timoeller
840b368732
Add no ans example
2020-02-19 14:51:12 +01:00
timoeller
c6d9da8827
Add doc for no answer boosting
2020-02-19 13:02:51 +01:00
timoeller
dc9188361c
Add ranking of no ans relative to positive answers
2020-02-19 12:57:35 +01:00
Malte Pietsch
d33ef9c345
Add minimal example for ES ( #19 )
2020-02-10 18:10:18 +01:00
Tanay Soni
f83a164095
Add Elasticsearch Document Store ( #13 )
2020-01-24 18:24:07 +01:00
Tanay Soni
c52266e520
Update tutorials ( #12 )
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-01-23 15:18:41 +01:00
Malte Pietsch
1718ea55b8
Add method to train a reader on custom data ( #5 )
* initial version of training a reader WIP
* update for latest changes in FARM inferencer. Update tutorial. Add basic docs
2020-01-23 14:49:17 +01:00
Malte Pietsch
cab0932fab
Refactor pipeline for better generalizability & Add TransformersReader ( #1 )
* add flag to skip writing docs to non-empty db
* change finder pipeline structure for better generalizability
* add basic TransformersReader
* update tutorials and requirements
2020-01-13 18:56:22 +01:00
Tanay Soni
6bc228fa6a
Fetch QA model from remote in tutorial notebook
2019-11-28 12:07:04 +01:00