Tanay Soni
4c2804e38e
Add support for aggregating scores in JoinDocuments node ( #683 )
2020-12-16 15:54:58 +01:00
demSd
143da4cb3f
Fix a typo in DPR args, num_negatives -> num_positives ( #681 )
...
* fix a typo, num_negatives -> num_positives
* default value for num_positives
* Update dense.py
2020-12-15 10:10:41 +01:00
Tanay Soni
369e237fd4
Add DocumentStore for Open Distro Elasticsearch ( #676 )
2020-12-15 09:28:40 +01:00
Tanay Soni
33fe597949
Cleanup Pytest Fixtures ( #639 )
2020-12-14 18:15:44 +01:00
Branden Chan
d8154939fc
Scale dot product into probabilities ( #667 )
...
* scale dot product
* Add tip in documentation
* Add recommendation boxes
* WIP: Use similarity attribute in all doc stores
* Implement similarity for InMemoryDS
* Add FAISS support
* Clean printout
* Update documentation
* Implement document field map
2020-12-11 12:10:24 +01:00
demSd
a0e146dde6
add gpu support for rag ( #669 )
...
* add gpu support for rag
* Update transformers.py
2020-12-11 12:08:01 +01:00
Malte Pietsch
149d98a0fd
Add latest benchmark run ( #652 )
...
* add latest benchmark run
* update templates and fix small json errors
* Change scale
Co-authored-by: brandenchan <brandenchan@icloud.com>
2020-12-10 16:25:51 +01:00
Timo Moeller
efc754b166
Redone: Fix concatenation of sentences in PreProcessor. Add stride for word-based splits with sentence boundaries ( #641 )
...
* Update preprocessor.py
Concatenation of sentences done correctly. Stride functionality enabled for splitting by words while respecting sentence boundaries.
* Simplify code, add test
Co-authored-by: Krak91 <45461739+Krak91@users.noreply.github.com>
2020-12-09 16:12:36 +01:00
Branden Chan
8c904d79d6
Fix links ( #663 )
2020-12-08 10:28:31 +01:00
Tanay Soni
c4a5de59aa
Add set_node() for Pipeline ( #659 )
2020-12-07 19:16:35 +01:00
Tanay Soni
4152ad8426
Enable dynamic parameter updates for the FARMReader ( #650 )
2020-12-07 14:07:20 +01:00
Malte Pietsch
e6ada08d0e
Update query arg in Tutorial 7 ( #656 )
2020-12-04 08:42:09 +01:00
Tanay Soni
8e52b48e1d
Add pipelines for GenerativeQA & FAQs ( #645 )
2020-12-03 10:27:06 +01:00
Malte Pietsch
216787ed34
Fix benchmarks ( #648 )
...
* disable fasttokenizer, increase ES timeout for delete requests
* add session.close()
* fix deletion of docs
2020-12-02 16:59:42 +01:00
Branden Chan
79555148ac
Add link to FAISS Info in documentation ( #643 )
...
* Add link to FAISS info
* Clean link
2020-12-02 15:24:22 +01:00
brandenchan
cdd009d1ef
Better payload example spacing
2020-12-01 13:07:29 +01:00
Branden Chan
e573c9e27d
Improve User Feedback Documentation ( #539 )
...
* Extend docs
* Add User Feedback API calls
* Incorporate reviewer feedback
2020-12-01 12:55:31 +01:00
Malte Pietsch
a9107d29eb
Refactor DensePassageRetriever._get_predictions ( #642 )
2020-12-01 09:22:15 +01:00
Tanay Soni
5e62e54875
Rename question parameter to query ( #614 )
2020-11-30 17:50:04 +01:00
Branden Chan
5e5dba9587
Add api md ( #631 )
2020-11-27 17:26:53 +01:00
Branden Chan
9fbd845ef3
Clean API docs and increase coverage ( #621 )
...
* Fix docstrings
* Fix docstrings
* docstrings for retrievers and docstores
* Clean and add more docstrings
2020-11-27 17:17:58 +01:00
Tanay Soni
fa55de2fab
Add refresh_type param for Elasticsearch update_embeddings() ( #630 )
2020-11-27 16:10:04 +01:00
brandenchan
ce6cba227f
Fix website typo
2020-11-27 16:07:29 +01:00
Markus Paff
88d0ee2c98
Add boxes for recommendations ( #629 )
...
* add boxes for recommendations
* add more recommendation boxes
Co-authored-by: brandenchan <brandenchan@icloud.com>
2020-11-27 16:00:20 +01:00
Malte Pietsch
58bc9aa7f0
Add contributor hall of fame ( #628 )
2020-11-26 14:52:20 +01:00
Ky-Anh Huynh
0edd127f35
Add formatting checks for shell scripts ( #627 )
2020-11-26 14:36:35 +01:00
Ky-Anh Huynh
4bd4a61e65
README: Fix link to roadmap ( #626 )
...
Co-authored-by: Ky-Anh Huynh <kyanh.huynh@viettug.org>
2020-11-26 14:01:05 +01:00
Tanay Soni
ea976ba5b5
Add return_embedding parameter for get_all_documents() ( #615 )
2020-11-26 10:32:30 +01:00
Branden Chan
09690b84b4
Move DPR embeddings from GPU to CPU straight away ( #618 )
...
* Start
* Move embeddings from gpu to cpu
2020-11-25 14:22:43 +01:00
Branden Chan
ae530c3a41
Fix docstring examples ( #604 )
...
* Fix docstring examples
* Unify code example format
* Add md files
2020-11-25 14:19:49 +01:00
Markus Paff
3dee284f20
cleaning the api docs ( #616 )
2020-11-24 18:49:14 +01:00
Branden Chan
e192387e65
Fix link ( #613 )
2020-11-24 11:11:20 +01:00
Tanay Soni
e3a68aedaf
Add support for building custom Search Pipelines ( #596 )
2020-11-20 17:41:08 +01:00
Guillim
65cf9547d2
Allow setting return_no_answers for TransformersReader in REST API (SQuAD 1.0 format) ( #609 )
...
* Update config.py
* new option
Allow a new option in the settings: specify whether a reader model can return a "no answer" like SQuAD 2.0 models, or whether it is a SQuAD 1.0-like model that always gives an answer.
2020-11-20 14:09:39 +01:00
Branden Chan
1e8af84ecc
Make more changes to documentation ( #578 )
...
* First batch of changes
* Add RAG tutorial links
* Prettify RAG tutorial
* draft of generator doc
* Add text
* Complete generator page
* Create optimization section
* Split intro
* Fix formatting tutorial 7
2020-11-19 14:58:27 +01:00
Branden Chan
2aa3c071fd
Remove column in benchmark website ( #608 )
...
* Make benchmarks clearer
* remove column
2020-11-19 12:18:47 +01:00
Branden Chan
827a40b12a
Make benchmarks clearer ( #606 )
2020-11-19 10:31:43 +01:00
Malte Pietsch
0acafc403a
Automate benchmarks via CML ( #518 )
...
* initial test cml
* Update cml.yaml
* WIP test workflow
* switch to general ubuntu ami
* switch to general ubuntu ami
* disable gpu for tests
* rm gpu infos
* rm gpu infos
* update token env
* switch github token
* add postgres
* test db connection
* fix typo
* remove tty
* add sleep for db
* debug runner
* debug removal postgres
* debug: reset to working commit
* debug: change github token
* switch to new bot token
* debug token
* add back postgres
* adjust network runner docker
* add elastic
* fix typo
* adjust working dir
* fix benchmark execution
* enable s3 downloads
* add query benchmark. fix path
* add saving of markdown files
* cat md files. add faiss+dpr. increase n_queries
* switch to GPU instance
* switch availability zone
* switch to public aws DL ami
* increase volume size
* rm faiss. fix error logging
* save markdown files
* add reader benchmarks
* add download of squad data
* correct reader metric normalization
* fix newlines between reports
* fix max_docs for reader eval data. remove max_docs from ci run config
* fix mypy. switch workflow trigger
* try trigger for label
* try trigger for label
* change trigger syntax
* debug machine shutdown with test workflow
* add es and postgres to test workflow
* Revert "add es and postgres to test workflow"
This reverts commit 6f038d3d7f12eea924b54529e61b192858eaa9d5.
* Revert "debug machine shutdown with test workflow"
This reverts commit db70eabae8850b88e1d61fd79b04d4f49d54990a.
* fix typo in action. set benchmark config back to original
2020-11-18 18:28:17 +01:00
Lalit Pagaria
3f81c93f36
Add document update for SQL and FAISS Document Store ( #584 )
2020-11-16 16:08:13 +01:00
Tanay Soni
3e095ddd7d
Add filters for delete_all_documents() ( #591 )
2020-11-16 14:15:32 +01:00
Lalit Pagaria
b511f9903e
[RAG] Fix top_k generator issue ( #590 )
...
* Removing device information from generator model arguments, as the model handles it itself.
* num_return_sequences should not be greater than num_beams
* Raise an error when the user uses the generator with a GPU, as this is currently not supported
2020-11-16 09:41:30 +01:00
Lalit Pagaria
23f1058b90
Fixing defaults in config for rest_api ( #583 )
...
* Fixing default configs for rest_api
* Reverting change to VALID_LANGUAGES
* Casting EMBEDDING_DIM as int
2020-11-16 06:51:27 +01:00
bogdankostic
b3f7115f71
Add MAP retriever metric for open-domain case ( #572 )
...
* Add MAP metric for closed-domain case
* Add MAP metric for open-domain case
* Adapt MAP for closed-domain setting + add docstring
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-11-13 15:04:25 +01:00
Timo Moeller
f118e4b738
Add needed whitespace before sentence start ( #582 )
2020-11-13 14:14:24 +01:00
Branden Chan
44230fca45
Fix CI bug due to new Elasticsearch release and new model release ( #579 )
...
* Cast generator to list
* Restrict ES version range
* Loosen ES requirement
* Change no_answer_test value
2020-11-13 10:35:53 +01:00
brandenchan
090a8cf3e9
Revert "First batch of changes"
...
This reverts commit c07182aa0ab77106cdb142f4ca43ff02476e6fbf.
2020-11-12 12:27:16 +01:00
brandenchan
c07182aa0a
First batch of changes
2020-11-12 12:07:02 +01:00
Branden Chan
e72f4f4299
Update Colab Torch Version ( #576 )
...
* Update torch version
* Update torch version
2020-11-11 13:55:10 +01:00
Tanay Soni
acd088808b
Allow list of filter values in REST API ( #568 )
2020-11-09 20:41:53 +01:00
Malte Pietsch
2b352d6ac4
Update concept image
2020-11-07 08:44:09 +01:00