1461 Commits

Author SHA1 Message Date
bogdankostic
9c409e0012 Remove StreamingDataSilo and fix mypy errors from FARM (#1426)
* Add AdaptiveModel

* Add BiAdaptiveModel

* Add DataSilo

* Remove StreamingDataSilo

* Fix mypy errors
2021-09-09 10:12:35 +02:00
dependabot[bot]
a92f1860f6
Bump pillow from 8.2.0 to 8.3.2 (#1423)
Bumps [pillow](https://github.com/python-pillow/Pillow) from 8.2.0 to 8.3.2.
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst)
- [Commits](https://github.com/python-pillow/Pillow/compare/8.2.0...8.3.2)

---
updated-dependencies:
- dependency-name: pillow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-08 17:51:18 +02:00
Timo Moeller
b4fd08a296
Add testdata, add tests for qa processor, add dpr tests (some failing) 2021-09-08 12:02:08 +02:00
Timo Moeller
024b9e0bf8 Merge previous solutions: fix imports, add needed helper functions or remove unused ones 2021-09-08 11:51:41 +02:00
Timo Moeller
a945b43a57
Farm merging base bogdan (#1424)
* Add AdaptiveModel

* Add BiAdaptiveModel

* Add DataSilo

Co-authored-by: Bogdan Kostić <bogdankostic@web.de>
2021-09-08 10:38:28 +02:00
Timo Moeller
c5999c3c8f Add LM and tokenization 2021-09-07 13:37:36 +02:00
Julian Risch
55a8031aeb
Adding prediction head, trainer, evaluator from FARM (#1419) 2021-09-07 13:33:17 +02:00
Timo Moeller
5bc5665c0b Add processor and processing related scripts 2021-09-07 12:33:33 +02:00
Bob van Luijt
c0cc8bc80f
Bump Weaviate version to 1.7.0 (#1412)
* Bump Weaviate

* Bump Weaviate

* Bump Weaviate client

* Bump Weaviate

* Revert client version

There is a change in the client API that needs to be addressed before bumping its version
2021-09-05 09:28:55 +02:00
Malte Pietsch
f3e7074c13
Remove stale bot 2021-09-03 17:39:24 +02:00
Malte Pietsch
f3d1df1664
Enable docker-compose for GPUs & Add public UI image (#1406)
* add docker-compose-gpu file

* Update README.md

* Update docker-compose.yml

* Update docker-compose-gpu.yml

* Update docker-compose.yml

* Update docker-compose-gpu.yml
2021-09-02 17:39:21 +02:00
Malte Pietsch
bb9ec90d3c
Fix tesseract installation in Dockerfile (#1405)
* Fix Dockerfile

* Update Dockerfile-GPU
2021-09-02 11:09:30 +02:00
bogdankostic
38128c6734
Ensure num_hard_negatives is 0 when embedding passages (#1402) 2021-09-02 10:46:02 +02:00
Julian Risch
b552bf9b4d
Add sentence-transformers as mandatory dependency and remove from dev… (#1387)
* Add sentence-transformers as mandatory dependency and remove from dev dependency

* Pin sentence-transformers version
2021-09-02 09:54:13 +02:00
Branden Chan
980d88a0f2
Update faq model (#1401) 2021-09-01 18:39:06 +02:00
Malte Pietsch
e4c3c3d423
Fix CI (introduced by OCR PR #1349) (#1399)
* satisfy mypy

* add import
2021-09-01 17:16:05 +02:00
Malte Pietsch
6093bf9ff6
Fix Github action 2021-09-01 16:50:29 +02:00
Shahrukh Khan
4822536886
Add ImageToTextConverter and PDFToTextOCRConverter that utilize OCR (#1349)
* add image.py converter

* add PDFtoImageConverter

* add init to PDFtoImageConverter and classes to __init__

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* revert change in base.py in file_conv

* Update base.py

* Update pdf.py

* add ocr file_converter testcase & update dockerfile

* fix tesseract exception message typo

* fix _image_to_text docstring

* add tesseract installation to CI

* add tesseract installation to CI

* add content test for PDF OCR converter

* update PDFToTextOCRConverter constructor docstring

* replace image files with tmp paths for image.py convert

* replace image files with tmp paths for image.py convert

* Update README.md

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-09-01 16:42:25 +02:00
oryx1729
1d2252e96d
docker-compose always pull REST API Image (#1385) 2021-09-01 16:28:25 +02:00
Ikram Ali
3fc7f3f695
[docs] crawler api docs updated. (#1388) 2021-09-01 12:07:32 +02:00
Branden Chan
4021eb838e
Add weaviate to init (#1379) 2021-08-31 15:23:06 +02:00
Branden Chan
1938fb001b
Add support for no Docker envs in Tutorial 13 (#1365)
* Add support for no docker envs e.g. colab

* Generate md
2021-08-31 15:22:51 +02:00
oryx1729
a71180a2ca
Refactor replicas config for Ray Pipelines (#1378) 2021-08-31 10:14:55 +02:00
Ikram Ali
da5ed43734
Catch Elastic's search_phase_execution and raise with descriptive message. (#1371)
* [document_store] Catch Elastic's search_phase_execution_exception (dense retrieval if not all documents have an embedding) closes #1135

* change error msg

* remove unused import

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-08-30 19:38:07 +02:00
Jeff Hammerbacher
1c8a03aaa2
Rag tutorial fixes (#1375)
* Update Tutorial7_RAG_Generator.ipynb

`delete_all_documents` --> `delete_documents` (cf. #1045)

* Update Tutorial7_RAG_Generator.py

`delete_all_documents` --> `delete_documents` (cf. #1045)
2021-08-30 15:27:18 +02:00
cambiumproject
4ca97dd5be
Fix behavior of delete_documents() with filters for Milvus (#1354)
* Fix behavior of delete_documents()

Delete filtered set of vectors rather than the whole collection

* Update milvus.py

* Update milvus.py
2021-08-30 15:22:53 +02:00
ramgarg102
51f0a56e5d
delete_all_documents() replaced by delete_documents() (#1377)
* [UPDT] delete_all_documents() replaced by delete_documents()

* [UPDT] warning logs to be fixed

* [UPDT] delete_all_documents() renamed and the same method added

Co-authored-by: Ram Garg <ramgarg102@gmai.com>
2021-08-30 15:18:28 +02:00
Markus Paff
be8d305190
Editing docs read.me for new docs website workflow (#1372)
* editing docs read.me for new docs website workflow

* added new links to docs
2021-08-30 14:59:40 +02:00
Shahrukh Khan
c3d8aa0643
Add query classifier usage docs (#1348)
* Create query_classifier.md

* Update query_classifier.md

* Update query_classifier.md

* Update query_classifier.md

* Update query_classifier.md

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-08-24 15:56:11 +02:00
Ikram Ali
ead96730d3
Add Crawler support for indexing pipeline (#1360) 2021-08-24 14:25:22 +02:00
Markus Paff
cac15310bd
adding tutorial 13 and 14 (#1364) 2021-08-23 11:37:06 +02:00
Malte Pietsch
2a226daac4
Add simple docs2answer node to allow FAQ style QA / Doc search in API (#1361)
* minimal docs2answer node

* enable logs again
2021-08-20 17:01:55 +02:00
Markus Paff
ff2049cd45
updated tutorials (#1359) 2021-08-19 21:16:56 +02:00
annagruendler
a3c746abf5
Update test documentation in readme (#1355) 2021-08-19 10:36:21 +02:00
Ikram Ali
ef27f0d386
Add tests for Crawler (#1339) 2021-08-18 14:05:44 +02:00
Branden Chan
a023f0a32a
Support OpenDistro init (#1334)
* Support OpenDistro init

* Fix docstring
2021-08-17 12:07:36 +02:00
Julian Risch
eb990c9688
Removing probability field from answers in favor of score field (#1340)
* Removing probability field from reader and from test cases

* Add switch to FARMReader to choose score/probability

* Remove probability field from doc returned by doc store

* Relax assertion testing joined es and dpr predictions

* Use switch for confidence scores also for no_answer

* Add test that checks switching to old answer scores > 10

* Normalize score in elastic doc store and reset reader.md

* Scale weights of JoinDocuments to sum to 1 and adapt test case
2021-08-17 10:27:11 +02:00
Julian Risch
e7b3e2764c
Add link to arxiv paper on SAS (#1344) 2021-08-16 10:47:27 +02:00
Tanay Pant
79df82aec6
Remove empty bullet points (#1342) 2021-08-12 20:09:18 +02:00
Timo Moeller
07bd3c50ea
Add new QA eval metric: Semantic Answer Similarity (SAS) (#1338)
* init

* Add type annotation

* Add test case, fix mypy

* Add german model to docstring

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-08-12 14:31:48 +02:00
Bob van Luijt
ba071cc052
Bump Weaviate version (#1336) 2021-08-12 09:54:09 +02:00
Markus Paff
7569ab97dd
Add faq annotation (#1333)
* add annotation faq to read.me

* design fix

* add faq to docs page

* changed format
2021-08-10 14:55:31 +02:00
Malte Pietsch
be9d19afa5
Remove Finder from tutorials (#1329) 2021-08-10 11:50:59 +02:00
Ikram Ali
d94674c5b6
Remove finder class from tutorial 1 (#1328) 2021-08-10 11:41:07 +02:00
Malte Pietsch
5e16ec4d76
Fix installation in Colab Tutorial 11 2021-08-10 08:50:04 +02:00
Malte Pietsch
a0921f0c35
Remove Finder (#1326)
* deprecate finder

* remove import

* add doc section for moving from finder to pipelines
2021-08-09 13:41:40 +02:00
Bishal gaire
4198dc6feb
Update docstring for RAG (#1149)
* Update 7.md

Initialize retriever in RAG generator

* update docstring

* Update 7.md

* Update 7.md

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-08-09 11:52:45 +02:00
Malte Pietsch
66b10a508b
Update TOC of readme 2021-08-09 11:40:20 +02:00
Malte Pietsch
fb4d6e0381
Update README.md 2021-08-09 11:25:47 +02:00
Malte Pietsch
5a3ea5843f
Fix Tutorial Links 2021-08-09 11:22:19 +02:00