1461 Commits

Author SHA1 Message Date
Agnieszka Marzec
425da1fd31
Fix load_from_yaml example in the Pipelines tutorial (#2774)
* Fix load from yaml example and image

* Update Documentation & Code Style

* Fixed pipeline example

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-08 11:22:11 +02:00
James Briggs
ea40387b97
added mock pinecone client (#2770) 2022-07-07 19:51:30 +02:00
tstadel
d21b066fc7
fix pipeline run loop on joined pipelines without debug flag (#2777)
* fix pipeline run loop on joined pipelines without debug flag

* use .keys() consistently
2022-07-07 16:47:59 +02:00
bogdankostic
195aed942f
Add update_document_meta to InMemoryDocumentStore (#2689)
* Add update_document_meta to InMemoryDocumentStore

* Fix typo

* Update Documentation & Code Style

* Add update_document_meta to BaseDocumentStore

* Update Documentation & Code Style

* Fix mypy

* Update Documentation & Code Style

* Add update_document_meta to MockDocumentStore

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-07 15:44:07 +02:00
tstadel
45136badfe
Fix _debug info getting lost for previous nodes when using join nodes (#2776)
* fix debug output for pipelines with join nodes

* add test

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-07 15:10:13 +02:00
Vladimir Blagojevic
a766b70a8f
Tutorial 18: Open in Colab doesn't work in Firefox (#2767)
* Tutorial 18: Open in Colab doesn't work in Firefox

* Tutorial 18: Open in Colab doesn't work in Firefox v2

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-06 10:51:09 -04:00
Tuana Celik
917afb1530
Trying out some smaller images for docs (#2772) 2022-07-06 16:11:23 +02:00
tstadel
e9219f4dc2
Fix confusing elasticsearch exception (#2763)
* convert confusing exception to warning and add no docs case.

* blacken

* fix test
2022-07-06 15:40:51 +02:00
Vladimir Blagojevic
a2905d05f7
Bump version to next release candidate (#2765)
* Bump version to next release candidate

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-06 11:26:42 +02:00
Vladimir Blagojevic
c80336c424
Upgrade to v1.6.0 and copy docs folder (#2764)
* Upgrade to v1.6.0 and copy docs folder

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
v1.6.0
2022-07-06 10:25:15 +02:00
tstadel
2a7c0139f5
double max heap size for elasticsearch in CI (#2756) 2022-07-05 13:53:32 +02:00
bogdankostic
353da8b1c1
Add Tutorials 16, 17 and 18 to README (#2758) 2022-07-05 12:04:58 +02:00
Julian Risch
f70f4e90fd
correct docstring parameter name (#2757) 2022-07-05 12:00:40 +02:00
Patrick Deutschmann
1db3fd0942
Add support for Multi-Hop Dense Retrieval (#2571)
* Implement MDR

* Adapt conftest to new MDR signature

* Update Documentation & Code Style

* Change signature of queries param in batch methods of MDR like in #2575

* Update Documentation & Code Style

* Rename MultihopDenseRetriever to MultihopEmbeddingRetriever

* Fix filters in retrieve_batch

* Add docstring for MultihopEmbeddingRetriever.__init__

* Update Documentation & Code Style

* Revert forward signature of TextSimilarityHead

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-05 11:31:11 +02:00
bogdankostic
dc48c444d4
Fix loading of tokenizers in DPR (#2755) 2022-07-04 18:18:14 +02:00
Tuana Celik
2a8b129bae
first version of save_to_remote for HF from FarmReader (#2618)
* first version of save_to_remote for HF from FarmReader

* Update Documentation & Code Style

* Changes based on comments

* Update Documentation & Code Style

* imports order

* making small changes to pydoc

* indent fix

* Update Documentation & Code Style

* keyword arguments instead of positional

* Changing to repo_id

huggingface-hub package would have to be v0.5 or higher - checking how to handle with Thomas

* Update Documentation & Code Style

* adding huggingface-hub dependency 0.5 or above

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
2022-07-04 15:39:56 +02:00
Julian Risch
f7d00476f9
Reduce logging messages and simplify logging (#2682)
* change log levels to debug and use torch.div

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-04 14:02:55 +02:00
tstadel
322d964679
Remove rapidfuzz version pin (#2730)
* remove rapidfuzz version pin

* exclude malicious version 2.0.14

* update rapidfuzz version restrictions
2022-07-04 13:53:39 +02:00
Francesco Castelli
31dcd55c24
Validate max_seq_length in SquadProcessor (#2740)
* added max_len_seq validation in SquadProcessor

* fixed string formatting

* added tests for invalid max_seq_len

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-04 13:35:45 +02:00
Vladimir Blagojevic
ffb7e4e4bd
GPL tutorial - add GPU header and open in colab button (#2736)
* GPL tutorial - add GPU header and open in colab button

* Add GPL tutorial to run exclusion list
2022-07-04 05:23:39 -04:00
Julian Risch
1c1faa4742
Make check of document & embedding count optional in FAISS and Pinecone (#2677)
* make validation optional & add method call in pinecone init

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-04 10:12:31 +02:00
Julian Risch
1781e88802
Upgrade torch to 1.12 (#2741)
* Upgrade torch to 1.12

* upgrade torch-scatter

* add explicit torch-scatter installation

* set torch dependency to range >1.9,<1.13
2022-07-01 20:23:32 +02:00
Daniel Augustus Bichuetti Silva
e3b2ee956a
Improved crawler support for dynamically loaded pages (#2710)
* Improved crawler support for dynamically loaded pages

* Reduced scope of StaleElementReferenceException and removed deprecated code from WebDriver initialization

* Improvements on crawler testing code

* Code format and style applied on f028331948c170448613e86dfdfa222f7c2043fd

* Update Documentation & Code Style

* Remove unused imports/parameters

Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-01 10:47:33 +02:00
Massimiliano Pippi
1e01cd0efb
pin es client to include bugfixes (#2735) 2022-06-27 15:13:34 +02:00
mathislucka
8d65bc5f9b
Update document scores based on ranker node (#2048)
* ranker should return scores for later usage

* fix wrong tuple order

* adjust ranker scores; add tests

* Update Documentation & Code Style

* fix mypy

* Update Documentation & Code Style

* fix mypy

* Update Documentation & Code Style

* relax ranker test tolerance

* update ranker test score

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2022-06-27 12:17:18 +02:00
Julian Risch
46c9c8c562
Upgrade transformers to 4.20.1 (#2702)
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2022-06-27 11:56:58 +02:00
Vladimir Blagojevic
b08c5f81d1
Add GPL adaptation tutorial (#2632)
* Add GPL adaptation tutorial

* Latest round of Aga's corrections

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-26 02:44:57 -04:00
Sara Zan
426f49979b
Change repo with repository in python_cache (#2731)
* Change repo with repository

* remove name

* using owner and name

* use owner name

* replace name with login

* Trying with the PR context instead
2022-06-24 18:36:19 +02:00
Sara Zan
6a7152044e
add repo name as well (#2729) 2022-06-24 17:08:28 +02:00
Stefano Fiorucci
42b1a5c3a4
fix error in log message (#2719)
* fix error in log message

* Update Documentation & Code Style

* pass index to _drop_duplicate_documents

* make the use of index in logging more explicit

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-24 16:53:52 +02:00
Sara Zan
13514f960d
Specify ref in action (#2727) 2022-06-24 15:56:17 +02:00
Massimiliano Pippi
3207f372ee
Fix bugs in loading code from yaml (#2705)
* fix bug in loading code from yaml
2022-06-24 14:52:13 +02:00
tstadel
ab443aab28
Fix match_context tests in test_utils.py (#2725)
* fix match_context tests

* fix naming of test

* pin rapidfuzz to 2.0.13
2022-06-24 13:23:00 +02:00
Sara Zan
e8546e2124
Replace deprecated Selenium methods (#2724)
* Fix crawler.py

* Fix test_connector.py

* unused import

Co-authored-by: danielbichuetti <daniel.bichuetti@gmail.com>
2022-06-24 12:05:32 +02:00
Sara Zan
400d2cdf77
Fix audio tests on CI (#2718)
* Update Documentation & Code Style

* fix huggingface-hub version

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-24 11:36:31 +02:00
tstadel
1168f6365d
Fix using id_hash_keys as pipeline params (#2717)
* Fix using id_hash_keys as pipeline params

* Update Documentation & Code Style

* add tests

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-24 09:55:09 +02:00
tstadel
a084a982c4
Show warning in reader.eval() about differences compared to pipeline.eval() (#2477)
* deprecate reader.eval

* Update Documentation & Code Style

* update warning to describe differences between pipeline.eval()

* remove empty lines

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-23 18:40:17 +02:00
Sara Zan
e69492a28f
Tutorial 14 doc changes (#2714)
* let the bot apply changes in this pr

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-23 12:36:12 +02:00
Stefano Fiorucci
b01a7c2259
Add InMemoryKnowledgeGraph (#2678)
* draft for InMemoryKnowledgeGraph

* remove comments

* Update Documentation & Code Style

* fix import and signature

* Fix dependencies for in_memory_knowledge_graph

* updated tutorials

* Update Documentation & Code Style

* fix bug in notebook

* fix other notebook bug

* Update Documentation & Code Style

* improved tutorial notebook

* Update Documentation & Code Style

* better implementation of InMemoryKnowledgeGraph

* fix

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-22 19:16:33 +02:00
Rob Pasternak
b87c0c950b
Tutorial 14 edit (#2663)
* Rewrite Tutorial 14 for increased user-friendliness

* Update Tutorial14 .py file to match .ipynb file

* Update Documentation & Code Style

* unblock the ci

* ignore error in jitterbit/get-changed-files

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
2022-06-22 13:03:07 +02:00
Julian Risch
325bc5466a
Revert "Upgrade transformers to 4.20.0 (#2694)" (#2700)
This reverts commit 4a63707f1a177123c13929eb316d3ecaa7fd6c5f.
2022-06-21 21:17:21 +02:00
Julian Risch
4a63707f1a
Upgrade transformers to 4.20.0 (#2694) 2022-06-21 17:23:31 +02:00
Sara Zan
505ababf43
Skip Pinecone tests (#2696)
* comment out Pinecone tests block

* Add comment
2022-06-21 14:49:36 +02:00
Massimiliano Pippi
5d255f0e4a
replace question issue with link to discussions (#2697) 2022-06-21 14:10:11 +02:00
Sara Zan
a6c06ee376
Update contributor's checklists in PR template (#2659)
* Split contributor's and reviewer's checklists

* contributor-centric checklist

* Move issues at the top and split entry

* phrasing
2022-06-21 10:11:18 +02:00
tstadel
da5ea73339
Fix EvaluationSetClient.get_labels() (#2690)
* fix EvaluationSetClient.get_labels()

* Update Documentation & Code Style

* fix tests

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-20 19:16:09 +02:00
bogdankostic
b16430b61e
Tutorial 4: Set similarity to "cosine" in DocStore initialization (#2673)
* Set similarity to cosine in DocStore initialization

* Update Documentation & Code Style

* Set `scale_score` to `False`

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-20 18:47:09 +02:00
Massimiliano Pippi
79b287b568
Extract common code for ES and OS into a base class (#2664)
* extract common code for ES and OS into a base class

* Update Documentation & Code Style

* give the base class a more obvious name

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-20 09:47:44 +02:00
MichelBartels
964e6cdafb
Fix JoinAnswer/JoinNode (#2612)
* fix join nodes

* Update Documentation & Code Style

* fix unused import

* change arg order

* Update Documentation & Code Style

* fix kwargs check

* add warning when there is only one input node

* Update Documentation & Code Style

* fix type hint

* fix wrong import order

* Update Documentation & Code Style

* undo kwargs

* add accidentally deleted newline

* fix type hint

* fix type hint

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-17 16:29:15 +02:00
Sara Zan
a26c042994
Fix typo in code_and_docs.sh (#2662)
* Fix typo in code_and_docs.sh & install ffmpeg in autoformat.yml

* apt update to get ffmpeg

* Update Documentation & Code Style

* Add header and better error message

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-15 13:50:55 +02:00