71 Commits

Author SHA1 Message Date
Sara Zan
009c89fc53
Revert "Make the docstring bot work only on master" (#2114)
* Revert "Make the docstring bot work only on master (#2078)"

This reverts commit 649d07405770cd59696d0120107a3b2f0aafe7c2.

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-02-02 16:08:34 +01:00
mathislucka
88771b2bee
Provide option to recreate es doc store on initialization (#2084)
* provide option to recreate es doc store on initialization

* Add latest docstring and tutorial changes

* Label expects more arguments

* Label expects also an answer

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
2022-02-02 11:03:15 +01:00
Sara Zan
9af1292cda
Remove stray requirements.txt files and update README.md (#2075)
* Remove stray requirements.txt files and update README.md

* Remove requirement files

* Add details about pip bug and link to setup.cfg

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-27 11:22:14 +01:00
Sara Zan
d470b9d0bd
Improve dependency management (#1994)
* Fist attempt at using setup.cfg for dependency management

* Trying the new package on the CI and in Docker too

* Add composite extras_require

* Add the safe_import function for document store imports and add some try-catch statements on rest_api and ui imports

* Fix bug on class import and rephrase error message

* Introduce typing for optional modules and add type: ignore in sparse.py

* Include importlib_metadata backport for py3.7

* Add colab group to extra_requires

* Fix pillow version

* Fix grpcio

* Separate out the crawler as another extra

* Make paths relative in rest_api and ui

* Update the test matrix in the CI

* Add try catch statements around the optional imports too to account for direct imports

* Never mix direct deps with self-references and add ES deps to the base install

* Refactor several paths in tests to make them insensitive to the execution path

* Include tstadel review and re-introduce Milvus1 in the tests suite, to fix

* Wrap pdf conversion utils into safe_import

* Update some tutorials and rever Milvus1 as default for now, see #2067

* Fix mypy config


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-26 18:12:55 +01:00
MichelBartels
4cc37548e3
Fix finetuning notebook augmentation (#2071)
* fix data augmentation path in finetuning notebook

* Add latest docstring and tutorial changes

* make distillation possible with other models than BERT

* use smaller dataset for distillation in finetuning tutorial

* Add latest docstring and tutorial changes

* make data augmentation in finetuning faster

* update language models forward doc strings

* fix return type of language models

* remove debug output

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-26 17:49:14 +01:00
Sowmiya Jaganathan
c4fff19018
Supported Highlighting in Elasticsearch (#1930)
* Supported Highlighting

* Review changes

* add example to docstrings

* Add latest docstring and tutorial changes

* Add latest docstring and tutorial changes

Co-authored-by: sowmiya-emplay <sowmiya.j@emplay.net>
Co-authored-by: Thomas Stadelmann <thomas.stadelmann@deepset.ai>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
2022-01-26 17:35:33 +01:00
MichelBartels
e8cd5ea943
Add distillation to finetuning tutorial (#2025)
* Add finetuning tutorial

* Add latest docstring and tutorial changes

* fix typo

* Add latest docstring and tutorial changes

* improve distillation explanation in finetuning tutorial

* Add latest docstring and tutorial changes

* allow augment_squad.py to be easier to call from within python

* Update Tutorial2_Finetune_a_model_on_your_data.py

* fix squad augmentation test

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-20 12:18:32 +01:00
tstadel
f42d2e8ba0
Add nDCG to pipeline.eval()'s document metrics (#2008)
* add ndcg metric

* fix merge

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-14 18:36:41 +01:00
Julian Risch
2c063e960e
Extend Tutorial 5 with Upper Bound Reader Eval Metrics (#1995)
* print report for closed-domain eval

* Add latest docstring and tutorial changes

* rename parameter and rewrite docs

* Add latest docstring and tutorial changes

* print eval report in separate cell

* Add latest docstring and tutorial changes

* explain when to eval individual components

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-14 16:29:18 +01:00
Julian Risch
5695d721aa
update link to annotation tool docu (#2005)
* update link to annotation tool docu

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-14 16:10:59 +01:00
Dmitry Goryunov
79fdda8a7c
Remove hard-coded variables from the Tutorial 15 (#1984)
* Remove hard-coded variables from the Tutorial 15

* Fix missing comma

* Add latest docstring and tutorial changes

* Fix formatting in Tutorial15_TableQA.ipynb

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-11 17:55:20 +01:00
Mathew Kuriakose
a44b6c18c0
Unify vector_dim and embedding_dim parameter in Document Store (#1922)
* Refactored code to unify vector_dim and embedding_dim parameter in DocumentStores

* Unit test cases updated to use `embedding_dim` instead of `vector_dim`

* Unit test case update to use embedding_dim instead of vector_dim

* Add latest docstring and tutorial changes

* Put usage of `vector_dim` param in same if-block as corresponding warning

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2022-01-10 18:10:32 +01:00
Julian Risch
a846be99d1
Extend TranslationWrapper to work with QA Generation (#1905)
* draft translationwrapper example

* draft translation of generated qa pairs

* Add latest docstring and tutorial changes

* fixed pass by reference by deepcopy

* delete adapted tutorial 13 (test purposes only)

* adapt method signature and doc string

* Add latest docstring and tutorial changes

* add type ignore

* extend tutorial 13 with TranslationWrapper example

* Add latest docstring and tutorial changes

* removed duplicate code

* indent if statement

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: ArzelaAscoIi <kristof.herrmann@rwth-aachen.de>
2022-01-03 13:30:24 +01:00
Alberto Villa
1bb6244a63
Exchanged minimal with minimum in print_answers function call (#1890) 2021-12-14 15:27:37 +01:00
Alberto Villa
2396f0cd3a
Correct bug with encoding when generating Markdown documentation; linked with issue #1880 (#1881) 2021-12-14 10:50:25 +01:00
Julian Risch
54f776350c
Update evaluation tutorial to cover the new pipeline.eval() (#1765)
* Replace old tutorial 5 with new code based on test cases

* Add latest docstring and tutorial changes

* Use pipeline.eval() in tutorial

* Add latest docstring and tutorial changes

* Restructure notebook

* Add latest docstring and tutorial changes

* Add dataframe example

* Add latest docstring and tutorial changes

* Get eval data from doc store

* Add latest docstring and tutorial changes

* Load data from doc store

* Add latest docstring and tutorial changes

* Clear outputs

* Add latest docstring and tutorial changes

* Change example and add python script

* Add latest docstring and tutorial changes

* Fetch aggregated multilabels from doc store

* Add latest docstring and tutorial changes

* Incorporate review feedback on text comments

* Add latest docstring and tutorial changes

* Add Notebook output

* Remove queries param from pipeline.eval()

* Add latest docstring and tutorial changes

* Add output with all metrics

* Add printing of multiple metrics to script

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-12-03 11:19:41 +01:00
bogdankostic
eb5f7bb4c0
Add AzureConverter to support table parsing from documents (#1813)
* Add FormRecognizerConverter

* Change signature of convert method + change return type of all converters

* Adapt preprocessing util to new return type of converters

* Parametrize number of lines used for surrounding context of table

* Change name from FormRecognizerConverter to AzureConverter

* Set version of azure-ai-formrecognizer package

* Change tutorial 8 based on new return type of converters

* Add tests

* Add latest docstring and tutorial changes

* Fix typo

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-11-29 18:44:20 +01:00
Julian Risch
3b8e2e7b6c
Fix link to colab notebook in tutorial 16 (#1802)
* Fix link to colab notebook in tutorial 16

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-24 13:19:20 +01:00
bogdankostic
c00b32cf67
Fix Tutorial 11 on Google Colab (#1795)
* Remove installation of latest release

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-23 15:35:23 +01:00
bogdankostic
a19a9f548b
Upgrade torch to v1.10.0 (#1789)
* Upgrade torch to v1.10.0

* Adapt torch version for torch-scatter in TableQA tutorial

* Add latest docstring and tutorial changes

* Make torch version more flexible

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-23 11:49:46 +01:00
Julian Risch
9211c4c64d
initialize doc store with doc and label index in tutorial 5 (#1730)
* initialize doc store with doc and label index

* change ipynb according to py for tutorial 5

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-22 15:18:02 +01:00
Sara Zan
ea3abd305b
Fix a few details of some tutorials (#1733)
* Make Tutorial10 use print instead of logs and fix a typo in Tutoria15

* Add a type check in 'print_answers'

* Add same checks to print_documents and print_questions

* Make RAGenerator return Answers instead of dictionaries

* Fix RAGenerator tests

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-12 16:44:28 +01:00
Sara Zan
85a08d671a
Fix another self.device/s typo (#1734)
* Fix yet another self.device(s) typo

* Add typing to 'initialize_device_settings' to try prevent future issues

* Fix bug in Tutorial5

* Fix the same bug in the notebook

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-11 17:18:06 +01:00
Branden Chan
8082549663
Add debugging example to tutorial (#1731)
* Add debugging example to tutorial

* Add latest docstring and tutorial changes

* Remove Objects suffix

* Add latest docstring and tutorial changes

* Revert "Remove Objects suffix"

This reverts commit 6681cb06510b080775994effe6a50bae42254be4.

* Revert unintentional commit

* Add third debugging option

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-11 14:45:06 +01:00
tstadel
14515a861b
Tutorial for DocumentClassifier at Index Time (#1697)
* basic example of document classifier in preprocessing logic

* add batch_size to TransformersDocumentClassifier

* complete tutorial16

* Add latest docstring and tutorial changes

* fix missing batch_size

* add notebook

* test for batch_size use added

* add tutorial 16 to headers.py

* Add latest docstring and tutorial changes

* make DocumentClassifier indexing pipeline rdy

* Add latest docstring and tutorial changes

* flexibility improvements for DocumentClassifier in Pipelines

* Add latest docstring and tutorial changes

* fix index time usage

* remove query from documentclassifier tests

* improve classification_field resolving + minor fixes

* Add latest docstring and tutorial changes

* tutorial 16 extended with zero shot and pipelines

* Add latest docstring and tutorial changes

* install graphviz in notebook

* Add latest docstring and tutorial changes

* remove convert_to_dicts

* Add latest docstring and tutorial changes

* Fix typo

* Add latest docstring and tutorial changes

* remove retriever from indexing pipeline

* Add latest docstring and tutorial changes

* fix save_to_yaml when using FileTypeClassifier

* emphasize the impact with zero shot classification

* Add latest docstring and tutorial changes

* adjust use_gpu to boolean in test

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-11-09 18:43:00 +01:00
Sara Zan
91cafb49bb
Improve tutorials' output (#1694)
* Modify __str__ and __repr__ for Document and Answer

* Rename QueryClassifier in Tutorial11

* Improve the output of tutorial1

* Make the output of Tutorial8 a bit less dense

* Add a print_questions util to print the output of question generating pipelines

* Replace custom printing with the new utility in Tutorial13

* Ensure all output is printed with minimal details in Tutorial14 and add some titles

* Minor change to print_answers

* Make tutorial3's output the same as tutorial1

* Add __repr__ to Answer and fix to_dict()

* Fix a bug in the Document and Answer's __str__ method

* Improve print_answers, print_documents and print_questions

* Using print_answers in Tutorial7 and fixing typo in the utils

* Remove duplicate line in Tutorial12

* Use print_answers in Tutorial4

* Add explanation of what the documents in the output of the basic QA pipeline are

* Move the fields constant into print_answers

* Normalize all 'minimal' to 'minimum' (they were mixed up)

* Improve the sample output to include all fields from Document and Answer

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-09 15:09:26 +01:00
Timo Moeller
5ac89332a2
Fix Typo in TableQA Tutorial (#1690) 2021-11-08 17:14:04 +01:00
Julian Risch
efdcd24d70
fixed typo (#1680)
* fixed typo

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-01 10:38:39 +01:00
bogdankostic
9025615be7
Add TableQA tutorial (#1670)
* Add TableQA tutorial

* Add tutorial header

* Add latest docstring and tutorial changes

* Add more details

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-29 11:07:13 +02:00
Sara Zan
13510aa753
Refactoring of the haystack package (#1624)
* Files moved, imports all broken

* Fix most imports and docstrings into

* Fix the paths to the modules in the API docs

* Add latest docstring and tutorial changes

* Add a few pipelines that were lost in the inports

* Fix a bunch of mypy warnings

* Add latest docstring and tutorial changes

* Create a file_classifier module

* Add docs for file_classifier

* Fixed most circular imports, now the REST API can start

* Add latest docstring and tutorial changes

* Tackling more mypy issues

* Reintroduce  from FARM and fix last mypy issues hopefully

* Re-enable old-style imports

* Fix some more import from the top-level  package in an attempt to sort out circular imports

* Fix some imports in tests to new-style to prevent failed class equalities from breaking tests

* Change document_store into document_stores

* Update imports in tutorials

* Add latest docstring and tutorial changes

* Probably fixes summarizer tests

* Improve the old-style import allowing module imports (should work)

* Try to fix the docs

* Remove dedicated KnowledgeGraph page from autodocs

* Remove dedicated GraphRetriever page from autodocs

* Fix generate_docstrings.sh with an updated list of yaml files to look for

* Fix some more modules in the docs

* Fix the document stores docs too

* Fix a small issue on Tutorial14

* Add latest docstring and tutorial changes

* Add deprecation warning to old-style imports

* Remove stray folder and import Dict into dense.py

* Change import path for MLFlowLogger

* Add old loggers path to the import path aliases

* Fix debug output of convert_ipynb.py

* Fix circular import on BaseRetriever

* Missed one merge block

* re-run tutorial 5

* Fix imports in tutorial 5

* Re-enable squad_to_dpr CLI from the root package and move get_batches_from_generator into document_stores.base

* Add latest docstring and tutorial changes

* Fix typo in utils __init__

* Fix a few more imports

* Fix benchmarks too

* New-style imports in test_knowledge_graph

* Rollback setup.py

* Rollback squad_to_dpr too

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-25 15:50:23 +02:00
Julian Risch
6033319cfe
Fix parameter names in tutorial 5 and 12 (#1639)
* Fix parameter names in tutorial 5

* Update parameters in tutorial notebook

* Add latest docstring and tutorial changes

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-22 17:22:51 +02:00
Timo Moeller
9dc125df9d
Bugfix Tutorial 5 parameters, adjust default split length (#1635)
Bugfix parameters, adjust default split length, add sentencetransformers
2021-10-22 16:03:12 +02:00
Julian Risch
52e1fc991e
Update jobs link to personio (#1611)
* Update jobs link to personio

* Add latest docstring and tutorial changes

* Change jobs link to main website

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-21 11:42:32 +02:00
Julian Risch
f2a3f95ab6
add note on gpu runtime to tutorial 13 (#1614)
* add note on gpu runtime to tutorial 13

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-20 09:55:56 +02:00
Julian Risch
5ec29a5283
Limit generator tests to memory doc store; split pipeline tests (#1602)
* Limit generator tests to memory doc store; split pipeline tests

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-15 15:37:46 +02:00
Malte Pietsch
99c8046367
Fix Tutorials (#1594)
* fix response format of DocumentSearchPipeline

* Add latest docstring and tutorial changes

* fix typos

* change prints in tutorial 4

* Add latest docstring and tutorial changes

* fix tutorial 13

* Add latest docstring and tutorial changes

* remove unused import

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-14 11:49:35 +02:00
Malte Pietsch
caba590576
Fix answer format in ui (#1591)
* fix answer format in ui

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-13 16:48:33 +02:00
Malte Pietsch
4a6c9302b3
Redesign primitives - Document, Answer, Label (#1398)
* first draft / notes on new primitives

* wip label / feedback refactor

* rename doc.text -> doc.content. add doc.content_type

* add datatype for content

* remove faq_question_field from ES and weaviate. rename text_field -> content_field in docstores. update tutorials for content field

* update converters for . Add warning for empty

* renam label.question -> label.query. Allow sorting of Answers.

* WIP primitives

* update ui/reader for new Answer format

* Improve Label. First refactoring of MultiLabel. Adjust eval code

* fixed workflow conflict with introducing new one (#1472)

* Add latest docstring and tutorial changes

* make add_eval_data() work again

* fix reader formats. WIP fix _extract_docs_and_labels_from_dict

* fix test reader

* Add latest docstring and tutorial changes

* fix another test case for reader

* fix mypy in farm reader.eval()

* fix mypy in farm reader.eval()

* WIP ORM refactor

* Add latest docstring and tutorial changes

* fix mypy weaviate

* make label and multilabel dataclasses

* bump mypy env in CI to python 3.8

* WIP refactor Label ORM

* WIP refactor Label ORM

* simplify tests for individual doc stores

* WIP refactoring markers of tests

* test alternative approach for tests with existing parametrization

* WIP refactor ORMs

* fix skip logic of already parametrized tests

* fix weaviate behaviour in tests - not parametrizing it in our general test cases.

* Add latest docstring and tutorial changes

* fix some tests

* remove sql from document_store_types

* fix markers for generator and pipeline test

* remove inmemory marker

* remove unneeded elasticsearch markers

* add dataclasses-json dependency. adjust ORM to just store JSON repr

* ignore type as dataclasses_json seems to miss functionality here

* update readme and contributing.md

* update contributing

* adjust example

* fix duplicate doc handling for custom index

* Add latest docstring and tutorial changes

* fix some ORM issues. fix get_all_labels_aggregated.

* update drop flags where get_all_labels_aggregated() was used before

* Add latest docstring and tutorial changes

* add to_json(). add + fix tests

* fix no_answer handling in label / multilabel

* fix duplicate docs in memory doc store. change primary key for sql doc table

* fix mypy issues

* fix mypy issues

* haystack/retriever/base.py

* fix test_write_document_meta[elastic]

* fix test_elasticsearch_custom_fields

* fix test_labels[elastic]

* fix crawler

* fix converter

* fix docx converter

* fix preprocessor

* fix test_utils

* fix tfidf retriever. fix selection of docstore in tests with multiple fixtures / parameterizations

* Add latest docstring and tutorial changes

* fix crawler test. fix ocrconverter attribute

* fix test_elasticsearch_custom_query

* fix generator pipeline

* fix ocr converter

* fix ragenerator

* Add latest docstring and tutorial changes

* fix test_load_and_save_yaml for elasticsearch

* fixes for pipeline tests

* fix faq pipeline

* fix pipeline tests

* Add latest docstring and tutorial changes

* fix weaviate

* Add latest docstring and tutorial changes

* trigger CI

* satisfy mypy

* Add latest docstring and tutorial changes

* satisfy mypy

* Add latest docstring and tutorial changes

* trigger CI

* fix question generation test

* fix ray. fix Q-generation

* fix translator test

* satisfy mypy

* wip refactor feedback rest api

* fix rest api feedback endpoint

* fix doc classifier

* remove relation of Labels -> Docs in SQL ORM

* fix faiss/milvus tests

* fix doc classifier test

* fix eval test

* fixing eval issues

* Add latest docstring and tutorial changes

* fix mypy

* WIP replace dataclasses-json with manual serialization

* Add latest docstring and tutorial changes

* revert to dataclass-json serialization for now. remove debug prints.

* update docstrings

* fix extractor. fix Answer Span init

* fix api test

* keep meta data of answers in reader.run()

* fix meta handling

* adress review feedback

* Add latest docstring and tutorial changes

* make document=None for open domain labels

* add import

* fix print utils

* fix rest api

* adress review feedback

* Add latest docstring and tutorial changes

* fix mypy

Co-authored-by: Markus Paff <markuspaff.mp@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-13 14:23:23 +02:00
Sara Zan
6354528336
Add /documents/get_by_filters endpoint (#1580)
* Add endpoint to get documents by filter

* Add test for /documents/get_by_filter and extend the delete documents test

* Add rest_api/file-upload to .gitignore

* Make sure the document store is empty for each test

* Improve docstrings of delete_documents_by_filters and get_documents_by_filters

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-12 10:53:54 +02:00
Julian Risch
f9d2f786ca
Replace FARM import statements; add dependencies (#1492)
* Replace FARM import statements; add dependencies

* Add InferenceProc., TextCl.Proc., TextPairCl.Proc.

* Remove FARMRanker, add type annotations, rename max_sample

* Add sample_to_features_text for InferenceProc.

* Fix type annotations: model_name_or_path is str not Path

* Fix mypy errors: implement _create_dataset in TextCl.Proc.

* Add task_type "embeddings" in Inferencer

* Allow loading AdaptiveModel for embedding task

* Add SQuAD eval metrics; enable InferenceProc for embedding task

* Add baskets as param to log_samples and handle empty basket list in log_samples

* Remove unused dependencies

* Remove FARMClassifier (doc classificer) due to ref to TextClassificationHead

* Remove FARMRanker and Classifier from doc generation scripts

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-09-28 16:34:24 +02:00
bogdankostic
c644e2b4d0
Add comment to tutorial notebooks about restarting runtime in colab (#1486)
* Add comment to tutorial notebooks about restarting runtime in colab

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-09-23 14:36:20 +02:00
Julian Risch
d569e66bc7
Update Tutorial1_Basic_QA_Pipeline.ipynb (#1489)
* Update Tutorial1_Basic_QA_Pipeline.ipynb

passing params to pipeline as dict

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-09-22 16:35:20 +02:00
Branden Chan
bddee2def4
Define SAS model in notebook (#1485)
* Define SAS model in notebook

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-09-21 17:05:16 +02:00
Branden Chan
2c4baa7f4e
Regenerate API and Tutorial md files (#1480)
* Change punctuation

* Add latest docstring and tutorial changes

* Change punctuation

* Add documentation for Docs2Answer

* Add latest docstring and tutorial changes

* Generate new API docs

* Replace Finder with Pipeline

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-09-21 14:42:18 +02:00
Markus Paff
39845c0624
Automate updates docstrings tutorials (#1461)
* remove not needed githab actions and reactivate docstrings and tutorial generation

* test workflow

* update pydoc version

* update python version

* update watchdog

* move to latest version pydoc-markdown

* remove version check

* Add latest docstring and tutorial changes

* remove test workflow

* test for param docstrings

* pin pydoc-markdown version

* add test workflow

* pin watchdog version

* Add latest docstring and tutorial changes

* update original workflow and delete test

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-09-17 13:44:31 +02:00
Branden Chan
1938fb001b
Add support for no Docker envs in Tutorial 13 (#1365)
* Add support for no docker envs e.g. colab

* Generate md
2021-08-31 15:22:51 +02:00
Markus Paff
cac15310bd
adding tutorial 13 and 14 (#1364) 2021-08-23 11:37:06 +02:00
Markus Paff
ff2049cd45
updated tutorials (#1359) 2021-08-19 21:16:56 +02:00
Branden Chan
7dbd58f6be
Add about sections (#1195) 2021-06-14 18:37:00 +02:00
vblagoje
2a5882578a
Add Longform-QA (LFQA), Seq2SeqGenerator for generative QA and Retribert Retriever (#1086)
* Integrate LFQA with Haystack

* Integrate LFQA with Haystack - unit tests

* Properly initialize conftest default value for vector_dim

* Update PR after inital feedback

* Fix conftest.py import

* Seq2SeqGenerator uses Callables instead of subclasses for custom model input

* Update docstring

* Fix Callable use

* Add LFQA tutorials

* Improve type error reporting for invalid input converter Callable

* Generate docstrings

* Format comments in tutorial script

* Generate tutorial md

* Add usage page

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: brandenchan <brandenchan@icloud.com>
2021-06-14 17:53:43 +02:00