haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-07-19 06:52:56 +00:00

Author	SHA1	Message	Date
Julian Risch	2c184e467f	Upgrade transformers to 4.13.0 (#1659 ) * upgrade to pytorch 1.10 and transformers 4.11.3 * pin torch to 1.9.1 * Upgrade transformers and torch to 4.12.2 and 1.10.0 * Test transformers 4.10.2 * Pin transformers to 4.10.2 * transformers 4.10.3 * transformers 4.11.0 * transformers 4.11.1 * transformers 4.11.2 * check fix on current transformer's master branch * Install transformers from commit id * update transformers to 4.12.5 * Upgrade torch version for torch-scatter * Upgrade torch version for torch-scatter in Windows CI * Build new cache * Undo last commit * Use transformers v4.11.2 * bump transformers to 4.12.5 * bump transformers to 4.13.0 * re-allow range of torch versions Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: bogdankostic <bogdankostic@web.de>	2021-12-11 12:08:16 +01:00
Fabrice Depaulis	77d52ad215	Rely api healthcheck on status code rather than json decoding (#1871 ) * Rely api healthcheck on status code rather than json decoding * Install UI dependencies on the Linux and Windows CI Co-authored-by: Fabrice Depaulis <fabrice.depaulis@orange.com> Co-authored-by: ZanSara <sarazanzo94@gmail.com>	2021-12-10 18:05:23 +01:00
Andreas Motl	4eb4503f25	Fix typo (#1869 )	2021-12-10 09:39:45 +01:00
Branden Chan	ea5aab23ec	Update pydoc-markdown-file-classifier.yml (#1856 ) * Update pydoc-markdown-file-classifier.yml * Add latest docstring and tutorial changes * Prevent wrapping DataParallel in second DataParallel (#1855) * Prevent wrapping DataParallel in second DataParallel * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Create v1.0 docs (#1862) * Update pydoc-markdown-file-classifier.yml * Add latest docstring and tutorial changes * Rebase and apply change to v1.0 Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: bogdankostic <bogdankostic@web.de>	2021-12-08 18:19:03 +01:00
Branden Chan	ef1e531895	Create v1.0 docs (#1862 )	2021-12-08 17:53:00 +01:00
bogdankostic	cbfe2b4626	Prevent wrapping DataParallel in second DataParallel (#1855 ) * Prevent wrapping DataParallel in second DataParallel * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-12-08 09:56:45 +01:00
Malte Pietsch	8cb513c2c6	Bump version to 1.0.0 v1.0.0	2021-12-07 15:13:24 +01:00
Sara Zan	983b20f28d	Demo UI fix debug info (#1846 ) * Fix debug info * Make enter to run work better * Reintroduce default question in the eval dataset * Outputting valid json instead of a Python dict	2021-12-06 18:55:39 +01:00
KUNPENG GUO	160f81aaa3	Fix bug ranker: wrong lambda function (#1824 ) * Fix bug ranker: wrong lambda function The zip function used in line 110 intends to choose the logits array to be the key for the lambda function while it should be the first/second logit of the logit array which corresponds to the classification label (has_answer) * Use label 1 as has_answer label * generic ranker (add if-cond for logits vector shape) * remove test code * remove test code... * add two_logits test case for ranker module. * complete the documentation of ranker, support rankers with 1 or 2 logits as output	2021-12-06 17:13:57 +01:00
Sara Zan	8b7b51f0f5	Typo spotted in one question. Removed question that returned wrong answer. Added a couple more that work. (#1843 )	2021-12-06 15:44:08 +01:00
Julian Risch	aa1520212f	workaround torch bug with non-contiguous tensors (#1845 )	2021-12-06 15:10:51 +01:00
Ivan Lopez	4f6dc36869	Deploy demo (#1837 ) * Add GH Actions workflow for demo deployment * update demo ec2 instance type * remove redundant docker-compose build * add custom demo command and env vars * deploy demo on updates to workflow resources	2021-12-03 15:58:47 +01:00
Branden Chan	bec14b63c3	Add live demo link to readme (#1839 )	2021-12-03 14:34:19 +01:00
Malte Pietsch	90ced1b246	Update release.yml	2021-12-03 13:23:55 +01:00
Malte Pietsch	e5599bd337	Extend categories for release notes (#1841 )	2021-12-03 13:19:45 +01:00
Malte Pietsch	4e76129004	Add config for github release notes (#1840 )	2021-12-03 12:27:58 +01:00
Julian Risch	54f776350c	Update evaluation tutorial to cover the new `pipeline.eval()` (#1765 ) * Replace old tutorial 5 with new code based on test cases * Add latest docstring and tutorial changes * Use pipeline.eval() in tutorial * Add latest docstring and tutorial changes * Restructure notebook * Add latest docstring and tutorial changes * Add dataframe example * Add latest docstring and tutorial changes * Get eval data from doc store * Add latest docstring and tutorial changes * Load data from doc store * Add latest docstring and tutorial changes * Clear outputs * Add latest docstring and tutorial changes * Change example and add python script * Add latest docstring and tutorial changes * Fetch aggregated multilabels from doc store * Add latest docstring and tutorial changes * Incorporate review feedback on text comments * Add latest docstring and tutorial changes * Add Notebook output * Remove queries param from pipeline.eval() * Add latest docstring and tutorial changes * Add output with all metrics * Add printing of multiple metrics to script * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-12-03 11:19:41 +01:00
tstadel	9293a902d7	Fix OOM in test_eval.py Windows CI (#1830 ) * disable problematic eval tests for windows ci * move standard pipeline eval tests to separate test file * switch to elasticsearch documentstore to reduce inproc mem * Revert "switch to elasticsearch documentstore to reduce inproc mem" This reverts commit 7a75871909c3317a252dff3a4df17e99eff69d05. * get retriever from conftest * use smaller embedding model for summarizer * use smaller summarizer model * remove queries param from pipeline.eval() * isolate problematic tests * rename separate test file * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-12-02 19:23:58 +01:00
tstadel	180c05365a	Deprecate old pipeline eval nodes: EvalDocuments and EvalAnswers (#1778 ) * log deprecated warning on init * deprecation warning included into docstrings * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> v1.0.0rc1	2021-12-02 18:09:26 +01:00
tstadel	dc4cd49049	remove queries param from pipeline.eval() (#1836 ) * remove queries param from pipeline.eval() * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-12-02 16:04:01 +01:00
Sara Zan	99365e1d8e	Add backlink below the context, if available in the doc's meta (#1834 )	2021-12-02 13:37:23 +01:00
tstadel	bab05c7677	Fix loading and saving of EvaluationResult (#1831 ) * fix spans in csvs * fix tests	2021-12-02 10:30:11 +01:00
Sara Zan	c21521dc9c	More demo bugfixes (#1832 ) * Trying to fix a bug occurring when dataset is None (happens with many parallel request for some reason) * Change favicon and title and fix bug with version number * Improve the text description and partially fix the enter-to-run function	2021-12-01 22:25:59 +01:00
Sara Zan	e39d015a59	Allow SQLDocumentStore to filter by many filters (#1776 ) * Aliasing the join is not sufficient yet * Update the filter query in some other functions of SQLDocumentStore - this functionality should be centralized * Adding tests for get_all_documents, now failing * Fix tests * Fix typo spotted by mypy	2021-12-01 16:16:17 +01:00
tstadel	c5540d05ed	Calculation of metrics and presentation of eval results (#1760 ) * retriever metrics added * Add latest docstring and tutorial changes * answer and document level matching metrics implemented * Add latest docstring and tutorial changes * answer related metrics for retriever * basic reader metrics implemented * handle no_answers * fix typing * fix tests * fix tests without sas * first draft for simulated top k * rename sas and f1 columns in dataframe * refactoring of EvaluationResult * Add latest docstring and tutorial changes * more eval tests added * fix sas expected value precision * distinction between ir and qa recall * EvaluationResult.worst_queries() implemented * print_evaluation_report() added * eval report for QA Pipeline improved * dynamic metrics for worst queries calc * Add latest docstring and tutorial changes * method names adjusted * simple test for print_eval_report() added * improved documentation * Add latest docstring and tutorial changes * minor formatting * Add latest docstring and tutorial changes * fix no_answer cases * adjust one docstring * Add latest docstring and tutorial changes * fix no_answer cases for sas * batchmode for sas implemented * fix for retriever metrics if there are only no_answers * fix multilabel tests * improve documentation for pipeline.eval() * streamline multilabel aggregates and docs * Add latest docstring and tutorial changes * fix multilabel tests * unify document_id * add dataframe schema description to EvaluationResult * Add latest docstring and tutorial changes * rename worst_queries to wrong_examples * Add latest docstring and tutorial changes * make query digesting standard pipelines work with pipeline.eval() * Add latest docstring and tutorial changes * tests for multi retriever pipelines added * remove unnecessary import * print_eval_report(): support all pipelines without junctions * Add latest docstring and tutorial changes * fix typos * Add latest docstring and tutorial changes * fix 
minor simulated_top_k bug and use memory documentstore throughout tests * sas model param description improved * Add latest docstring and tutorial changes * rename recall metrics * Add latest docstring and tutorial changes * fix mean average precision link * Add latest docstring and tutorial changes * adjust sas description docstring * Add latest docstring and tutorial changes * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-11-30 19:26:34 +01:00
ju-gu	4cce7ffe85	bugfix metadata extraction in form recognizer & split of surrounding content length (#1829 ) * bugfix metadata extraction in the formrecognizer and separation of surrounding in preceding and following content length * Fix docstring * fix metadata extraction for content_type text Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-11-30 19:10:21 +01:00
Sara Zan	935689e630	Demo UI add env vars & other small fixes (#1828 ) * Add more env vars to the streamlit ui * Add some more questions to the random ones * Relax a statuscode check and rename env vars * Make query error message more descriptive * Add log message * Align docker-compose with and without GPU * Typo in pipeline filename * Remove prefix from var in docker_compose * Align docker-compose.yml and add small sleep to the initialized poller to prevent spamming * Fix the name of the dockerfile used to build the GPU image	2021-11-30 18:11:54 +01:00
AhmedIdr	56e4e8486f	Added max_seq_length and batch_size params to embeddingretriever (#1817 ) * Added max_seq_length and batch_size params, added progress_bar to faiss writing_documents * Add latest docstring and tutorial changes * fixed typos * Update dense.py Changed default batch_size and max_seq_len in EmbeddingRetriever * Add latest docstring and tutorial changes * Update faiss.py Change import tqdm.auto to tqdm * Update faiss.py Changing tqdm back to tqdm.auto Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-29 19:49:51 +01:00
Sara Zan	fb511dc4a3	Remove feedback from no-answers (#1827 ) * Fix some miscopied code * Remove feedback from the no-answer, seems the backend can't take it * Try to raise concurrent requests per worker * Remove the actual number of workers	2021-11-29 19:42:10 +01:00
bogdankostic	eb5f7bb4c0	Add AzureConverter to support table parsing from documents (#1813 ) * Add FormRecognizerConverter * Change signature of convert method + change return type of all converters * Adapt preprocessing util to new return type of converters * Parametrize number of lines used for surrounding context of table * Change name from FormRecognizerConverter to AzureConverter * Set version of azure-ai-formrecognizer package * Change tutorial 8 based on new return type of converters * Add tests * Add latest docstring and tutorial changes * Fix typo Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-11-29 18:44:20 +01:00
Sara Zan	c29f960c47	Fix UI demo feedback (#1816 ) * Fix the feedback function of the demo with a workaround * Some docstring * Update tests and rename methods in feedback.py * Fix tests * Remove operation_ids * Add a couple of status code checks	2021-11-29 17:03:54 +01:00
MichelBartels	84147edcca	Model Distillation (#1758 ) * initial commit * Add latest docstring and tutorial changes * added comments and fixed bug * fixed bugs, added benchmark and added documentation * Add latest docstring and tutorial changes * fix type: ignore comment * fix logging in benchmark * fixed distillation config * Add latest docstring and tutorial changes * added type annotations * fixed distillation loss calculation * added type annotations * fixed distillation mse loss * improved model distillation benchmark config loading * added temperature for model distillation * removed uncessary imports, added comments, added named parameter calls * Add latest docstring and tutorial changes * added some more comments * added distillation test * fixed distillation test * removed unnecessary import * fix softmax dimension * add grid search * improved model distillation benchmark config * fixed model distillation hyperparameter search * added doc strings and type hints for model distillation * Add latest docstring and tutorial changes * fixed type hints * fixed type hints * fixed type hints * wrote out params instead of kwargs in DistillationDataSilo initializer * fixed type hints * fixed typo * fixed typo Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-26 18:49:30 +01:00
Sara Zan	1a4ee21b92	Adapt docker-compose-gpu.yml to use DPR by default (#1810 ) * Adapt docker-compose-gpu.yml to use DPR by default * Update the comments * Change the ES image * Increase the context window and allow no-answers in the DPR pipeline too * Re-enable file upload in GPU version * Add env var without value and a comment to explain it	2021-11-25 16:23:18 +01:00
Sara Zan	9ee0ea0c17	Add description to the demo (#1809 ) * Improve the Random Question functionality and add three example questions * Fix the example questions * Change default docs for the retriever * Add example short description and make the no-answer boxes blue * Modify some text and add a fix for the slider's bug * New no-answer message	2021-11-25 15:27:09 +01:00
Sara Zan	742d4b9db9	Improve the Random Question functionality (#1808 ) * Improve the Random Question functionality and add three example questions * Fix the example questions * Change default docs for the retriever	2021-11-24 15:55:44 +01:00
Julian Risch	3b8e2e7b6c	Fix link to colab notebook in tutorial 16 (#1802 ) * Fix link to colab notebook in tutorial 16 * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-24 13:19:20 +01:00
Sowmiya Jaganathan	04d93ec247	Introduced an arg to add synonyms - Elasticsearch (#1625 ) * Introduced an arg add synonyms to Elasticsearch * Added the test code, removed the whitespace formatting changes, and overwrote the relevant parts from the already existing mapping instead of creating new mapping. * Added the test code * Remove whitespace change * Added the doc_string with examples and link * Removed unneccessary spaces * Add latest docstring and tutorial changes * fix text_field -> content_field Co-authored-by: sowmiya-emplay <sowmiya.j@emplay.net> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-23 19:10:34 +01:00
Sara Zan	565cb7035d	Add missing dependency to the Streamlit container (#1798 )	2021-11-23 19:02:54 +01:00
Sara Zan	600662b1f0	Hide 'no answer' responses from the REST API tests (#1791 )	2021-11-23 17:13:16 +01:00
MichelBartels	e80771f839	Adding yaml functionality to standard pipelines (save/load...) (#1735 ) * adding yaml functionality to BaseStandardPipeline fixes #1681 * Add latest docstring and tutorial changes * Update API Reference Pages for v1.0 (#1729) * Create new API pages and update existing ones * Create query classifier page * Remove Objects suffix * Change answer aggregation key to doc_id, query instead of label_id, query (#1726) * Add debugging example to tutorial (#1731) * Add debugging example to tutorial * Add latest docstring and tutorial changes * Remove Objects suffix * Add latest docstring and tutorial changes * Revert "Remove Objects suffix" This reverts commit 6681cb06510b080775994effe6a50bae42254be4. * Revert unintentional commit * Add third debugging option * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fix another self.device/s typo (#1734) * Fix yet another self.device(s) typo * Add typing to 'initialize_device_settings' to try prevent future issues * Fix bug in Tutorial5 * Fix the same bug in the notebook Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * added test for saving and loading prebuilt pipelines * fixed typo, changed variable name and added comments * Add latest docstring and tutorial changes * Fix a few details of some tutorials (#1733) * Make Tutorial10 use print instead of logs and fix a typo in Tutoria15 * Add a type check in 'print_answers' * Add same checks to print_documents and print_questions * Make RAGenerator return Answers instead of dictionaries * Fix RAGenerator tests Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fix `print_answers` (#1743) * Fix a specific path of print_answers that was assuming answers are dictionaries Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Split pipeline tests into 
three suites (#1755) * Split pipeline tests into three suites * Will this trigger the CI? * Rename duplicate test into test_most_similar_documents_pipeline * Fixing a bug that was probably never noticed * Capitalize starting letter in params (#1750) * Capitalize starting letter in params Capitalized the starting letter in code examples for params in keeping with the latest names for nodes where first letter is capitalized. Refer: https://github.com/deepset-ai/haystack/issues/1748 * Update standard_pipelines.py Capitalized some starting letters in the docstrings in keeping with the updated node names for standard pipelines * Multi query eval (#1746) * add eval() to pipeline * Add latest docstring and tutorial changes * support multiple queries in eval() * Add latest docstring and tutorial changes * keep single query test * fix EvaluationResult node_results default * adjust docstrings * Add latest docstring and tutorial changes * minor improvements from comments * Add latest docstring and tutorial changes * move EvaluationResult and calculate_metrics to schema * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Split summarizer tests in order to make windows CI work again (#1757) * separate testfile for summarizer with translation * Add latest docstring and tutorial changes * import SPLIT_DOCS from test_summarizer * add workflow_dispatch to windows_ci * add worflow_dispatch to linux_ci Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix import of EvaluationResult in test case * exclude test_summarizer_translation.py for windows_ci (#1759) * Pipelines now tolerate custom _debug content (#1756) * Pipelines now tolerate custom _debug content * Support Tables in all DocumentStores (#1744) * Add support for tables in SQLDocumentStore, FAISSDocumentStore and MilvuDocumentStore * Add support for WeaviateDocumentStore * Make sure that embedded 
meta fields are strings + add embedding_dim to WeaviateDocStore in test config * Add latest docstring and tutorial changes * Represent tables in WeaviateDocumentStore as nested lists * Fix mypy Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Allow TableReader models without aggregation classifier (#1772) * Fix usage of filters in `/query` endpoint in REST API (#1774) * WIP filter refactoring * fix filter formatting * remove inplace modification of filters * Public demo (#1747) * Queries now run only when pressing RUN. File upload hidden. Question is not sent if the textbox is empty. * Add latest docstring and tutorial changes * Tidy up: remove needless state, add comments, fix minor bugs * Had to add results to the status to avoid some bugs in eval mode * Added 'credits' * Add footers, update requirements, some random questions for the evaluation * Add requested changes * Temporary rollback the UI to the old GoT dataset Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Facilitate concurrent query / indexing in Elasticsearch with dense retrievers (new `skip_missing_embeddings` param) (#1762) * Filtering records not having embeddings * Added support for skip_missing_embeddings Flag. Default behavior is throw error when embeddings are missing. If skip_missing_embeddings=True then documents without embeddings are ignored for vector similarity * Fix for below error: haystack/document_stores/elasticsearch.py:852: error: Need type annotation for "script_score_query" * docstring for skip_missing_embeddings parameter * Raise exception where no documents with embeddings is found for Embedding retriever. * Default skip_missing_embeddings to True * Explicitly check if embeddings are present if no results are returned by EmbeddingRetriever for Elasticsearch * Added test case for based on Julian's input * Added test case for based on Julian's input. 
Fix pytest error on the testcase * Added test case for based on Julian's input. Fix pytest error on the testcase * Added test case for based on Julian's input. Fix pytest error on the testcase * Simplify code by using get_embed_count * Adjust docstring & error msg slightly * Revert error msg Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> * Huggingface private model support via API tokens (FARMReader) (#1775) * passed kwargs to model loading * Pass Auth token explicitly * add use_auth_token to get_language_model_class * added use_auth_token parameter at FARMReader * Add latest docstring and tutorial changes * added docs for parameter `use_auth_token` * Add latest docstring and tutorial changes * adding docs link * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * private hugging face models for retrievers (#1785) * private dpr * Add latest docstring and tutorial changes * added parameters to child functions * Add latest docstring and tutorial changes * added tableextractor * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * ignore empty filters parameter (#1783) * ignore empty filters parameter * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * initialize doc store with doc and label index in tutorial 5 (#1730) * initialize doc store with doc and label index * change ipynb according to py for tutorial 5 * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Small fixes to the public demo (#1781) * Make strealit tolerant to haystack not knowing its version, and adding special error for docstore issues * Add workaround for a Streamlit bug * Make default filters value an empty dict * Return more context for each answer in 
the rest api * Make the hs_version call not-blocking by adding a very quick timeout * Add disclaimer on low confidence answer * Use the no-answer feature of the reader to highlight questions with no good answer * Upgrade torch to v1.10.0 (#1789) * Upgrade torch to v1.10.0 * Adapt torch version for torch-scatter in TableQA tutorial * Add latest docstring and tutorial changes * Make torch version more flexible Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * adding yaml functionality to BaseStandardPipeline fixes #1681 * Add latest docstring and tutorial changes * added test for saving and loading prebuilt pipelines * fixed typo, changed variable name and added comments * Add latest docstring and tutorial changes * fix code rendering for example * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Branden Chan <33759007+brandenchan@users.noreply.github.com> Co-authored-by: Julian Risch <julian.risch@deepset.ai> Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai> Co-authored-by: nishanthcgit <5066268+nishanthcgit@users.noreply.github.com> Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com> Co-authored-by: bogdankostic <bogdankostic@web.de> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: C V Goudar <cvgoudar@users.noreply.github.com> Co-authored-by: Kristof Herrmann <37148029+ArzelaAscoIi@users.noreply.github.com>	2021-11-23 17:01:39 +01:00
bogdankostic	c00b32cf67	Fix Tutorial 11 on Google Colab (#1795 ) * Remove installation of latest release * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-23 15:35:23 +01:00
bogdankostic	a19a9f548b	Upgrade torch to v1.10.0 (#1789 ) * Upgrade torch to v1.10.0 * Adapt torch version for torch-scatter in TableQA tutorial * Add latest docstring and tutorial changes * Make torch version more flexible Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-23 11:49:46 +01:00
Sara Zan	7167a26483	Small fixes to the public demo (#1781 ) * Make strealit tolerant to haystack not knowing its version, and adding special error for docstore issues * Add workaround for a Streamlit bug * Make default filters value an empty dict * Return more context for each answer in the rest api * Make the hs_version call not-blocking by adding a very quick timeout * Add disclaimer on low confidence answer * Use the no-answer feature of the reader to highlight questions with no good answer	2021-11-22 19:06:08 +01:00
Julian Risch	9211c4c64d	initialize doc store with doc and label index in tutorial 5 (#1730 ) * initialize doc store with doc and label index * change ipynb according to py for tutorial 5 * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-22 15:18:02 +01:00
Julian Risch	845905e418	ignore empty filters parameter (#1783 ) * ignore empty filters parameter * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-22 09:36:14 +01:00
Kristof Herrmann	a8c2cdc565	private hugging face models for retrievers (#1785 ) * private dpr * Add latest docstring and tutorial changes * added parameters to child functions * Add latest docstring and tutorial changes * added tableextractor * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-22 09:24:02 +01:00
Kristof Herrmann	8aa4ca29c2	Huggingface private model support via API tokens (FARMReader) (#1775 ) * passed kwargs to model loading * Pass Auth token explicitly * add use_auth_token to get_language_model_class * added use_auth_token parameter at FARMReader * Add latest docstring and tutorial changes * added docs for parameter `use_auth_token` * Add latest docstring and tutorial changes * adding docs link * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-19 16:48:31 +01:00
C V Goudar	a9a379784a	Facilitate concurrent query / indexing in Elasticsearch with dense retrievers (new `skip_missing_embeddings` param) (#1762 ) * Filtering records not having embeddings * Added support for skip_missing_embeddings Flag. Default behavior is throw error when embeddings are missing. If skip_missing_embeddings=True then documents without embeddings are ignored for vector similarity * Fix for below error: haystack/document_stores/elasticsearch.py:852: error: Need type annotation for "script_score_query" * docstring for skip_missing_embeddings parameter * Raise exception where no documents with embeddings is found for Embedding retriever. * Default skip_missing_embeddings to True * Explicitly check if embeddings are present if no results are returned by EmbeddingRetriever for Elasticsearch * Added test case for based on Julian's input * Added test case for based on Julian's input. Fix pytest error on the testcase * Added test case for based on Julian's input. Fix pytest error on the testcase * Added test case for based on Julian's input. Fix pytest error on the testcase * Simplify code by using get_embed_count * Adjust docstring & error msg slightly * Revert error msg Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-11-19 14:50:23 +01:00
Sara Zan	d81897535e	Public demo (#1747 ) * Queries now run only when pressing RUN. File upload hidden. Question is not sent if the textbox is empty. * Add latest docstring and tutorial changes * Tidy up: remove needless state, add comments, fix minor bugs * Had to add results to the status to avoid some bugs in eval mode * Added 'credits' * Add footers, update requirements, some random questions for the evaluation * Add requested changes * Temporary rollback the UI to the old GoT dataset Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-19 11:34:32 +01:00
Malte Pietsch	c0892717a0	Fix usage of filters in `/query` endpoint in REST API (#1774 ) * WIP filter refactoring * fix filter formatting * remove inplace modification of filters	2021-11-18 18:13:03 +01:00


3803 Commits