* fix #1687
* fix UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow.
* fix RuntimeError: received 0 items of ancdata
* Remove set_sharing_strategy from this branch and replace numpy.zeros_like with python numpy
* Add ParsrConverter
* Fix typing error + add Parsr to Linux CI
* Fix valid_language for all converters + fix context generation for ParsrConverter
* Remove ParsrConverter test from Windows CI
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* set fixture scope to "function"
* run FARMReader without multiprocessing
* dispose of ray after tests
* run most expensive tasks first in test files
* run expensive tests first
* run garbage collector between tests
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* upgrade to pytorch 1.10 and transformers 4.11.3
* pin torch to 1.9.1
* Upgrade transformers and torch to 4.12.2 and 1.10.0
* Test transformers 4.10.2
* Pin transformers to 4.10.2
* transformers 4.10.3
* transformers 4.11.0
* transformers 4.11.1
* transformers 4.11.2
* check fix on current transformers master branch
* Install transformers from commit id
* update transformers to 4.12.5
* Upgrade torch version for torch-scatter
* Upgrade torch version for torch-scatter in Windows CI
* Build new cache
* Undo last commit
* Use transformers v4.11.2
* bump transformers to 4.12.5
* bump transformers to 4.13.0
* re-allow range of torch versions
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Base API health check on status code rather than JSON decoding
* Install UI dependencies on the Linux and Windows CI
Co-authored-by: Fabrice Depaulis <fabrice.depaulis@orange.com>
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
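To illustrate the health check change above, here is a minimal sketch of why a status-code check can be more robust than decoding the response body. Both function names are hypothetical, and treating any 2xx status as healthy is an assumption, not something the commit states.

```python
import json


def healthy_by_json(body: str) -> bool:
    """Hypothetical body-based check: healthy only if the body decodes as JSON."""
    try:
        json.loads(body)
        return True
    except json.JSONDecodeError:
        return False


def healthy_by_status(status_code: int) -> bool:
    """Hypothetical status-based check: any 2xx response counts as healthy (assumption)."""
    return 200 <= status_code < 300


# A 200 response with an empty body fails the JSON check but passes the status check.
print(healthy_by_json(""), healthy_by_status(200))
# A 500 response carrying a JSON error payload passes the JSON check but fails the status check.
print(healthy_by_json('{"detail": "error"}'), healthy_by_status(500))
```

The two checks disagree exactly in the cases where the body and the status code tell different stories, which is why relying on the status code alone is the simpler signal.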
* Fix ranker bug: wrong lambda function
The lambda applied to the zip on line 110 uses the whole logits array as the sort key, whereas it should use the single logit (first or second) in that array that corresponds to the classification label (has_answer)
* Use label 1 as has_answer label
* generic ranker (add if-cond for logits vector shape)
* remove test code
* remove test code...
* add two_logits test case for ranker module.
* complete the documentation of ranker, support rankers with 1 or 2 logits as output
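The ranker fix described above can be sketched as follows. This is a simplified illustration in plain Python, not the actual ranker code: `rank_documents` and `relevance_score` are hypothetical names, and the logits are assumed to arrive as one short list per document, with index 1 holding the has_answer logit when two logits are present.

```python
from typing import List, Sequence


def relevance_score(doc_logits: Sequence[float]) -> float:
    """Pick the logit used for ranking (assumed layout).

    Two logits: index 1 is taken as the has_answer label.
    One logit: that single value is the score.
    """
    return doc_logits[1] if len(doc_logits) == 2 else doc_logits[0]


def rank_documents(documents: List[str], logits: List[Sequence[float]]) -> List[str]:
    """Return documents sorted by relevance score, highest first."""
    ranked = sorted(
        zip(documents, logits),
        # Sort on the selected logit, not on the whole logits array.
        key=lambda pair: relevance_score(pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked]


docs = ["a", "b", "c"]
two_logits = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
print(rank_documents(docs, two_logits))

one_logit = [[0.1], [0.7], [0.3]]
print(rank_documents(docs, one_logit))
```

Sorting on the whole array in the two-logit case would compare the first element first and rank "a" at the top, even though its has_answer logit is the lowest of the three.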
* Replace old tutorial 5 with new code based on test cases
* Add latest docstring and tutorial changes
* Use pipeline.eval() in tutorial
* Add latest docstring and tutorial changes
* Restructure notebook
* Add latest docstring and tutorial changes
* Add dataframe example
* Add latest docstring and tutorial changes
* Get eval data from doc store
* Add latest docstring and tutorial changes
* Load data from doc store
* Add latest docstring and tutorial changes
* Clear outputs
* Add latest docstring and tutorial changes
* Change example and add python script
* Add latest docstring and tutorial changes
* Fetch aggregated multilabels from doc store
* Add latest docstring and tutorial changes
* Incorporate review feedback on text comments
* Add latest docstring and tutorial changes
* Add Notebook output
* Remove queries param from pipeline.eval()
* Add latest docstring and tutorial changes
* Add output with all metrics
* Add printing of multiple metrics to script
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* disable problematic eval tests for Windows CI
* move standard pipeline eval tests to separate test file
* switch to elasticsearch documentstore to reduce inproc mem
* Revert "switch to elasticsearch documentstore to reduce inproc mem"
This reverts commit 7a75871909c3317a252dff3a4df17e99eff69d05.
* get retriever from conftest
* use smaller embedding model for summarizer
* use smaller summarizer model
* remove queries param from pipeline.eval()
* isolate problematic tests
* rename separate test file
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Trying to fix a bug occurring when dataset is None (happens with many parallel requests for some reason)
* Change favicon and title and fix bug with version number
* Improve the text description and partially fix the enter-to-run function
* Aliasing the join is not sufficient yet
* Update the filter query in some other functions of SQLDocumentStore - this functionality should be centralized
* Adding tests for get_all_documents, now failing
* Fix tests
* Fix typo spotted by mypy
* retriever metrics added
* Add latest docstring and tutorial changes
* answer and document level matching metrics implemented
* Add latest docstring and tutorial changes
* answer related metrics for retriever
* basic reader metrics implemented
* handle no_answers
* fix typing
* fix tests
* fix tests without sas
* first draft for simulated top k
* rename sas and f1 columns in dataframe
* refactoring of EvaluationResult
* Add latest docstring and tutorial changes
* more eval tests added
* fix sas expected value precision
* distinction between ir and qa recall
* EvaluationResult.worst_queries() implemented
* print_evaluation_report() added
* eval report for QA Pipeline improved
* dynamic metrics for worst queries calc
* Add latest docstring and tutorial changes
* method names adjusted
* simple test for print_eval_report() added
* improved documentation
* Add latest docstring and tutorial changes
* minor formatting
* Add latest docstring and tutorial changes
* fix no_answer cases
* adjust one docstring
* Add latest docstring and tutorial changes
* fix no_answer cases for sas
* batchmode for sas implemented
* fix for retriever metrics if there are only no_answers
* fix multilabel tests
* improve documentation for pipeline.eval()
* streamline multilabel aggregates and docs
* Add latest docstring and tutorial changes
* fix multilabel tests
* unify document_id
* add dataframe schema description to EvaluationResult
* Add latest docstring and tutorial changes
* rename worst_queries to wrong_examples
* Add latest docstring and tutorial changes
* make query digesting standard pipelines work with pipeline.eval()
* Add latest docstring and tutorial changes
* tests for multi retriever pipelines added
* remove unnecessary import
* print_eval_report(): support all pipelines without junctions
* Add latest docstring and tutorial changes
* fix typos
* Add latest docstring and tutorial changes
* fix minor simulated_top_k bug and use memory documentstore throughout tests
* sas model param description improved
* Add latest docstring and tutorial changes
* rename recall metrics
* Add latest docstring and tutorial changes
* fix mean average precision link
* Add latest docstring and tutorial changes
* adjust sas description docstring
* Add latest docstring and tutorial changes
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
* bugfix metadata extraction in the formrecognizer and separation of surrounding into preceding and following content length
* Fix docstring
* fix metadata extraction for content_type text
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>