haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-07-21 16:04:09 +00:00

Author	SHA1	Message	Date
tstadel	9611b64ec5	fix: document retrieval metrics for non-document_id document_relevance_criteria (#3885) * fix document retrieval metrics for all document_relevance_criteria * fix tests * fix eval_batch metrics * small refactorings * evaluate metrics on label level * document retrieval tests added * fix pylint * fix test * support file retrieval * add comment about threshold * rename test	2023-02-02 15:00:07 +01:00
Julian Risch	a2c160e7d8	bug: skip empty documents in reader (#3773) * skip empty documents * test eval_batch and account for tables	2023-01-03 15:50:14 +01:00
Julian Risch	adb580b6b7	feat: add offsets_in_context to evaluation result (#3640) * add offsets_in_context to eval result * extend test case	2022-11-30 11:43:42 +01:00
Stefano Fiorucci	3040e59c63	feat: add support for `BM25Retriever` in `InMemoryDocumentStore` (#3561) * very first draft * implement query and query_batch * add more bm25 parameters * add rank_bm25 dependency * fix mypy * remove tokenizer callable parameter * remove unused import * only json serializable attributes * try to fix: pylint too-many-public-methods / R0904 * bm25 attribute always present * convert errors into warnings to make the tutorial 1 work * add docstrings; tests * try to make tests run * better docstrings; revert not running tests * some suggestions from review * rename elasticsearch retriever as bm25 in tests; try to test memory_bm25 * exclude tests with filters * change elasticsearch to bm25 retriever in test_summarizer * add tests * try to improve tests * better type hint * adapt test_table_text_retriever_embedding * handle non-textual docs * query only textual documents	2022-11-22 09:24:52 +01:00
tstadel	0d45cbce56	convert eval metrics to python float (#3612)	2022-11-22 09:05:10 +01:00
Massimiliano Pippi	6a48ace9b9	BREAKING CHANGE: remove Milvus1DocumentStore along with support for Milvus < 2.x (#3552) * remove milvus1 * leftover * revert deprecation process	2022-11-15 09:54:55 +01:00
Stefano Fiorucci	1a60e21137	refactor: simplify Summarizer, add Document Merger (#3452) * remove generate_single_summary * update schemas * remove unused import * fix mypy * fix mypy * test: summarizer doesn't change content * other test correction * move test_summarizer_translation to test_extractor_translation * fix test * first try for doc merger * reintroduce and deprecate generate_single_summary * progress in document merger * document merger! * mypy, pylint fixes * use generator * added test that will fail in 1.12 * adapt to review * extended deprecation docstring * Update test/nodes/test_extractor_translation.py * Update test/nodes/test_summarizer.py * Update test/nodes/test_summarizer.py * black * documents fixture Co-authored-by: Sara Zan <sarazanzo94@gmail.com>	2022-11-03 16:04:53 +01:00
Sebastian	59857cb492	feat: Speed up reader tests (#3476) * Use a smaller reader where possible * Change scope to module of reader to get faster load times	2022-10-26 19:04:18 +02:00
tstadel	7fe5003c97	fix: eval() with `add_isolated_node_eval=True` breaks if no node supports it (#3347) * fix isolated eval for pipelines without a node supporting isolated mode * reformat * add test	2022-10-10 20:48:13 +02:00
Vladimir Blagojevic	938e6fda5b	Classify pipeline's type based on its components (#3132) * Add pipeline get_type method * Add pipeline uptime * Add pipeline telemetry event sending * Send pipeline telemetry once a day (at most) * Add pipeline invocation counter, change invocation counter logic * Update allowed telemetry parameters - allow pipeline parameters * PR review: add unit test	2022-09-21 14:53:42 +02:00
Julian Risch	3e3ff33cdd	feat: add batch evaluation method for pipelines (#2942) * add basic pipeline.eval_batch for qa without filters * black formatting * pydoc-markdown * remove batch eval tests failing due to bugs * remove comment * explain commented out tests * avoid code duplication * black * mypy * pydoc markdown * add batch option to execute_eval_run * pydoc markdown * Apply documentation suggestions from code review Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com> * Apply documentation suggestion from code review Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com> * add documentation based on review comments * black * black * schema updates * remove duplicate tests * add separate method for column reordering * merge _build_eval_dataframe methods * pylint ignore in function * change type annotation of queries to list only * one-liner addressing review comment on params dict * markdown files updated Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>	2022-08-25 17:50:57 +02:00
tstadel	0efad96e08	DC SDK: Add possibility to upload evaluation sets to DC (#2610) * Add possibility to upload evaluation sets to DC * fix test_eval sas comparisons * quickwin docstring feedback changes * Add hint about annotation tool and mark optional and required columns * minor changes to docstrings	2022-05-31 17:08:19 +02:00
tstadel	7caca41c5d	Support context matching in `pipeline.eval()` (#2482) * calculate context pred metrics * Update Documentation & Code Style * extend doc_relevance_col values * fix import order * Update Documentation & Code Style * fix mypy * fix typings literal import * add option for custom document_id_field * Update Documentation & Code Style * fix tests and dataframe col-order * Update Documentation & Code Style * rename content to context in eval dataframe * add backward compatibility to EvaluationResult.load() * Update Documentation & Code Style * add docstrings * Update Documentation & Code Style * support sas * Update Documentation & Code Style * add answer_scope param * Update Documentation & Code Style * rework doc_relevance_col and keep document_id col in case of custom_document_id_field * Update Documentation & Code Style * improve docstrings * Update Documentation & Code Style * rename document_relevance_criterion into document_scope * Update Documentation & Code Style * add document_scope and answer_scope to print_eval_report * support all new features in execute_eval_run() * fix imports * fix mypy * Update Documentation & Code Style * rename pred_label_sas_grid into pred_label_matrix * update dataframe schema and sorting * Update Documentation & Code Style * pass through context_matching params and extend document_scope test * Update Documentation & Code Style * add answer_scope tests * fix context_matching_threshold for document metrics * shorten dataframe apply calls * Update Documentation & Code Style * fix queries getting lost if nothing was retrieved * Update Documentation & Code Style * Update Documentation & Code Style * use document_id scopes * Update Documentation & Code Style * fix answer_scope literal * Update Documentation & Code Style * update the docs (lg changes) * Update Documentation & Code Style * update tutorial 5 * Update Documentation & Code Style * fix tests * Add minor lg updates * final docstring changes * fix single quotes in docstrings * Update Documentation & Code Style * dataframe scopes added for each column * better docstrings for context_matching params * Update Documentation & Code Style * fix summarizer eval test * Update Documentation & Code Style * fix test * Update Documentation & Code Style Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: agnieszka-m <amarzec13@gmail.com>	2022-05-24 18:11:52 +02:00
Sara Zan	ff4303c51b	[CI refactoring] Categorize tests into folders (#2554) * Categorize tests into folders * Fix linux_ci.yml and an import * Wrong path	2022-05-17 09:55:53 +01:00

14 Commits