haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-17 10:34:10 +00:00

Author	SHA1	Message	Date
Stefano Fiorucci	5ae94886b2	fix: fix test failures with Transformers models in PRs from forks (#8809 ) * trigger * try pinning sentence transformers * make integr tests run right away * pin transformers instead * older transformers version * rm transformers pin * try ignoring cache * change ubuntu version * try removing token * try again * more HF_API_TOKEN local deletions * restore test priority * rm leftover * more deletions * moreee * more * deletions * restore jobs order	2025-02-04 19:08:37 +01:00
Ajit Singh	6cf13e8b98	enhancement: reduced usage of numpy and substituted built-in libraries (#8418 ) * reduced usage of numpy and substituted built-in libraries * added release note * edited expit function to support both float as well as list (this case was giving error CI) * revert code , numpy can't be removed here * more cleaning * fix relnote --------- Co-authored-by: anakin87 <stefanofiorucci@gmail.com>	2024-10-18 15:42:19 +02:00
Julian Risch	08686d90af	feat: Add DocumentNDCGEvaluator component (#8419 ) * draft new component and tests * draft new component and tests * fix tests, replace usage of get_attr * improve docstrings, refactor tests * add test for mixed documents w/wo scores * add test with multiple lists and update docstring * validate inputs, add tests, make methods static * change fallback to binary relevance * rename validate_init_parameters to validate_inputs	2024-10-01 16:15:02 +02:00
Sriniketh J	066e2e3ec5	Make api_key param optional in LLMEvaluator (#8340 )	2024-09-20 10:47:13 +02:00
Madeesh Kannan	672bcf7e03	fix: Add constraints to `set_input_type(s)` based on `run` method (#8358 ) * fix: Prevent the usage of `set_input_type(s)` when the `run` method doesn't have kwargs, raise if `set_input_type(s)` overrides `run` method parameters * fix: update components and tests * reno	2024-09-12 15:58:16 +02:00
David S. Batista	0c9dc008f0	fix: improve context relevancy metric (#7964 ) * fixing tests * fixing tests * updating tests * updating tests * updating docstring * adding release notes * making the insufficient information more robust * updating docstring and release notes * empty list instead of informative string * Update haystack/components/evaluators/context_relevance.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Update haystack/components/evaluators/context_relevance.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * fixing tests * Update haystack/components/evaluators/context_relevance.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * reverting commit * reverting again commit * fixing docstrings * removing deprecation warning * removing warning import --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-07-22 15:13:46 +02:00
Ulises M	6f8834d036	feat: add and expose api_params for OpenAIGenerator in LLMEvaluator based classes (#7987 ) * initial support for api_params * add tests and reno * resolve suggestions and add integration test * fix mypy --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-07-11 13:14:03 +02:00
David S. Batista	186512459d	feat: LLM-based evaluators return meta info from OpenAI (#7947 ) * LLM-Evaluator returns metadata from OpenAI * adding tests * adding release notes * updating test * updating release notes * fixing live tests * attending PR comments * fixing tests * Update releasenotes/notes/adding-metadata-info-from-OpenAI-f5309af5f59bb6a7.yaml Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update llm_evaluator.py --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-07-02 11:31:51 +02:00
Vladimir Blagojevic	535a281eec	feat: Add option to use `HF_TOKEN` as env var for authentication across all HF components (#7942 ) * Read both HF_API_TOKEN and HF_TOKEN env vars in all HF related components * Add reno note * Test fixes * More test updates * More test updates	2024-06-27 10:31:58 +02:00
David S. Batista	8b9eddcd94	fix: explicitly tell `ContextRelevanceEvaluator` that each statement should be scored (#7904 ) * initial import * adding release notes * adding pytest decorator for live test * make examples more readable * updating tests * reverting progress_bar = False	2024-06-25 16:59:37 +02:00
Amna Mubashar	fc011d7b04	bug: fix MRR and MAP calculations (#7841 ) * bug: fix MRR and MAP calculations	2024-06-25 12:07:11 +02:00
Ulises M	9c45203a76	fix: check for None in SAS eval input (#7909 ) * check for None in SAS input * Update releasenotes/notes/check-for-None-SAS-eval-0b982ccc1491ee83.yaml --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2024-06-21 14:22:33 +02:00
Madeesh Kannan	fe60eedee9	fix: Fix deserialization of pipelines that contain `LLMEvaluator` subclasses (#7891 )	2024-06-19 13:47:38 +02:00
Madeesh Kannan	63226dad34	fix: Fix `LLMEvaluator` serialization (#7818 ) * fix: Fix `LLMEvaluator` serialization * `reno`	2024-06-07 12:49:23 +02:00
David S. Batista	38747ff7a3	fix: failsafe for non-valid json and failed LLM calls (#7723 ) * wip * initial import * adding tests * adding params * adding safeguards for nan in evaluators * adding docstrings * fixing tests * removing unused imports * adding tests to context and faithfulness evaluators * fixing docstrings * nit * removing unused imports * adding release notes * attending PR comments * fixing tests * fixing tests * adding types * removing unused imports * Update haystack/components/evaluators/context_relevance.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Update haystack/components/evaluators/faithfulness.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * attending PR comments --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-05-23 15:41:29 +00:00
David S. Batista	a4fc2b66e6	style: adding progress bar to llm-based evaluators (#7726 ) * adding progress bar * fixing typo * fixing tests * Update test_llm_evaluator.py * fixing missing colon * passing directly to parent * adding docstrings	2024-05-23 09:22:14 +02:00
David S. Batista	798dc4a4a5	fix: avoid FaithfulnessEvaluator and ContextRelevanceEvaluator return `Nan` (#7685 ) * initial import * fixing tests * relaxing condition * adding safeguard for ContextRelevanceEvaluator as well * adding release notes	2024-05-14 17:08:51 +02:00
Massimiliano Pippi	10c675d534	chore: add license header to all modules (#7675 ) * add license header to modules * check license header at linting time	2024-05-09 13:40:36 +00:00
Stefano Fiorucci	94467149c1	fix: fix serialization of `DocumentRecallEvaluator` (#7662 ) * fix serialization of DocumentRecallEvaluator * add requested tests	2024-05-08 16:00:49 +02:00
Julian Risch	2509eeea7e	refactor: Rename FaithfulnessEvaluator input responses to predicted_answers (#7621 )	2024-04-30 16:30:57 +02:00
Madeesh Kannan	a881451d3a	refactor: Refactor `EvaluationResult` into `BaseEvaluationRunResult` and `EvaluationRunResult` (#7594 ) The new `EvaluationRunResult` has slightly different semantics - it separates the previous `data` parameter into `inputs` and `results` and expects aggregate scores to be provided in the latter.	2024-04-25 12:16:48 +02:00
Julian Risch	9c56dbe288	test: Make ContextRelevanceEvaluator integration test more robust (#7584 )	2024-04-23 16:01:25 +00:00
Julian Risch	07307709ee	test: Make FaithfulnessEvaluator integration test more robust (#7582 )	2024-04-23 15:44:00 +00:00
Julian Risch	d7638cfd4b	refactor: FaithfulnessEvaluator specifies inputs explicitly (#7548 ) * specify inputs explicitly. move out examples * Update haystack/components/evaluators/faithfulness.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-04-22 12:52:10 +00:00
Julian Risch	b12e0db134	feat: Add ContextRelevanceEvaluator component (#7519 ) * feat: Add ContextRelevanceEvaluator component * reno * fix expected inputs and example docstring * remove responses parameter from tests * specify inputs explicitly * add new evaluator to api reference docs	2024-04-22 14:10:00 +02:00
Massimiliano Pippi	2bad5bcb96	refactor: AnswerExactMatchEvaluator component inputs (#7536 ) * refactor component inputs * release notes * Update class docstring * pylint * update existing note instead of creating a new one --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2024-04-12 06:59:16 +00:00
David S. Batista	9a9c8aa1c8	feat: implementing evaluation results API (#7520 ) * initial import * adding tests * attending PR comments * fixing tests * updating tests * updating tests and code * renaming * fixing linting issues * adding release notes * adding docstrings * latest fixes	2024-04-10 13:34:03 +00:00
Julian Risch	e974a23fa3	docs: Fix eval metric examples in docstrings (#7505 ) * fix eval metric docstrings, change type of individual scores * change import order * change exactmatch docstring to single ground truth answer * change exactmatch comment to single ground truth answer * reverted changing docs to single ground truth * add warm up in SASEvaluator example * fix FaithfulnessEvaluator docstring example * extend FaithfulnessEvaluator docstring example * Update FaithfulnessEvaluator init docstring * Remove outdated default from LLMEvaluator docstring * Add examples param to LLMEvaluator docstring example * Add import and print to LLMEvaluator docstring example	2024-04-10 11:00:20 +02:00
David S. Batista	aae2b31359	fix: typo in sas_evaluator arg (#7486 ) * fixing typo on SAS arg * fixing tests * fixing tests	2024-04-08 10:21:37 +02:00
Julian Risch	9d02dc607a	feat: Add FaithfulnessEvaluator component (#7424 ) * draft FaithfulnessEvaluator * reno * calculate score per statement and aggregate * Update release note * update default values in tests and fix import path * remove instructions, inputs, outputs params * remove unused imports * add expected format example to docstring * remove name 'llm' from tests and docstring	2024-04-04 16:33:59 +00:00
Julian Risch	8ef6062748	refactor: Remove name 'llm' from LLMEvaluator output (#7479 )	2024-04-04 15:19:30 +00:00
Silvano Cerza	8b8a93bc0d	refactor: Rename `DocumentMeanAveragePrecision` and `DocumentMeanReciprocalRank` (#7470 ) * Rename DocumentMeanAveragePrecision and DocumentMeanReciprocalRank * Update releasenotes * Simplify names	2024-04-04 17:04:59 +02:00
Silvano Cerza	bdc25ca2a0	feat: Add `DocumentMeanReciprocalRank` (#7468 ) * Add DocumentMeanReciprocalRank * Fix float precision error	2024-04-04 14:55:37 +02:00
Silvano Cerza	7799909069	feat: Add `DocumentMeanAveragePrecision` (#7461 ) * Add DocumentMeanAveragePrecision * Remove questions input * Update docstrings * Update haystack/components/evaluators/document_map.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-04-04 14:15:45 +02:00
Silvano Cerza	dc87f51759	refactor: Remove `questions` inputs from evaluators (#7466 ) * Remove questions input from AnswerExactMatchEvaluator * Remove questions input from DocumentRecallEvaluator	2024-04-04 14:14:18 +02:00
Silvano Cerza	12acb3f12e	feat: Add `SASEvaluator` (#7428 ) * Add SASEvaluator * Add release notes * Apply suggestions from code review Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Simplify similarity calculation with bi-encoders models * Fix linting * Update docstrings * Move tensor to CPU after calculating cosine similarity * Fix CI failing --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-04-04 10:10:41 +02:00
Silvano Cerza	685343d13f	feat: Add `DocumentRecallEvaluator` (#7399 ) * Add DocumentRecallEvaluator * Fix mypy error * Simplify recall logic and change output for single hit mode * Remove unused import * Add comment for RecallMode fields * Reword RecallMode comments Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-03-26 16:15:03 +01:00
Silvano Cerza	f398b29e7f	feat: Change outputs of AnswerExactMatchEvaluator (#7390 ) * Change outputs of AnswerExactMatchEvaluator * Changes scores to return the number of matches per question * Revert "Changes scores to return the number of matches per question" This reverts commit e4358720793d4584b0b961402d4557c50c4c2381. * Change output names	2024-03-26 10:57:59 +01:00
Julian Risch	bfd0d3eacd	feat: Add new LLMEvaluator component (#7401 ) * draft llm evaluator * docstrings * flexible inputs; validate inputs and outputs * add tests * add release note * remove example * docstrings * make outputs parameter optional. default: * validate init parameters * linting * remove mention of binary scores from template * make examples and outputs params non-optional * removed leftover from optional outputs param * simplify building examples section for template * validate inputs and outputs in examples are dict with str as key * fix pylint too-many-boolean-expressions * increase test coverage	2024-03-25 07:05:27 +01:00
Silvano Cerza	610ad6f6b2	Add `AnswerExactMatchEvaluator` (#7381 ) * Add AnswerExactMatchEvaluator * Add release notes * Fix linting * Update docstrings * Update docstrings * Remove to_dict and from_dict * Fix linting	2024-03-19 16:58:01 +01:00
Silvano Cerza	0a7dfc1b32	Revert "Add `AnswerExactMatchEvaluator` (#7050 )" (#7075 ) This reverts commit b4011af8e9bc4ae2f72e51db254bfda69e20b651.	2024-02-23 14:05:57 +01:00
Silvano Cerza	b4011af8e9	Add `AnswerExactMatchEvaluator` (#7050 ) * Add AnswerExactMatchEvaluator * Add release notes * Fix linting * Update docstrings	2024-02-23 10:37:18 +01:00
Silvano Cerza	8ca4bf405b	Remove all evaluator components (#7053 )	2024-02-21 18:24:14 +01:00
Ashwin Mathur	327c2d260d	feat: Add Mean Reciprocal Rank (MRR) metric to `StatisticalEvaluator` (#7042 ) * Add MRR Metric * Add release notes * Update logic	2024-02-20 13:58:48 +01:00
Silvano Cerza	9215882779	Add Recall Multi Hit and Single Hit metric (#7038 )	2024-02-19 18:00:39 +01:00
Silvano Cerza	6fe1d3b595	refactor: Clean eval components (#7005 ) * Remove preprocess.py * Rename eval components to evaluators	2024-02-15 17:17:59 +01:00

46 Commits