--- title: "DeepEval" id: integrations-deepeval description: "DeepEval integration for Haystack" slug: "/integrations-deepeval" --- ## Module haystack\_integrations.components.evaluators.deepeval.evaluator ### DeepEvalEvaluator A component that uses the [DeepEval framework](https://docs.confident-ai.com/docs/evaluation-introduction) to evaluate inputs against a specific metric. Supported metrics are defined by `DeepEvalMetric`. Usage example: ```python from haystack_integrations.components.evaluators.deepeval import DeepEvalEvaluator, DeepEvalMetric evaluator = DeepEvalEvaluator( metric=DeepEvalMetric.FAITHFULNESS, metric_params={"model": "gpt-4"}, ) output = evaluator.run( questions=["Which is the most popular global sport?"], contexts=[ [ "Football is undoubtedly the world's most popular sport with" "major events like the FIFA World Cup and sports personalities" "like Ronaldo and Messi, drawing a followership of more than 4" "billion people." ] ], responses=["Football is the most popular sport with around 4 billion" "followers worldwide"], ) print(output["results"]) ``` #### DeepEvalEvaluator.\_\_init\_\_ ```python def __init__(metric: Union[str, DeepEvalMetric], metric_params: Optional[Dict[str, Any]] = None) ``` Construct a new DeepEval evaluator. **Arguments**: - `metric`: The metric to use for evaluation. - `metric_params`: Parameters to pass to the metric's constructor. Refer to the `RagasMetric` class for more details on required parameters. #### DeepEvalEvaluator.run ```python @component.output_types(results=List[List[Dict[str, Any]]]) def run(**inputs: Any) -> Dict[str, Any] ``` Run the DeepEval evaluator on the provided inputs. **Arguments**: - `inputs`: The inputs to evaluate. These are determined by the metric being calculated. See `DeepEvalMetric` for more information. **Returns**: A dictionary with a single `results` entry that contains a nested list of metric results. Each input can have one or more results, depending on the metric. Each result is a dictionary containing the following keys and values: - `name` - The name of the metric. - `score` - The score of the metric. - `explanation` - An optional explanation of the score. #### DeepEvalEvaluator.to\_dict ```python def to_dict() -> Dict[str, Any] ``` Serializes the component to a dictionary. **Raises**: - `DeserializationError`: If the component cannot be serialized. **Returns**: Dictionary with serialized data. #### DeepEvalEvaluator.from\_dict ```python @classmethod def from_dict(cls, data: Dict[str, Any]) -> "DeepEvalEvaluator" ``` Deserializes the component from a dictionary. **Arguments**: - `data`: Dictionary to deserialize from. **Returns**: Deserialized component. ## Module haystack\_integrations.components.evaluators.deepeval.metrics ### DeepEvalMetric Metrics supported by DeepEval. All metrics require a `model` parameter, which specifies the model to use for evaluation. Refer to the DeepEval documentation for information on the supported models. #### ANSWER\_RELEVANCY Answer relevancy.\ Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str]` #### FAITHFULNESS Faithfulness.\ Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str]` #### CONTEXTUAL\_PRECISION Contextual precision.\ Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str], ground_truths: List[str]`\ The ground truth is the expected response. 

## Module haystack\_integrations.components.evaluators.deepeval.metrics

### DeepEvalMetric

Metrics supported by DeepEval.

All metrics require a `model` parameter, which specifies the model to use for evaluation. Refer to the DeepEval documentation for information on the supported models.

#### ANSWER\_RELEVANCY

Answer relevancy.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str]`

#### FAITHFULNESS

Faithfulness.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str]`

#### CONTEXTUAL\_PRECISION

Contextual precision.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str], ground_truths: List[str]`\
The ground truth is the expected response.

#### CONTEXTUAL\_RECALL

Contextual recall.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str], ground_truths: List[str]`\
The ground truth is the expected response.

#### CONTEXTUAL\_RELEVANCE

Contextual relevance.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str]`

#### DeepEvalMetric.from\_str

```python
@classmethod
def from_str(cls, string: str) -> "DeepEvalMetric"
```

Create a metric type from a string.

**Arguments**:

- `string`: The string to convert.

**Returns**:

The metric.
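
A short sketch tying `from_str` to a metric that requires `ground_truths`. It assumes the string form of a metric mirrors its lower-cased enum name (e.g. `"contextual_precision"`); the question, context, and answer texts are illustrative placeholders:

```python
from haystack_integrations.components.evaluators.deepeval import DeepEvalEvaluator, DeepEvalMetric

# Assumption: the string name mirrors the lower-cased enum member.
metric = DeepEvalMetric.from_str("contextual_precision")

evaluator = DeepEvalEvaluator(metric=metric, metric_params={"model": "gpt-4"})
output = evaluator.run(
    questions=["Which is the most popular global sport?"],
    contexts=[["Football draws a followership of more than 4 billion people."]],
    responses=["Football is the most popular sport worldwide."],
    # For this metric, each ground truth is the expected response.
    ground_truths=["Football is the world's most popular sport."],
)
print(output["results"])
```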