---
title: "AnswerExactMatchEvaluator"
id: answerexactmatchevaluator
slug: "/answerexactmatchevaluator"
description: "The `AnswerExactMatchEvaluator` evaluates answers predicted by Haystack pipelines using ground truth labels. It checks character by character whether a predicted answer exactly matches the ground truth answer. This metric is called the exact match."
---

# AnswerExactMatchEvaluator

The `AnswerExactMatchEvaluator` evaluates answers predicted by Haystack pipelines using ground truth labels. It checks character by character whether a predicted answer exactly matches the ground truth answer. This metric is called the exact match.

| | |
| --- | --- |
| **Most common position in a pipeline** | On its own or in an evaluation pipeline, after a separate pipeline has generated the inputs for the Evaluator |
| **Mandatory run variables** | "ground_truth_answers": A list of strings containing the ground truth answers <br /> <br />"predicted_answers": A list of strings containing the predicted answers to be evaluated |
| **Output variables** | A dictionary containing: <br /> <br />\- `score`: A number from 0.0 to 1.0 representing the proportion of predicted answers that exactly matched the corresponding ground truth answer <br /> <br />\- `individual_scores`: A list of 0s and 1s, where 1 means the predicted answer matched its ground truth answer |
| **API reference** | [Evaluators](/reference/evaluators-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/evaluators/answer_exact_match.py |

## Overview

You can use the `AnswerExactMatchEvaluator` component to evaluate answers predicted by a Haystack pipeline, such as an extractive question answering pipeline, against ground truth labels. Because the `AnswerExactMatchEvaluator` checks whether a predicted answer exactly matches the ground truth answer, it is not suited to evaluating answers generated by LLMs, for example, in a RAG pipeline. Use `FaithfulnessEvaluator` or `SASEvaluator` for those instead.

The `AnswerExactMatchEvaluator` requires no parameters to initialize.

Note that only _one_ predicted answer is compared to _one_ ground truth answer at a time. The component does not support multiple ground truth answers for the same question or multiple answers predicted for the same question.
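
Conceptually, `individual_scores` marks each position where the predicted string equals its ground truth string, and `score` is the mean of those values. Below is a minimal illustrative sketch of that calculation in plain Python on the same example data; it is not the component's internal implementation, just an equivalent computation:

```python
# Illustrative sketch only: an equivalent plain-Python calculation,
# not the AnswerExactMatchEvaluator's actual implementation.
ground_truth_answers = ["Berlin", "Paris"]
predicted_answers = ["Berlin", "Lyon"]

# Each predicted answer is compared to the ground truth at the same index.
individual_scores = [
    1 if predicted == expected else 0
    for predicted, expected in zip(predicted_answers, ground_truth_answers)
]
score = sum(individual_scores) / len(individual_scores)

print(individual_scores)  # [1, 0]
print(score)              # 0.5
```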

## Usage

### On its own

Below is an example of using an `AnswerExactMatchEvaluator` component to evaluate two answers and compare them to ground truth answers.

```python
from haystack.components.evaluators import AnswerExactMatchEvaluator

evaluator = AnswerExactMatchEvaluator()
result = evaluator.run(
    ground_truth_answers=["Berlin", "Paris"],
    predicted_answers=["Berlin", "Lyon"],
)

print(result["individual_scores"])
## [1, 0]
print(result["score"])
## 0.5
```

### In a pipeline

Below is an example where we use an `AnswerExactMatchEvaluator` and a `SASEvaluator` in a pipeline to evaluate two answers and compare them to ground truth answers. Running a pipeline instead of the individual components simplifies calculating more than one metric.

```python
from haystack import Pipeline
from haystack.components.evaluators import AnswerExactMatchEvaluator, SASEvaluator

pipeline = Pipeline()
em_evaluator = AnswerExactMatchEvaluator()
sas_evaluator = SASEvaluator()
pipeline.add_component("em_evaluator", em_evaluator)
pipeline.add_component("sas_evaluator", sas_evaluator)

ground_truth_answers = ["Berlin", "Paris"]
predicted_answers = ["Berlin", "Lyon"]

result = pipeline.run(
    {
        "em_evaluator": {
            "ground_truth_answers": ground_truth_answers,
            "predicted_answers": predicted_answers,
        },
        "sas_evaluator": {
            "ground_truth_answers": ground_truth_answers,
            "predicted_answers": predicted_answers,
        },
    }
)

for evaluator in result:
    print(result[evaluator]["individual_scores"])
## [1, 0]
## [array([[0.99999994]], dtype=float32), array([[0.51747656]], dtype=float32)]

for evaluator in result:
    print(result[evaluator]["score"])
## 0.5
## 0.7587383
```
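
If you only need one of the metrics, you can also index the pipeline result directly by the name the component was registered under with `add_component`, instead of looping over all evaluators. Assuming the `result` from the example above:

```python
# Access a single evaluator's output by its component name.
print(result["em_evaluator"]["score"])   # 0.5
print(result["sas_evaluator"]["score"])  # ~0.76; the exact value depends on the SAS model
```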