---
title: "Data Classes"
id: data-classes
slug: "/data-classes"
description: "In Haystack, there are a handful of core classes that are regularly used in many different places. These are classes that carry data through the system and you are likely to interact with these as either the input or output of your pipeline."
---
# Data Classes
In Haystack, a handful of core classes are used in many different places. These classes carry data through the system, and you are likely to interact with them as the input or output of your pipeline.

Haystack uses data classes to help components communicate with each other in a simple and modular way, so that data flows cleanly through pipelines. This page covers the data classes available in Haystack: `Answer` (with its variants `ExtractedAnswer` and `GeneratedAnswer`), `ByteStream`, `ChatMessage`, `Document`, `StreamingChunk` (with the related `ToolCallDelta` and `ComponentInfo`), `SparseEmbedding`, and `Tool`.

You can check out the detailed parameters in our [Data Classes](/reference/data-classes-api) API reference.
### Answer
#### Overview
The `Answer` class serves as the base for responses generated within Haystack, containing the answer's data, the originating query, and additional metadata.
#### Key Features
- Adaptable data handling, accommodating any data type (`data`).
- Query tracking for contextual relevance (`query`).
- Extensive metadata support for detailed answer description.
#### Attributes
```python
@dataclass(frozen=True)
class Answer:
    data: Any
    query: str
    meta: Dict[str, Any]
```
### ExtractedAnswer
#### Overview
`ExtractedAnswer` is a subclass of `Answer` that deals explicitly with answers derived from Documents, offering more detailed attributes.
#### Key Features
- Includes reference to the originating `Document`.
- Score attribute to quantify the answer's confidence level.
- Optional start and end indices for pinpointing answer location within the source.
#### Attributes
```python
@dataclass
class ExtractedAnswer:
    query: str
    score: float
    data: Optional[str] = None
    document: Optional[Document] = None
    context: Optional[str] = None
    document_offset: Optional["Span"] = None
    context_offset: Optional["Span"] = None
    meta: Dict[str, Any] = field(default_factory=dict)
```
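#### Example

To illustrate how the offset attributes can pinpoint an answer inside its source text, here is a minimal, standard-library-only sketch. The `Span` class and the `text_at` helper below are hypothetical stand-ins written for this illustration, not Haystack's own definitions, and treating `end` as exclusive (like a Python slice) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical stand-in for the `Span` type referenced above."""

    start: int
    end: int  # assumed to be exclusive, like a Python slice


def text_at(content: str, span: Optional[Span]) -> Optional[str]:
    """Return the substring that `span` points at, or None if there is no span."""
    if span is None:
        return None
    return content[span.start:span.end]


content = "Haystack is an open source framework by deepset."
answer = "deepset"

# Locate the answer in the source text and record where it starts and ends.
start = content.index(answer)
document_offset = Span(start=start, end=start + len(answer))

print(text_at(content, document_offset))
```

Storing offsets next to the answer text lets a consumer highlight the answer in its original context without searching for it again.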
### GeneratedAnswer
#### Overview
`GeneratedAnswer` extends the `Answer` class to accommodate answers generated from multiple Documents.
#### Key Features
- Handles string-type data.
- Links to a list of `Document` objects, enhancing answer traceability.
#### Attributes
```python
@dataclass
class GeneratedAnswer:
    data: str
    query: str
    documents: List[Document]
    meta: Dict[str, Any] = field(default_factory=dict)
```
### ByteStream
#### Overview
`ByteStream` is Haystack's abstraction for binary objects and is used to handle binary data in various formats.
#### Key Features
- Holds binary data and associated metadata.
- Optional MIME type specification for flexibility.
- File interaction methods (`to_file`, `from_file_path`, `from_string`) for easy data manipulation.
#### Attributes
```python
@dataclass(frozen=True)
class ByteStream:
    data: bytes
    metadata: Dict[str, Any] = field(default_factory=dict, hash=False)
    mime_type: Optional[str] = field(default=None)
```
#### Example
```python
from haystack.dataclasses.byte_stream import ByteStream
image = ByteStream.from_file_path("dog.jpg")
```
### ChatMessage
`ChatMessage` is the central abstraction for representing a message for an LLM. It contains a role, metadata, and several types of content, including text, tool calls, and tool call results.

Read the detailed documentation for the `ChatMessage` data class on the dedicated [ChatMessage](doc:chatmessage) page.
### Document
#### Overview
`Document` represents a central data abstraction in Haystack, capable of holding text, tables, and binary data.
#### Key Features
- Unique ID for each document.
- Multiple content types are supported: text, binary (`blob`).
- Custom metadata and scoring for advanced document management.
- Optional embedding for AI-based applications.
#### Attributes
```python
@dataclass
class Document(metaclass=_BackwardCompatible):
    id: str = field(default="")
    content: Optional[str] = field(default=None)
    blob: Optional[ByteStream] = field(default=None)
    meta: Dict[str, Any] = field(default_factory=dict)
    score: Optional[float] = field(default=None)
    embedding: Optional[List[float]] = field(default=None)
    sparse_embedding: Optional[SparseEmbedding] = field(default=None)
```
#### Example
```python
from haystack import Document
# A single document with text content and a 768-dimensional placeholder embedding
document = Document(content="Here are the contents of your document", embedding=[0.1] * 768)
```
### StreamingChunk
#### Overview
`StreamingChunk` represents a partially streamed LLM response, enabling real-time LLM response processing. It encapsulates a segment of streamed content along with associated metadata and provides comprehensive information about the streaming state.
#### Key Features
- String-based content representation for text chunks
- Support for tool calls and tool call results
- Component tracking and metadata management
- Streaming state indicators (start, finish reason)
- Content block indexing for multi-part responses
#### Attributes
```python
@dataclass
class StreamingChunk:
    content: str
    meta: dict[str, Any] = field(default_factory=dict, hash=False)
    component_info: Optional[ComponentInfo] = field(default=None)
    index: Optional[int] = field(default=None)
    tool_calls: Optional[list[ToolCallDelta]] = field(default=None)
    tool_call_result: Optional[ToolCallResult] = field(default=None)
    start: bool = field(default=False)
    finish_reason: Optional[FinishReason] = field(default=None)
```
#### Example
```python
from haystack.dataclasses.streaming_chunk import StreamingChunk, ToolCallDelta

# Basic text chunk
chunk = StreamingChunk(
    content="Hello world",
    start=True,
    meta={"model": "gpt-3.5-turbo"},
)

# Tool call chunk: `content` has no default, so pass an empty string
tool_chunk = StreamingChunk(
    content="",
    tool_calls=[ToolCallDelta(index=0, tool_name="calculator", arguments='{"operation": "add", "a": 2, "b": 3}')],
    index=0,
    start=False,
    finish_reason="tool_calls",
)
```
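To show how a consumer might process chunks in real time, here is a minimal sketch of a callback-style collector that assembles streamed text. It relies only on the standard library: `Chunk` is a hypothetical stand-in carrying a subset of the `StreamingChunk` fields listed above, and `TextCollector` is an illustrative name, not part of Haystack.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in with a subset of the StreamingChunk fields."""

    content: str
    start: bool = False
    finish_reason: Optional[str] = None


class TextCollector:
    """Callable that concatenates chunk content as it arrives."""

    def __init__(self) -> None:
        self.parts: List[str] = []
        self.finish_reason: Optional[str] = None

    def __call__(self, chunk: Chunk) -> None:
        # Keep every text fragment in arrival order.
        self.parts.append(chunk.content)
        # Remember why the stream ended once the final chunk reports it.
        if chunk.finish_reason is not None:
            self.finish_reason = chunk.finish_reason

    @property
    def text(self) -> str:
        return "".join(self.parts)


collector = TextCollector()
for chunk in [
    Chunk(content="Hello", start=True),
    Chunk(content=" world"),
    Chunk(content="", finish_reason="stop"),
]:
    collector(chunk)

print(collector.text)
print(collector.finish_reason)
```

A callable object is used here instead of a plain function so that the collected state stays available after the stream ends.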
### ToolCallDelta
#### Overview
`ToolCallDelta` represents a tool call prepared by the model, usually contained in an assistant message during streaming.
#### Attributes
```python
@dataclass
class ToolCallDelta:
    index: int
    tool_name: Optional[str] = field(default=None)
    arguments: Optional[str] = field(default=None)
    id: Optional[str] = field(default=None)
```
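#### Example

During streaming, a single tool call may arrive spread over several deltas. The sketch below shows one way a consumer could merge them, assuming that the provider sends `arguments` as partial JSON string fragments and that deltas sharing an `index` belong to the same call. `Delta` and `merge_deltas` are hypothetical names for this illustration and do not reflect Haystack's own aggregation logic.

```python
import json
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Delta:
    """Hypothetical stand-in with the same fields as ToolCallDelta above."""

    index: int
    tool_name: Optional[str] = None
    arguments: Optional[str] = None
    id: Optional[str] = None


def merge_deltas(deltas: Iterable[Delta]) -> Dict[int, dict]:
    """Group deltas by index and concatenate their argument fragments."""
    calls: Dict[int, dict] = {}
    for delta in deltas:
        call = calls.setdefault(
            delta.index, {"id": None, "tool_name": None, "arguments": ""}
        )
        if delta.id is not None:
            call["id"] = delta.id
        if delta.tool_name is not None:
            call["tool_name"] = delta.tool_name
        if delta.arguments is not None:
            call["arguments"] += delta.arguments
    return calls


deltas = [
    Delta(index=0, tool_name="calculator", id="call_1"),
    Delta(index=0, arguments='{"operation": "add", '),
    Delta(index=0, arguments='"a": 2, "b": 3}'),
]

merged = merge_deltas(deltas)
# The concatenated fragments only form valid JSON once the call is complete.
arguments = json.loads(merged[0]["arguments"])
print(merged[0]["tool_name"], arguments)
```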
### ComponentInfo
#### Overview
The `ComponentInfo` class represents information about a component within a Haystack pipeline. It is used to track the type and name of components that generate or process data, aiding in debugging, tracing, and metadata management throughout the pipeline.
#### Key Features
- Stores the type of the component (including module and class name).
- Optionally stores the name assigned to the component in the pipeline.
- Provides a convenient class method to create a `ComponentInfo` instance from a `Component` object.
#### Attributes
```python
@dataclass
class ComponentInfo:
    type: str
    name: Optional[str] = field(default=None)

    @classmethod
    def from_component(cls, component: Component) -> "ComponentInfo":
        ...
```
#### Example
```python
from haystack.dataclasses.streaming_chunk import ComponentInfo
from haystack.core.component import Component

class MyComponent(Component):
    ...

component = MyComponent()
info = ComponentInfo.from_component(component)
print(info.type)  # e.g., 'my_module.MyComponent'
print(info.name)  # Name assigned in the pipeline, if any
```
### SparseEmbedding
#### Overview
The `SparseEmbedding` class represents a sparse embedding: a vector where most values are zeros.
#### Attributes
- `indices`: List of indices of non-zero elements in the embedding.
- `values`: List of values of non-zero elements in the embedding.
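#### Example

To make the relationship between `indices` and `values` concrete, the following standard-library sketch converts a mostly-zero dense vector into the two lists and back. `SparseVector`, `to_sparse`, and `to_dense` are hypothetical names used only for illustration; they are not the Haystack class or its methods.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVector:
    """Hypothetical stand-in with the two attributes listed above."""

    indices: List[int]
    values: List[float]


def to_sparse(dense: List[float]) -> SparseVector:
    """Keep only the non-zero entries, remembering their positions."""
    pairs = [(i, v) for i, v in enumerate(dense) if v != 0.0]
    return SparseVector(
        indices=[i for i, _ in pairs],
        values=[v for _, v in pairs],
    )


def to_dense(sparse: SparseVector, size: int) -> List[float]:
    """Rebuild the full vector by writing each value back at its index."""
    dense = [0.0] * size
    for i, v in zip(sparse.indices, sparse.values):
        dense[i] = v
    return dense


dense = [0.0, 0.0, 0.7, 0.0, 0.2, 0.0]
sparse = to_sparse(dense)

print(sparse.indices)
print(sparse.values)
```

Only two positions and two values are stored for this six-element vector. The saving grows with the share of zeros in the vector.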
### Tool
`Tool` is a data class representing a tool that language models can prepare a call for.

Read the detailed documentation for the `Tool` data class on the dedicated [Tool](../tools/tool.mdx) page.