Daria Fokina 510d063612
style(docs): params as inline code (#10017)
* params as inline code

* more params

* even more params

* last params
2025-11-05 14:49:38 +01:00

79 lines
3.8 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: "JsonSchemaValidator"
id: jsonschemavalidator
slug: "/jsonschemavalidator"
description: "Use this component to ensure that an LLM-generated chat message JSON adheres to a specific schema."
---
# JsonSchemaValidator
Use this component to ensure that an LLM-generated chat message JSON adheres to a specific schema.
<div className="key-value-table">
| | |
| --- | --- |
| **Most common position in a pipeline** | After a [Generator](../generators.mdx) |
| **Mandatory run variables** | `messages`: A list of [`ChatMessage`](../../concepts/data-classes/chatmessage.mdx) instances to be validated  the last message in this list is the one that is validated |
| **Output variables** | `validated`: A list of messages if the last message is valid <br /> <br />`validation_error`: A list of messages if the last message is invalid |
| **API reference** | [Validators](/reference/validators-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/validators/json_schema.py |
</div>
## Overview
`JsonSchemaValidator` checks the JSON content of a `ChatMessage` against a given [JSON Schema](https://json-schema.org/). If a message's JSON content follows the provided schema, it's moved to the `validated` output. If not, it's moved to the `validation_error`output. When there's an error, the component uses either the provided custom `error_template` or a default template to create the error message. These error `ChatMessages` can be used in Haystack recovery loops.
## Usage
### In a pipeline
In this simple pipeline, the `MessageProducer` sends a list of chat messages to a Generator through `BranchJoiner`. The resulting messages from the Generator are sent to `JsonSchemaValidator`, and the error `ChatMessages` are sent back to `BranchJoiner` for a recovery loop.
```python
from typing import List
from haystack import Pipeline
from haystack import component
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.joiners import BranchJoiner
from haystack.components.validators import JsonSchemaValidator
from haystack.dataclasses import ChatMessage
@component
class MessageProducer:
@component.output_types(messages=List[ChatMessage])
def run(self, messages: List[ChatMessage]) -> dict:
return {"messages": messages}
p = Pipeline()
p.add_component("llm", OpenAIChatGenerator(model="gpt-4-1106-preview",
generation_kwargs={"response_format": {"type": "json_object"}}))
p.add_component("schema_validator", JsonSchemaValidator())
p.add_component("branch_joiner", BranchJoiner(List[ChatMessage]))
p.add_component("message_producer", MessageProducer())
p.connect("message_producer.messages", "branch_joiner")
p.connect("branch_joiner", "llm")
p.connect("llm.replies", "schema_validator.messages")
p.connect("schema_validator.validation_error", "branch_joiner")
result = p.run(
data={"message_producer": {
"messages": [ChatMessage.from_user("Generate JSON for person with name 'John' and age 30")]},
"schema_validator": {"json_schema": {"type": "object",
"properties": {"name": {"type": "string"},
"age": {"type": "integer"}}}}})
print(result)
>> {'schema_validator': {'validated': [ChatMessage(_role=<ChatRole.ASSISTANT:
>> 'assistant'>, _content=[TextContent(text='\n{\n "name": "John",\n "age": 30\n}')],
>> _name=None, _meta={'model': 'gpt-4-1106-preview', 'index': 0, 'finish_reason': 'stop',
>> 'usage': {'completion_tokens': 17, 'prompt_tokens': 20, 'total_tokens': 37,
>> 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0,
>> 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details':
>> {'audio_tokens': 0, 'cached_tokens': 0}}})]}}
```