---
title: "HuggingFaceLocalGenerator"
id: huggingfacelocalgenerator
slug: "/huggingfacelocalgenerator"
description: "`HuggingFaceLocalGenerator` provides an interface to generate text using a Hugging Face model that runs locally."
---
# HuggingFaceLocalGenerator
`HuggingFaceLocalGenerator` provides an interface to generate text using a Hugging Face model that runs locally.
| | |
| :------------------------------------- | :------------------------------------------------------------------------------------------------------ |
| **Most common position in a pipeline** | After a [`PromptBuilder`](../builders/promptbuilder.mdx) |
| **Mandatory init variables**            | "token": The Hugging Face API token. Can be set with `HF_API_TOKEN` or `HF_TOKEN` env var.               |
| **Mandatory run variables**             | "prompt": A string containing the prompt for the LLM                                                     |
| **Output variables**                    | "replies": A list of strings with all the replies generated by the LLM                                   |
| **API reference** | [Generators](/reference/generators-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/hugging_face_local.py |
## Overview
Keep in mind that running LLMs locally may require a powerful machine, depending strongly on the model you select and its parameter count.
:::note
Looking for chat completion?
This component is designed for text generation, not for chat. If you want to use Hugging Face LLMs for chat, consider using [`HuggingFaceLocalChatGenerator`](/docs/huggingfacelocalchatgenerator) instead.
:::
To authorize access to remote files, this component uses the `HF_API_TOKEN` environment variable by default. Alternatively, you can pass a Hugging Face API token at initialization with `token`:
```python
from haystack.utils import Secret
from haystack.components.generators import HuggingFaceLocalGenerator

local_generator = HuggingFaceLocalGenerator(token=Secret.from_token("<your-api-key>"))
```
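If you prefer not to pass the raw token in code, here is a minimal sketch of reading it from an environment variable instead (assuming `HF_API_TOKEN` is exported in your shell):
```python
from haystack.utils import Secret
from haystack.components.generators import HuggingFaceLocalGenerator

# Resolve the token from the HF_API_TOKEN environment variable at runtime
local_generator = HuggingFaceLocalGenerator(token=Secret.from_env_var("HF_API_TOKEN"))
```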
### Streaming
This Generator supports [streaming](/docs/choosing-the-right-generator#streaming-support) the tokens from the LLM directly in its output. To enable it, pass a callable to the `streaming_callback` init parameter.
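For example, here is a minimal sketch that uses Haystack's `print_streaming_chunk` utility as the callback (any callable accepting a `StreamingChunk` works); the model and task are assumed to match the examples below:
```python
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.components.generators.utils import print_streaming_chunk

# Print each chunk of generated text to stdout as it is produced
generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    streaming_callback=print_streaming_chunk,
)
generator.warm_up()
generator.run("What language is spoken in France?")
```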
## Usage
### On its own
```python
from haystack.components.generators import HuggingFaceLocalGenerator

generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    generation_kwargs={
        "max_new_tokens": 100,
        "temperature": 0.9,
    },
)

generator.warm_up()

print(generator.run("Who is the best American actor?"))
## {'replies': ['john wayne']}
```
### In a Pipeline
```python
from haystack import Document, Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

docstore = InMemoryDocumentStore()
docstore.write_documents([
    Document(content="Rome is the capital of Italy"),
    Document(content="Paris is the capital of France"),
])

generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    generation_kwargs={
        "max_new_tokens": 100,
        "temperature": 0.9,
    },
)

query = "What is the capital of France?"

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}?
"""

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

res = pipe.run({
    "prompt_builder": {
        "query": query
    },
    "retriever": {
        "query": query
    }
})

print(res)
```
## Additional References
:cook: Cookbooks:
- [Use Zephyr 7B Beta with Hugging Face for RAG](https://haystack.deepset.ai/cookbook/zephyr-7b-beta-for-rag)
- [Information Extraction with Gorilla](https://haystack.deepset.ai/cookbook/information-extraction-gorilla)
- [RAG on the Oscars using Llama 3.1 models](https://haystack.deepset.ai/cookbook/llama3_rag)
- [Agentic RAG with Llama 3.2 3B](https://haystack.deepset.ai/cookbook/llama32_agentic_rag)