---
title: "HuggingFaceLocalGenerator"
id: huggingfacelocalgenerator
slug: "/huggingfacelocalgenerator"
description: "`HuggingFaceLocalGenerator` provides an interface to generate text using a Hugging Face model that runs locally."
---
# HuggingFaceLocalGenerator
`HuggingFaceLocalGenerator` provides an interface to generate text using a Hugging Face model that runs locally.

|                                        |                                                                                                        |
| :------------------------------------- | :----------------------------------------------------------------------------------------------------- |
| **Most common position in a pipeline** | After a [`PromptBuilder`](../builders/promptbuilder.mdx)                                                 |
| **Mandatory init variables**           | "token": The Hugging Face API token. Can be set with `HF_API_TOKEN` or `HF_TOKEN` env var.               |
| **Mandatory run variables**            | "prompt": A string containing the prompt for the LLM                                                     |
| **Output variables**                   | "replies": A list of strings with all the replies generated by the LLM                                   |
| **API reference**                      | [Generators](/reference/generators-api)                                                                  |
| **GitHub link**                        | https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/hugging_face_local.py    |
## Overview
Keep in mind that running LLMs locally may require a powerful machine, depending strongly on the model you select and its parameter count.

:::note
Looking for chat completion?

This component is designed for text generation, not for chat. If you want to use Hugging Face LLMs for chat, consider using [`HuggingFaceLocalChatGenerator`](/docs/huggingfacelocalchatgenerator) instead.
:::

For remote file authorization, this component uses the `HF_API_TOKEN` environment variable by default. Otherwise, you can pass a Hugging Face API token at initialization with `token`:

```python
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.utils import Secret

local_generator = HuggingFaceLocalGenerator(token=Secret.from_token("<your-api-key>"))
```
### Streaming

This Generator supports [streaming](/docs/choosing-the-right-generator#streaming-support) the tokens generated by the LLM directly in the output. To do so, pass a function to the `streaming_callback` init parameter.
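For example, here is a minimal sketch that streams tokens to standard output; it assumes the `print_streaming_chunk` utility shipped with Haystack, but any callable that accepts a `StreamingChunk` works as well:

```python
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.components.generators.utils import print_streaming_chunk

# Each generated chunk is passed to the callback as soon as it is produced.
generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    streaming_callback=print_streaming_chunk,
)

generator.warm_up()
generator.run("What is the capital of France?")
```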
## Usage
### On its own

```python
from haystack.components.generators import HuggingFaceLocalGenerator

generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    generation_kwargs={
        "max_new_tokens": 100,
        "temperature": 0.9,
    },
)

generator.warm_up()

print(generator.run("Who is the best American actor?"))
## {'replies': ['john wayne']}
```
### In a Pipeline

```python
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack import Document

docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="Rome is the capital of Italy"), Document(content="Paris is the capital of France")])

generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    generation_kwargs={
        "max_new_tokens": 100,
        "temperature": 0.9,
    },
)

query = "What is the capital of France?"

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}?
"""

pipe = Pipeline()

pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

res = pipe.run({
    "prompt_builder": {
        "query": query
    },
    "retriever": {
        "query": query
    }
})

print(res)
```
## Additional References

:cook: Cookbooks:

- [Use Zephyr 7B Beta with Hugging Face for RAG](https://haystack.deepset.ai/cookbook/zephyr-7b-beta-for-rag)
- [Information Extraction with Gorilla](https://haystack.deepset.ai/cookbook/information-extraction-gorilla)
- [RAG on the Oscars using Llama 3.1 models](https://haystack.deepset.ai/cookbook/llama3_rag)
- [Agentic RAG with Llama 3.2 3B](https://haystack.deepset.ai/cookbook/llama32_agentic_rag)