---
title: "NvidiaGenerator"
id: nvidiagenerator
slug: "/nvidiagenerator"
description: "This Generator enables text generation using Nvidia-hosted models."
---

# NvidiaGenerator

This Generator enables text generation using Nvidia-hosted models.

<div className="key-value-table">

| | |
| --- | --- |
| **Most common position in a pipeline** | After a [`PromptBuilder`](../builders/promptbuilder.mdx) |
| **Mandatory init variables** | `api_key`: API key for the NVIDIA NIM. Can be set with the `NVIDIA_API_KEY` env var. |
| **Mandatory run variables** | `prompt`: A string containing the prompt for the LLM |
| **Output variables** | `replies`: A list of strings with all the replies generated by the LLM <br /> <br />`meta`: A list of dictionaries with the metadata associated with each reply, such as token count and others |
| **API reference** | [Nvidia](/reference/integrations-nvidia) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/nvidia |

</div>

## Overview

The `NvidiaGenerator` provides an interface for generating text using LLMs self-hosted with NVIDIA NIM or models hosted on the [NVIDIA API catalog](https://build.nvidia.com/explore/discover).

## Usage

To start using `NvidiaGenerator`, first install the `nvidia-haystack` package:

```shell
pip install nvidia-haystack
```

You can use the `NvidiaGenerator` with all the LLMs available in the [NVIDIA API catalog](https://docs.api.nvidia.com/nim/reference) or with a model deployed with NVIDIA NIM. Follow the [NVIDIA NIM for LLMs Playbook](https://developer.nvidia.com/docs/nemo-microservices/inference/playbooks/nmi_playbook.html) to learn how to deploy your desired model on your infrastructure.

### On its own

To use LLMs from the NVIDIA API catalog, you need to specify the correct `api_url` and your API key. You can get your API key directly from the [catalog website](https://build.nvidia.com/explore/discover).

The `NvidiaGenerator` needs an NVIDIA API key to work. It uses the `NVIDIA_API_KEY` environment variable by default. Otherwise, you can pass an API key at initialization with `api_key`, as in the following example.

```python
from haystack.utils.auth import Secret
from haystack_integrations.components.generators.nvidia import NvidiaGenerator

generator = NvidiaGenerator(
    model="meta/llama3-70b-instruct",
    api_url="https://integrate.api.nvidia.com/v1",
    api_key=Secret.from_token("<your-api-key>"),
    model_arguments={
        "temperature": 0.2,
        "top_p": 0.7,
        "max_tokens": 1024,
    },
)
generator.warm_up()

result = generator.run(prompt="What is the answer?")
print(result["replies"])
print(result["meta"])
```
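
If the `NVIDIA_API_KEY` environment variable is already set, you can also rely on the default and omit `api_key` entirely. A minimal sketch of that variant:

```python
from haystack_integrations.components.generators.nvidia import NvidiaGenerator

# Assumes NVIDIA_API_KEY is exported in the environment,
# for example: export NVIDIA_API_KEY="<your-api-key>"
generator = NvidiaGenerator(
    model="meta/llama3-70b-instruct",
    api_url="https://integrate.api.nvidia.com/v1",
)
generator.warm_up()

result = generator.run(prompt="What is the answer?")
print(result["replies"])
```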

To use a locally deployed model, set the `api_url` to your local NIM endpoint and pass `api_key=None`.

```python
from haystack_integrations.components.generators.nvidia import NvidiaGenerator

generator = NvidiaGenerator(
    model="llama-2-7b",
    api_url="http://0.0.0.0:9999/v1",
    api_key=None,
    model_arguments={
        "temperature": 0.2,
    },
)
generator.warm_up()

result = generator.run(prompt="What is the answer?")
print(result["replies"])
print(result["meta"])
```
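
In both cases, `run()` returns a dictionary with a `replies` key (a list of strings) and a `meta` key (a list of dictionaries, one per reply). A small sketch of unpacking the first answer:

```python
# One entry in "replies" and "meta" per generated reply.
first_reply = result["replies"][0]
first_meta = result["meta"][0]
print(first_reply, first_meta)
```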

### In a Pipeline

Here's an example of a RAG pipeline:

```python
from haystack import Pipeline, Document
from haystack.utils.auth import Secret
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.generators.nvidia import NvidiaGenerator

docstore = InMemoryDocumentStore()
docstore.write_documents([
    Document(content="Rome is the capital of Italy"),
    Document(content="Paris is the capital of France"),
])

query = "What is the capital of France?"

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}
"""

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", NvidiaGenerator(
    model="meta/llama3-70b-instruct",
    api_url="https://integrate.api.nvidia.com/v1",
    api_key=Secret.from_token("<your-api-key>"),
    model_arguments={
        "temperature": 0.2,
        "top_p": 0.7,
        "max_tokens": 1024,
    },
))
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

res = pipe.run({
    "prompt_builder": {"query": query},
    "retriever": {"query": query},
})

print(res)
```
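
A pipeline's output is keyed by component name, so with the names used above, the generated answer is expected under the `llm` key. A sketch of reading it:

```python
# "llm" is the NvidiaGenerator component added above;
# its output exposes the usual "replies" and "meta" lists.
print(res["llm"]["replies"][0])
```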

## Additional References

🧑‍🍳 Cookbook: [Haystack RAG Pipeline with Self-Deployed AI models using NVIDIA NIMs](https://haystack.deepset.ai/cookbook/rag-with-nims)