---
title: "PineconeDocumentStore"
id: pinecone-document-store
slug: "/pinecone-document-store"
description: "Use a Pinecone vector database with Haystack."
---

# PineconeDocumentStore

Use a Pinecone vector database with Haystack.

| | |
| :------------ | :----------------------------------------------------------------------------------------- |
| API reference | [Pinecone](/reference/integrations-pinecone) |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/pinecone |

[Pinecone](https://www.pinecone.io/) is a cloud-based vector database that is fast and easy to use.
Unlike other solutions such as Qdrant and Weaviate, it cannot run locally on your machine, but it offers a generous free tier.

### Installation

Install the Pinecone integration for Haystack with:

```shell
pip install pinecone-haystack
```

### Initialization

- To use Pinecone as a Document Store in Haystack, sign up for a free Pinecone [account](https://app.pinecone.io/) and get your API key.
  You can pass the API key explicitly or let the Document Store read it from the `PINECONE_API_KEY` environment variable (recommended).
- In Haystack, each `PineconeDocumentStore` operates in a specific namespace of an index. If you don't provide them, both the index name and the namespace default to `default`.
  If the index already exists, the Document Store connects to it. Otherwise, it creates a new index.
- When creating a new index, you can provide a `spec` as a dictionary. It lets you choose between serverless and pod deployment and set additional parameters. Refer to the [Pinecone documentation](https://docs.pinecone.io/reference/api/control-plane/create_index) for more details. If you don't provide a `spec`, the Document Store uses a default serverless deployment in the `us-east-1` region (compatible with the free tier).
- You can provide `dimension` and `metric`, but they only take effect when the Pinecone index does not already exist.
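
The `spec` options described above can be sketched with a small helper. `build_index_spec` is a hypothetical name and not part of the integration. The serverless shape matches the default named above. The pod keys (`environment`, `pod_type`) and their sample values are assumptions based on Pinecone's create-index API, so check them against the Pinecone documentation before relying on them.

```python
from typing import Any, Dict


def build_index_spec(
    deployment: str = "serverless",
    *,
    cloud: str = "aws",
    region: str = "us-east-1",
    environment: str = "us-east1-gcp",  # assumed sample value for pod indexes
    pod_type: str = "p1.x1",  # assumed sample value for pod indexes
) -> Dict[str, Any]:
    """Return a `spec` dictionary for a new index (hypothetical helper)."""
    if deployment == "serverless":
        return {"serverless": {"cloud": cloud, "region": region}}
    if deployment == "pod":
        return {"pod": {"environment": environment, "pod_type": pod_type}}
    raise ValueError(f"Unknown deployment type: {deployment!r}")


print(build_index_spec())
print(build_index_spec("pod"))
```

The returned dictionary can be passed as the `spec` argument of `PineconeDocumentStore`. It only matters when the index does not exist yet.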

Then, you can use the Document Store like this:

```python
from haystack import Document
from haystack_integrations.document_stores.pinecone import PineconeDocumentStore

# Make sure you have the PINECONE_API_KEY environment variable set
document_store = PineconeDocumentStore(
    index="default",
    namespace="default",
    dimension=5,
    metric="cosine",
    spec={"serverless": {"region": "us-east-1", "cloud": "aws"}}
)

document_store.write_documents([
    Document(content="This is first", embedding=[0.0]*5),
    Document(content="This is second", embedding=[0.1, 0.2, 0.3, 0.4, 0.5])
])
print(document_store.count_documents())
```
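
Since `dimension` only applies when the index is created, embeddings written later need to match the dimension the index already has. A minimal sketch of a pre-write check follows. `find_dimension_mismatches` is a hypothetical helper, not part of the integration, and it only assumes that each document exposes `content` and `embedding` attributes, as Haystack's `Document` does.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def find_dimension_mismatches(documents: Iterable[Any], dimension: int) -> List[str]:
    """Return the content of every document whose embedding has the wrong length.

    Documents without an embedding are skipped rather than reported.
    """
    mismatches = []
    for doc in documents:
        embedding = getattr(doc, "embedding", None)
        if embedding is not None and len(embedding) != dimension:
            mismatches.append(doc.content)
    return mismatches


# Stand-ins for haystack.Document objects, so the sketch runs without Haystack installed
docs = [
    SimpleNamespace(content="This is first", embedding=[0.0] * 5),
    SimpleNamespace(content="This is second", embedding=[0.1, 0.2, 0.3]),
]
print(find_dimension_mismatches(docs, dimension=5))
```

Running a check like this before `write_documents` surfaces a wrong embedding model or a stale index early, instead of at upload time.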

### Supported Retrievers

[`PineconeEmbeddingRetriever`](/docs/pineconedenseretriever): Retrieves documents from the `PineconeDocumentStore` based on their dense embeddings (vectors).
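
To make retrieval "based on dense embeddings" concrete, here is a plain-Python sketch that ranks stored vectors by cosine similarity to a query vector. It illustrates the idea only and is not how Pinecone or `PineconeEmbeddingRetriever` is implemented. `cosine_similarity` and `rank_by_cosine` are hypothetical names, and treating a zero vector as similarity 0.0 is an assumption made for this sketch.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors with the same dimension (5) as the Document Store example
stored = {
    "first": [0.0] * 5,
    "second": [0.1, 0.2, 0.3, 0.4, 0.5],
}
results = rank_by_cosine([0.1, 0.2, 0.3, 0.4, 0.5], stored, top_k=2)
print(results)
```

In a real pipeline, the query vector comes from the same embedding model that produced the stored document embeddings, and the similarity search runs inside Pinecone.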