mirror of
https://github.com/deepset-ai/haystack.git
synced 2026-01-07 20:46:31 +00:00
* Update documentation and remove unused assets. Enhanced the 'agents' and 'components' sections with clearer descriptions and examples. Removed obsolete images and updated links for better navigation. Adjusted formatting for consistency across various documentation pages. * remove dependency * address comments * delete more empty pages * broken link * unduplicate headings * alphabetical components nav
62 lines
2.8 KiB
Plaintext
---
title: "PineconeDocumentStore"
id: pinecone-document-store
slug: "/pinecone-document-store"
description: "Use a Pinecone vector database with Haystack."
---

# PineconeDocumentStore

Use a Pinecone vector database with Haystack.

|  |  |
| --- | --- |
| API reference | [Pinecone](/reference/integrations-pinecone) |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/pinecone |

[Pinecone](https://www.pinecone.io/) is a fast, easy-to-use, cloud-based vector database.
Unlike other solutions (such as Qdrant and Weaviate), it cannot run locally on your machine, but it offers a generous free tier.

### Installation

Install the Pinecone integration for Haystack with:

```shell
pip install pinecone-haystack
```

### Initialization

- To use Pinecone as a Document Store in Haystack, sign up for a free Pinecone [account](https://app.pinecone.io/) and get your API key.
  You can pass the API key explicitly or let the Document Store read it from the `PINECONE_API_KEY` environment variable (recommended).
- In Haystack, each `PineconeDocumentStore` operates in a specific namespace of an index. If you don't specify them, both the index and the namespace are named `default`.
  If the index already exists, the Document Store connects to it. Otherwise, it creates a new index.
- When creating a new index, you can provide a `spec` as a dictionary to choose between serverless and pod deployment and to set additional parameters. Refer to the [Pinecone documentation](https://docs.pinecone.io/reference/api/control-plane/create_index) for more details. If you don't provide a `spec`, the Document Store uses a default serverless deployment in the `us-east-1` region (compatible with the free tier).
- You can provide `dimension` and `metric`, but they are used only when the index is created. They are ignored if the Pinecone index already exists.

Then, you can use the Document Store like this:

```python
from haystack import Document
from haystack_integrations.document_stores.pinecone import PineconeDocumentStore

# Make sure you have the PINECONE_API_KEY environment variable set
document_store = PineconeDocumentStore(
    index="default",
    namespace="default",
    dimension=5,
    metric="cosine",
    spec={"serverless": {"region": "us-east-1", "cloud": "aws"}},
)

document_store.write_documents([
    Document(content="This is first", embedding=[0.0]*5),
    Document(content="This is second", embedding=[0.1, 0.2, 0.3, 0.4, 0.5]),
])
print(document_store.count_documents())
```
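
The `spec` dictionary and the `dimension` setting described in the list above can be prepared with a few small helpers. The following is a minimal sketch in plain Python, not part of the integration: `serverless_spec`, `pod_spec`, and `check_dimension` are hypothetical helper names, and the pod values used here (`us-west1-gcp`, `p1.x1`) are illustrative assumptions. Check the Pinecone documentation for the environments and pod types available to your account.

```python
from typing import Any, Dict, Sequence


def serverless_spec(cloud: str = "aws", region: str = "us-east-1") -> Dict[str, Any]:
    """Build a serverless spec; the defaults mirror the example above."""
    return {"serverless": {"cloud": cloud, "region": region}}


def pod_spec(environment: str, pod_type: str = "p1.x1", pods: int = 1) -> Dict[str, Any]:
    """Build a pod-based spec (hypothetical helper, values are illustrative)."""
    return {"pod": {"environment": environment, "pod_type": pod_type, "pods": pods}}


def check_dimension(embeddings: Sequence[Sequence[float]], dimension: int) -> None:
    """Raise ValueError if any embedding does not match the index dimension."""
    for position, embedding in enumerate(embeddings):
        if len(embedding) != dimension:
            raise ValueError(
                f"Embedding {position} has length {len(embedding)}, expected {dimension}"
            )


print(serverless_spec())
print(pod_spec("us-west1-gcp"))

# Passes silently: both embeddings have 5 values, matching dimension=5.
check_dimension([[0.0] * 5, [0.1, 0.2, 0.3, 0.4, 0.5]], dimension=5)
```

Either dictionary can be passed as the `spec` argument of `PineconeDocumentStore`. Running a check like `check_dimension` before `write_documents` is optional, but it may surface a mismatch between your embeddings and the index dimension before any request is sent.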

### Supported Retrievers

[`PineconeEmbeddingRetriever`](../pipeline-components/retrievers/pineconedenseretriever.mdx): Retrieves documents from the `PineconeDocumentStore` based on their dense embeddings (vectors).
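
As a rough sketch of how the retriever might be used on its own, the helper below wraps a single query. `retrieve_similar` is a hypothetical name introduced for illustration. The sketch assumes the `pinecone-haystack` package is installed, the `PINECONE_API_KEY` environment variable is set, and the index already holds documents whose embeddings have the same dimension as the query.

```python
from typing import Any, List


def retrieve_similar(document_store: Any, query_embedding: List[float], top_k: int = 2) -> List[Any]:
    """Return up to `top_k` documents closest to `query_embedding`."""
    # Imported lazily so the sketch can be loaded without the integration installed.
    from haystack_integrations.components.retrievers.pinecone import PineconeEmbeddingRetriever

    retriever = PineconeEmbeddingRetriever(document_store=document_store, top_k=top_k)
    result = retriever.run(query_embedding=query_embedding)
    return result["documents"]


# Example usage (requires network access and a populated index):
#
#   from haystack_integrations.document_stores.pinecone import PineconeDocumentStore
#
#   store = PineconeDocumentStore(index="default", namespace="default", dimension=5)
#   for doc in retrieve_similar(store, [0.1, 0.2, 0.3, 0.4, 0.5]):
#       print(doc.content, doc.score)
```

In a full application, the query embedding would usually come from an embedder component that uses the same model as the one that produced the stored document embeddings.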