---
title: "QdrantDocumentStore"
id: qdrant-document-store
slug: "/qdrant-document-store"
description: "Use the Qdrant vector database with Haystack."
---

# QdrantDocumentStore

Use the Qdrant vector database with Haystack.

<div className="key-value-table">

| | |
| --- | --- |
| API reference | [Qdrant](/reference/integrations-qdrant) |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/qdrant |

</div>

Qdrant is a high-performance, massive-scale vector database. The `QdrantDocumentStore` works with any Qdrant instance: in-memory, locally persisted, hosted, or the official Qdrant Cloud.

### Installation

Install the Qdrant Haystack integration with:

```shell
pip install qdrant-haystack
```

### Initialization

The quickest way to use `QdrantDocumentStore` is to create an in-memory instance:

```python
from haystack.dataclasses.document import Document
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

document_store = QdrantDocumentStore(
    ":memory:",
    recreate_index=True,
    return_embedding=True,
    wait_result_from_api=True,
)
document_store.write_documents([
    Document(content="This is first", embedding=[0.0]*768),
    Document(content="This is second", embedding=[0.1]*768)
])
print(document_store.count_documents())
```

:::warning Collections Created Outside Haystack

When you create a `QdrantDocumentStore` instance, Haystack takes care of setting up the collection. In general, you cannot use a Qdrant collection that was created outside Haystack. If you want to migrate your existing collection, see the sample script at https://github.com/deepset-ai/haystack-core-integrations/blob/main/integrations/qdrant/src/haystack_integrations/document_stores/qdrant/migrate_to_sparse.py.
:::

You can also connect directly to [Qdrant Cloud](https://cloud.qdrant.io/login). Once you have your API key and cluster URL from the Qdrant dashboard, you can connect like this:

```python
from haystack.dataclasses.document import Document
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack.utils import Secret

document_store = QdrantDocumentStore(
    url="https://XXXXXXXXX.us-east4-0.gcp.cloud.qdrant.io:6333",
    index="your_index_name",
    embedding_dim=1024,  # based on the embedding model
    recreate_index=True,  # enable only to recreate the index and not connect to the existing one
    api_key=Secret.from_token("YOUR_TOKEN"),
)

# Each embedding must have exactly `embedding_dim` values (1024 here).
document_store.write_documents([
    Document(content="This is first", embedding=[0.0]*1024),
    Document(content="This is second", embedding=[0.1]*1024)
])
print(document_store.count_documents())
```

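Because the vector size is set through `embedding_dim` when the index is created, it can help to check embedding lengths before writing documents. The following is a minimal sketch in plain Python. `check_embedding_dims` is a hypothetical helper, not part of Haystack or the Qdrant integration, and it works on raw lists of floats rather than on `Document` objects.

```python
from typing import List, Sequence


def check_embedding_dims(
    embeddings: Sequence[Sequence[float]], expected_dim: int
) -> List[int]:
    """Return the positions of embeddings whose length differs from expected_dim."""
    return [i for i, emb in enumerate(embeddings) if len(emb) != expected_dim]


embeddings = [
    [0.0] * 1024,               # matches embedding_dim=1024
    [0.1, 0.2, 0.3, 0.4, 0.5],  # too short for a 1024-dimensional index
]

# Positions listed here should be fixed or dropped before writing to the store.
mismatched = check_embedding_dims(embeddings, expected_dim=1024)
print(mismatched)
```
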
:::tip More information

You can find more ways to initialize and use `QdrantDocumentStore` on our [integration page](https://haystack.deepset.ai/integrations/qdrant-document-store).
:::

### Supported Retrievers

- [`QdrantEmbeddingRetriever`](../pipeline-components/retrievers/qdrantembeddingretriever.mdx): Retrieves documents from the `QdrantDocumentStore` based on their dense embeddings (vectors).
- [`QdrantSparseEmbeddingRetriever`](../pipeline-components/retrievers/qdrantsparseembeddingretriever.mdx): Retrieves documents from the `QdrantDocumentStore` based on their sparse embeddings.
- [`QdrantHybridRetriever`](../pipeline-components/retrievers/qdranthybridretriever.mdx): Retrieves documents from the `QdrantDocumentStore` based on both dense and sparse embeddings.

:::note Sparse Embedding Support

To use sparse embeddings, initialize the `QdrantDocumentStore` with `use_sparse_embeddings=True`. The default is `False`.

If you want to use a Document Store or collection that was created with this feature disabled, you must migrate the existing data. You can do this with the `migrate_to_sparse_embeddings_support` utility function.
:::

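To make the difference between dense and sparse retrieval more concrete, here is a conceptual sketch in plain Python. It is not how Qdrant or the Haystack retrievers are implemented. The function names are hypothetical, cosine similarity is assumed as the dense metric, and sparse vectors are assumed to be stored as parallel lists of indices and values that are scored by dot product.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors score 0
    return dot / (norm_a * norm_b)


def sparse_dot(
    indices_a: List[int],
    values_a: List[float],
    indices_b: List[int],
    values_b: List[float],
) -> float:
    """Dot product of two sparse vectors given as (indices, values) pairs."""
    lookup: Dict[int, float] = dict(zip(indices_a, values_a))
    # Only dimensions present in both vectors contribute to the score.
    return sum(lookup.get(i, 0.0) * v for i, v in zip(indices_b, values_b))


# Dense: every dimension is stored and compared.
dense_score = cosine_similarity([1.0, 0.0], [1.0, 0.0])

# Sparse: only non-zero dimensions are stored; here only index 5 overlaps.
sparse_score = sparse_dot([0, 5], [0.5, 2.0], [5, 9], [3.0, 1.0])

print(dense_score, sparse_score)
```

A hybrid retriever draws on both kinds of score. How the two rankings are combined is up to the retriever and is not shown in this sketch.
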
## Additional References

🧑‍🍳 Cookbook: [Sparse Embedding Retrieval with Qdrant and FastEmbed](https://haystack.deepset.ai/cookbook/sparse_embedding_retrieval)