---
title: "PgvectorDocumentStore"
id: pgvectordocumentstore
slug: "/pgvectordocumentstore"
---

# PgvectorDocumentStore

| | |
| --- | --- |
| API reference | [Pgvector](/reference/integrations-pgvector) |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/pgvector/ |

Pgvector is an extension for PostgreSQL that adds vector similarity search. It builds on the classic features of PostgreSQL, such as ACID compliance and point-in-time recovery, and adds exact and approximate nearest neighbor search over vectors.

For more information, see the [pgvector repository](https://github.com/pgvector/pgvector).

Pgvector Document Store supports embedding retrieval, keyword retrieval, and metadata filtering.

## Installation

To quickly set up a PostgreSQL database with pgvector, you can use Docker:

```shell
docker run -d -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres -e POSTGRES_DB=postgres ankane/pgvector
```

For more information on installing pgvector, visit the [pgvector GitHub repository](https://github.com/pgvector/pgvector).

To use pgvector with Haystack, install the `pgvector-haystack` integration:

```shell
pip install pgvector-haystack
```

## Usage

### Connection String

Define the connection string to your PostgreSQL database in the `PG_CONN_STR` environment variable. Two formats are supported:

**URI format:**

```shell
export PG_CONN_STR="postgresql://USER:PASSWORD@HOST:PORT/DB_NAME"
```

**Keyword/value format:**

```shell
export PG_CONN_STR="host=HOST port=PORT dbname=DB_NAME user=USER password=PASSWORD"
```

:::caution Special Characters in Connection URIs
When using the URI format, special characters in the password must be [percent-encoded](https://en.wikipedia.org/wiki/Percent-encoding). Otherwise, the connection may fail. An unencoded password like `p=ssword` would cause the error `psycopg.OperationalError: [Errno -2] Name or service not known`.

For example, if your password is `p=ssword`, the connection string should be:

```shell
export PG_CONN_STR="postgresql://postgres:p%3Dssword@localhost:5432/postgres"
```

Alternatively, use the keyword/value format, which does not require percent-encoding:

```shell
export PG_CONN_STR="host=localhost port=5432 dbname=postgres user=postgres password=p=ssword"
```
:::

For more details, see the [PostgreSQL connection string documentation](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING).

## Initialization

Initialize a `PgvectorDocumentStore` connected to the PostgreSQL database and write documents to it:

```python
from haystack_integrations.document_stores.pgvector import PgvectorDocumentStore
from haystack import Document

# The connection string is read from the PG_CONN_STR environment variable by default.
document_store = PgvectorDocumentStore(
    embedding_dimension=768,  # must match the length of the document embeddings
    vector_function="cosine_similarity",
    recreate_table=True,  # drops and recreates the table if it already exists
    search_strategy="hnsw",
)

# Placeholder embeddings for illustration; use a Document Embedder for real data.
document_store.write_documents([
    Document(content="This is first", embedding=[0.1]*768),
    Document(content="This is second", embedding=[0.3]*768)
])
print(document_store.count_documents())
```

To learn more about the initialization parameters, see our [API docs](/reference/integrations-pgvector#pgvectordocumentstore).

To compute meaningful embeddings for your documents, use a Document Embedder (for instance, the [`SentenceTransformersDocumentEmbedder`](../pipeline-components/embedders/sentencetransformersdocumentembedder.mdx)).

### Supported Retrievers

- [`PgvectorEmbeddingRetriever`](../pipeline-components/retrievers/pgvectorembeddingretriever.mdx): An embedding-based Retriever that fetches documents from the Document Store based on a query embedding provided to the Retriever.
- [`PgvectorKeywordRetriever`](../pipeline-components/retrievers/pgvectorkeywordretriever.mdx): A keyword-based Retriever that fetches documents matching a query from the Pgvector Document Store.
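
### Percent-Encoding a Password in Python

The percent-encoding requirement described under Connection String can be handled programmatically instead of by hand. The following is a minimal sketch using only the Python standard library; `build_pg_uri` is a hypothetical helper written for this illustration and is not part of `pgvector-haystack`.

```python
from urllib.parse import quote


def build_pg_uri(user, password, host, port, db_name):
    """Hypothetical helper: build a PostgreSQL URI with an encoded user and password."""
    # safe="" makes quote() encode every reserved character,
    # including "/", "@", ":" and "=".
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"postgresql://{encoded_user}:{encoded_password}@{host}:{port}/{db_name}"


# Same values as in the caution under Connection String.
print(build_pg_uri("postgres", "p=ssword", "localhost", 5432, "postgres"))
# postgresql://postgres:p%3Dssword@localhost:5432/postgres
```

The resulting string can be exported as `PG_CONN_STR`, or assigned to `os.environ["PG_CONN_STR"]` before the `PgvectorDocumentStore` is created.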