mirror of https://github.com/Unstructured-IO/unstructured.git synced 2025-07-09 01:55:55 +00:00

docs: example of how to use unstructured with pgvector (#571)

* pgvector requirements

* first pass on pgvector notebook and sql alchemy file

* created code for loading vectors into db

* added query for embedding distance

* updates to pgvector notebook

* update function with time decay

* update pgvector notebook to use example code

* remove old create table script

* add readme for pgvector

* update example to use get_date()

2023-05-12 13:54:38 -04:00


Loading `unstructured` outputs into Postgres with `pgvector`

The following example shows how to load `unstructured` output into Postgres with the `pgvector` extension installed. Combining the similarity search functionality of `pgvector` with the traditional RDBMS capabilities of Postgres allows users to perform similarity searches that are conditioned on metadata or biased toward more recent documents. Use cases include document discovery and more sophisticated retrieval augmented generation for LLMs. The langchain docs have more information about retrieval augmented generation.
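As a rough illustration of a search that is both conditioned on metadata and biased toward recent documents, the sketch below builds a parameterized SQL query around the `pgvector` L2 distance operator `<->`. It is not the code from the notebook: the table name `elements`, the columns `id`, `text`, `embedding`, `filetype`, and `created_at`, the helper name `build_similarity_query`, and the linear time-decay penalty are all assumptions made for this example. The placeholders use the `%(name)s` style accepted by `psycopg2`.

```python
from typing import Dict, List, Optional, Tuple


def build_similarity_query(
    query_embedding: List[float],
    filetype: Optional[str] = None,
    decay_per_day: float = 0.0,
    limit: int = 5,
) -> Tuple[str, Dict[str, object]]:
    """Return (sql, params) for a metadata-filtered, recency-biased search.

    Hypothetical schema: elements(id, text, embedding vector, filetype, created_at).
    """
    # pgvector accepts a text literal of the form '[0.1,0.2,...]' cast to vector.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    params: Dict[str, object] = {
        "embedding": vector_literal,
        "decay": decay_per_day,
        "limit": limit,
    }

    # Optional metadata condition, handled by ordinary SQL filtering.
    where = ""
    if filetype is not None:
        where = "WHERE filetype = %(filetype)s "
        params["filetype"] = filetype

    # Score = L2 distance + (assumed) linear penalty per day of document age.
    # Lower scores rank first, so older documents are pushed down the list.
    sql = (
        "SELECT id, text, "
        "(embedding <-> %(embedding)s::vector) "
        "+ %(decay)s * EXTRACT(EPOCH FROM (now() - created_at)) / 86400.0 AS score "
        "FROM elements "
        + where
        + "ORDER BY score ASC LIMIT %(limit)s"
    )
    return sql, params


sql, params = build_similarity_query(
    [0.1, 0.2, 0.3], filetype="text/html", decay_per_day=0.01
)
print(sql)
print(params)
```

The returned pair could be passed to a database cursor, for example `cursor.execute(sql, params)`. Setting `decay_per_day` to `0.0` reduces the score to plain embedding distance, and omitting `filetype` drops the metadata condition.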

Running the example

Install Postgres.
Install `pgvector`.
Run `pip install -r requirements.txt` to install the Python dependencies.
Run `jupyter-notebook` to start the notebook server.
Run the `pgvector.ipynb` notebook.
