PGVectorStore
PGVectorStore is an implementation of a LangChain vector store that uses Postgres as the backend.
This notebook goes over how to use the PGVectorStore API.
The code lives in an integration package called langchain-postgres.
Setup
This package requires a PostgreSQL database with the pgvector extension.
You can run the following command to start a container with a pgvector-enabled Postgres instance:
docker run --name pgvector-container -e POSTGRES_USER=langchain -e POSTGRES_PASSWORD=langchain -e POSTGRES_DB=langchain -p 6024:5432 -d pgvector/pgvector:pg16
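Once the container has started, it can be handy to confirm that something is listening on the mapped port before running the rest of the notebook. The following is a minimal sketch using only the Python standard library. The helper name port_is_open is hypothetical, and a successful TCP connection only shows that the port is reachable, not that Postgres has finished initializing.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False


# 6024 is the host port mapped by the docker command above.
print(port_is_open("localhost", 6024))
```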
Install
Install the integration library, langchain-postgres.
%pip install --upgrade --quiet langchain-postgres
# This notebook also requires the following dependencies
%pip install --upgrade --quiet langchain-core langchain-cohere sqlalchemy
Set your Postgres values
Set your Postgres values to test the functionality in this notebook against a Postgres instance.
# @title Set your values or use the defaults to connect to Docker { display-mode: "form" }
POSTGRES_USER = "langchain" # @param {type: "string"}
POSTGRES_PASSWORD = "langchain" # @param {type: "string"}
POSTGRES_HOST = "localhost" # @param {type: "string"}
POSTGRES_PORT = "6024" # @param {type: "string"}
POSTGRES_DB = "langchain" # @param {type: "string"}
TABLE_NAME = "vectorstore" # @param {type: "string"}
VECTOR_SIZE = 1024 # @param {type: "int"}
Initialization
PGEngine Connection Pool
PGVectorStore requires a PGEngine object, which is passed as an argument when the vector store is created. The PGEngine configures a shared connection pool to your Postgres database. Pooling is a widely used practice for limiting the number of open connections and for reducing latency by reusing cached database connections.
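To illustrate why pooling helps, here is a toy sketch of the idea. It is not how PGEngine is implemented; ToyConnection and ToyPool are hypothetical names, and no database is involved. It only shows that a fixed set of connections is created once and then reused across many operations.

```python
import queue


class ToyConnection:
    """Stand-in for an expensive database connection (hypothetical)."""

    created = 0  # counts how many connections were ever opened

    def __init__(self):
        ToyConnection.created += 1


class ToyPool:
    """Holds a fixed number of connections and hands them out for reuse."""

    def __init__(self, size: int):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(ToyConnection())

    def acquire(self) -> ToyConnection:
        # Blocks if every connection is in use, which caps concurrency.
        return self._idle.get()

    def release(self, conn: ToyConnection) -> None:
        self._idle.put(conn)


pool = ToyPool(size=2)
for _ in range(10):
    conn = pool.acquire()
    pool.release(conn)

# Ten operations ran, but only the two pooled connections were ever opened.
print(ToyConnection.created)
```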
PGVectorStore can be used with the asyncpg and psycopg3 drivers.
To create a PGEngine using PGEngine.from_connection_string() you need to provide:
url: Connection string using the postgresql+asyncpg driver.
Note: This tutorial demonstrates the async interface. All async methods have corresponding sync methods.
# See docker command above to launch a Postgres instance with pgvector enabled.
CONNECTION_STRING = (
    f"postgresql+asyncpg://{POSTGRES_USER}:{POSTGRES_PASSWORD}@{POSTGRES_HOST}"
    f":{POSTGRES_PORT}/{POSTGRES_DB}"
)
# To use psycopg3 driver, set your connection string to `postgresql+psycopg://`
from langchain_postgres import PGEngine
pg_engine = PGEngine.from_connection_string(url=CONNECTION_STRING)
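The default credentials contain only letters, so plain string formatting is enough here. If your password contains characters such as @ or /, they generally need to be percent-encoded before they go into a connection URL. The sketch below shows one way to do that with the standard library. The helper build_connection_string and the sample password are hypothetical and not part of langchain-postgres.

```python
from urllib.parse import quote_plus


def build_connection_string(
    user: str,
    password: str,
    host: str,
    port: str,
    database: str,
    driver: str = "asyncpg",
) -> str:
    """Assemble a SQLAlchemy-style URL, percent-encoding the credentials."""
    return (
        f"postgresql+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical password containing characters that are special in URLs.
url = build_connection_string("langchain", "p@ss/word", "localhost", "6024", "langchain")
print(url)
```

Passing driver="psycopg" produces a URL for the psycopg3 driver instead.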
To create a PGEngine using PGEngine.from_engine() you need to provide:
engine: An object of AsyncEngine
from sqlalchemy.ext.asyncio import create_async_engine
# Create an SQLAlchemy Async Engine
engine = create_async_engine(
    CONNECTION_STRING,
)
pg_engine = PGEngine.from_engine(engine=engine)
Initialize a table
The PGVectorStore class requires a database table. PGEngine has a helper method, ainit_vectorstore_table(), that creates a table with the proper schema for you.
See Create a custom Vector Store or Create a Vector Store using existing table for customizing the schema.
await pg_engine.ainit_vectorstore_table(
    table_name=TABLE_NAME,
    vector_size=VECTOR_SIZE,
)
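The top-level await above works in a notebook because a notebook already runs an event loop. In a plain Python script, the same call would need to be wrapped in a coroutine and started with asyncio.run(). The sketch below shows that pattern. StubEngine is a hypothetical stand-in so the example runs without a database; with the real library you would pass the pg_engine created earlier instead.

```python
import asyncio


class StubEngine:
    """Hypothetical stand-in for PGEngine; it only records the requested tables."""

    def __init__(self):
        self.tables = {}

    async def ainit_vectorstore_table(self, table_name: str, vector_size: int) -> None:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        self.tables[table_name] = vector_size


async def main(engine) -> dict:
    # Same call shape as the notebook cell above, but inside a coroutine.
    await engine.ainit_vectorstore_table(table_name="vectorstore", vector_size=1024)
    return engine.tables


# asyncio.run() creates an event loop, runs main() to completion, and closes the loop.
result = asyncio.run(main(StubEngine()))
print(result)
```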