Redis Vector Store
This notebook covers how to get started with the Redis vector store.
Redis is a popular open-source, in-memory data structure store that can be used as a database, cache, message broker, and queue. It now includes vector similarity search capabilities, making it suitable for use as a vector store.
What is Redis?
Most developers are familiar with Redis. At its core, Redis is a NoSQL database in the key-value family that can be used as a cache, message broker, stream processor, and primary database. Developers choose Redis because it is fast, has a large ecosystem of client libraries, and has been deployed by major enterprises for years.
On top of these traditional use cases, Redis provides additional capabilities such as Search and Query, which lets users create secondary index structures within Redis. This allows Redis to serve as a vector database at the speed of a cache.
Redis as a Vector Database
Redis uses compressed, inverted indexes for fast indexing with a low memory footprint. It also supports a number of advanced features such as:
- Indexing of multiple fields in Redis hashes and JSON
- Vector similarity search (with HNSW (ANN) or FLAT (KNN))
- Vector Range Search (e.g. find all vectors within a radius of a query vector)
- Incremental indexing without performance loss
- Document ranking (using tf-idf, with optional user-provided weights)
- Field weighting
- Complex boolean queries with AND, OR, and NOT operators
- Prefix matching, fuzzy matching, and exact-phrase queries
- Support for double-metaphone phonetic matching
- Auto-complete suggestions (with fuzzy prefix suggestions)
- Stemming-based query expansion in many languages (using Snowball)
- Support for Chinese-language tokenization and querying (using Friso)
- Numeric filters and ranges
- Geospatial searches using Redis geospatial indexing
- A powerful aggregations engine
- Support for all UTF-8 encoded text
- Retrieve full documents, selected fields, or only the document IDs
- Sorting results (for example, by creation date)
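To make the difference between a k-nearest-neighbor query and a vector range query concrete, here is a minimal pure-Python sketch of what an exact, FLAT-style search computes. It is an illustration only and not how Redis implements its indexes; the function names, the toy vectors, and the choice of cosine distance are assumptions made for this example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def flat_knn(query, vectors, k):
    """Exact KNN: score every stored vector and keep the k closest."""
    scored = sorted((cosine_distance(query, v), key) for key, v in vectors.items())
    return scored[:k]


def range_search(query, vectors, radius):
    """Return every stored vector whose distance to the query is <= radius."""
    hits = []
    for key, v in vectors.items():
        distance = cosine_distance(query, v)
        if distance <= radius:
            hits.append((distance, key))
    return sorted(hits)


# Toy 2-dimensional "embeddings" (hypothetical data for illustration).
vectors = {
    "doc:a": [1.0, 0.0],
    "doc:b": [0.0, 1.0],
    "doc:c": [1.0, 1.0],
}
query = [1.0, 0.0]

nearest = flat_knn(query, vectors, k=2)
within = range_search(query, vectors, radius=0.1)
print([key for _, key in nearest])
print([key for _, key in within])
```

A KNN query always returns k results, however far away they are, while a range query returns only the vectors inside the radius, which may be none. HNSW replaces the exhaustive scan shown here with an approximate graph search.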
Clients
Since Redis is much more than just a vector database, there are often use cases that demand a Redis client in addition to the LangChain integration. You can use any standard Redis client library to run Search and Query commands, but it's easiest to use a library that wraps the Search and Query API. Below are a few examples; the Redis documentation lists more client libraries.
Project | Language | License | Author |
---|---|---|---|
jedis | Java | MIT | Redis |
redisvl | Python | MIT | Redis |
redis-py | Python | MIT | Redis |
node-redis | Node.js | MIT | Redis |
nredisstack | .NET | MIT | Redis |
Deployment options
There are many ways to deploy Redis with RediSearch. The easiest way to get started is to use Docker, but there are many other deployment options, such as:
- Redis Cloud
- Docker (Redis Stack)
- Cloud marketplaces: AWS Marketplace, Google Marketplace, or Azure Marketplace
- On-premise: Redis Enterprise Software
- Kubernetes: Redis Enterprise Software on Kubernetes
Redis connection URL schemes
Valid Redis URL schemes are:
- redis:// - Connection to Redis standalone, unencrypted
- rediss:// - Connection to Redis standalone, with TLS encryption
- redis+sentinel:// - Connection to Redis server via Redis Sentinel, unencrypted
- rediss+sentinel:// - Connection to Redis server via Redis Sentinel, both connections with TLS encryption
More information about additional connection parameters can be found in the redis-py documentation.
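As a quick sanity check before connecting, the four prefixes listed above can be mapped to their topology and encryption settings. This is a minimal sketch using only the standard library; the `SCHEME_FLAGS` table and the `describe_scheme` helper are hypothetical names for this example and are not part of redis-py or langchain-redis.

```python
from urllib.parse import urlsplit

# URL scheme -> (uses Sentinel, uses TLS), following the list above.
SCHEME_FLAGS = {
    "redis": (False, False),
    "rediss": (False, True),
    "redis+sentinel": (True, False),
    "rediss+sentinel": (True, True),
}


def describe_scheme(url):
    """Report whether a Redis URL targets Sentinel and whether it uses TLS."""
    scheme = urlsplit(url).scheme
    if scheme not in SCHEME_FLAGS:
        raise ValueError(f"Unsupported Redis URL scheme: {scheme!r}")
    sentinel, tls = SCHEME_FLAGS[scheme]
    return {"sentinel": sentinel, "tls": tls}


print(describe_scheme("redis://localhost:6379"))
print(describe_scheme("rediss+sentinel://localhost"))
```

Rejecting unknown schemes early gives a clearer error than a failed connection attempt later on.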
Setup
To use the RedisVectorStore, you'll need to install the langchain-redis partner package, as well as the other packages used throughout this notebook.
%pip install -qU langchain-redis langchain-huggingface sentence-transformers scikit-learn
Note: you may need to restart the kernel to use updated packages.
Credentials
Redis connection credentials are passed as part of the Redis Connection URL. Redis Connection URLs are versatile and can accommodate various Redis server topologies and authentication methods. These URLs follow a specific format that includes the connection protocol, authentication details, host, port, and database information. The basic structure of a Redis Connection URL is:
[protocol]://[auth]@[host]:[port]/[database]
Where:
- protocol can be redis for standard connections, rediss for SSL/TLS connections, or redis+sentinel for Sentinel connections.
- auth includes username and password (if applicable).
- host is the Redis server hostname or IP address.
- port is the Redis server port.
- database is the Redis database number.
Redis Connection URLs support various configurations, including:
- Standalone Redis servers (with or without authentication)
- Redis Sentinel setups
- SSL/TLS encrypted connections
- Different authentication methods (password-only or username-password)
Below are examples of Redis Connection URLs for different configurations:
# connection to redis standalone at localhost, db 0, no password
redis_url = "redis://localhost:6379"
# connection to host "redis" port 7379 with db 2 and password "secret" (old style authentication scheme without username / pre 6.x)
redis_url = "redis://:secret@redis:7379/2"
# connection to host redis on default port with user "joe", pass "secret" using redis version 6+ ACLs
redis_url = "redis://joe:secret@redis/0"
# connection to sentinel at localhost with default group mymaster and db 0, no password
redis_url = "redis+sentinel://localhost:26379"
# connection to sentinel at host redis with default port 26379 and user "joe" with password "secret" with default group mymaster and db 0
redis_url = "redis+sentinel://joe:secret@redis"
# connection to sentinel, no auth with sentinel monitoring group "zone-1" and database 2
redis_url = "redis+sentinel://redis:26379/zone-1/2"
# connection to redis standalone at localhost, db 0, no password but with TLS support
redis_url = "rediss://localhost:6379"
# connection to redis sentinel at localhost and default port, db 0, no password
# but with TLS support for both Sentinel and Redis server
redis_url = "rediss+sentinel://localhost"
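To see how the pieces of a standalone URL map onto connection settings, the sketch below splits a URL with the standard library. It is not the parser that redis-py or langchain-redis uses; `parse_standalone_url` is a hypothetical helper, and the fallbacks (port 6379, database 0) and the percent-decoding of credentials are assumptions of this example. It applies only to `redis://` and `rediss://` URLs, since Sentinel URLs use the path for the monitoring group as well as the database.

```python
from urllib.parse import unquote, urlsplit


def parse_standalone_url(url):
    """Split a redis:// or rediss:// URL into auth, host, port, and database."""
    parts = urlsplit(url)
    db_text = parts.path.lstrip("/")
    return {
        # An empty username (as in "redis://:secret@host") is reported as None.
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,
        "port": parts.port or 6379,  # assumed default port
        "db": int(db_text) if db_text else 0,  # assumed default database
    }


for example in (
    "redis://localhost:6379",
    "redis://:secret@redis:7379/2",
    "redis://joe:secret@redis/0",
):
    print(example, "->", parse_standalone_url(example))
```

Because credentials are decoded with `unquote`, a password containing characters such as `@` or `/` would need to be percent-encoded when the URL is built.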
Launching a Redis Instance with Docker
To use Redis with LangChain, you need a running Redis instance. You can start one using Docker with:
docker run -d -p 6379:6379 redis/redis-stack:latest
For this example, we'll use a local Redis instance. If you're using a remote instance, you'll need to modify the Redis URL accordingly.
import os
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
print(f"Connecting to Redis at: {REDIS_URL}")
Connecting to Redis at: redis://redis:6379
To enable automated tracing of your model calls, set your LangSmith API key:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
Let's check that Redis is up and running by pinging it:
import redis
redis_client = redis.from_url(REDIS_URL)
redis_client.ping()
True
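A freshly started container may need a moment before it accepts connections, so a single ping can fail even though Redis is about to become available. The sketch below retries a ping-like callable a few times. `wait_until_ready` is a hypothetical helper, and the flaky stand-in function exists only so the example runs without a Redis server.

```python
import time


def wait_until_ready(ping, attempts=5, delay=0.5):
    """Call `ping` until it returns a truthy value; return the attempt number."""
    for attempt in range(1, attempts + 1):
        try:
            if ping():
                return attempt
        except Exception:
            # Broad catch for the sketch; with redis-py you would likely
            # narrow this to redis.exceptions.ConnectionError.
            pass
        if attempt < attempts:
            time.sleep(delay)
    raise TimeoutError(f"No successful ping after {attempts} attempts")


# Stand-in for a server that only answers on the third try.
calls = {"count": 0}


def flaky_ping():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("not ready yet")
    return True


attempts_used = wait_until_ready(flaky_ping, attempts=5, delay=0.01)
print(f"Ready after {attempts_used} attempts")
```

With the client created above, the call would look like `wait_until_ready(redis_client.ping)`.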
Sample Data
The 20 newsgroups dataset comprises around 18,000 newsgroup posts on 20 topics. We'll use a subset for this demonstration and focus on two categories: 'alt.atheism' and 'sci.space':
from langchain_core.documents import Document
from sklearn.datasets import fetch_20newsgroups
categories = ["alt.atheism", "sci.space"]
newsgroups = fetch_20newsgroups(
    subset="train", categories=categories, shuffle=True, random_state=42
)
# Use only the first 250 documents
texts = newsgroups.data[:250]
metadata = [
    {"category": newsgroups.target_names[target]} for target in newsgroups.target[:250]
]
len(texts)
250
Initialization
The RedisVectorStore instance can be initialized in several ways:
- RedisVectorStore.__init__ - Initialize directly
- RedisVectorStore.from_texts - Initialize from a list of texts (optionally with metadata)
- RedisVectorStore.from_documents - Initialize from a list of langchain_core.documents.Document objects
- RedisVectorStore.from_existing_index - Initialize from an existing Redis index
Below we will use the RedisVectorStore.__init__ method with a RedisConfig instance.
%pip install -qU langchain-openai
import getpass
import os
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
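The embeddings object is passed to the vector store when it is created. If you prefer not to use an API key, a SentenceTransformer model loaded through the langchain-huggingface package runs locally and can be used in its place.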
from langchain_redis import RedisConfig, RedisVectorStore
config = RedisConfig(
    index_name="newsgroups",
    redis_url=REDIS_URL,
    metadata_schema=[
        {"name": "category", "type": "tag"},
    ],
)
vector_store = RedisVectorStore(embeddings, config=config)