Building AI Applications with Spring AI (3): Vector Databases

In this installment of our series on building AI applications with Spring AI, we explore the powerful capabilities of vector databases. These specialized databases enhance AI systems by efficiently managing and utilizing embeddings—dense vector representations crucial for processing and understanding large-scale, complex data sets. Read on to learn how vector databases integrate seamlessly with Spring AI to revolutionize data handling in AI applications.


What is an Embedding?

An embedding is a dense vector of floating-point numbers that transforms words, sentences, or entire documents into a format that machines can process. This technique is crucial in natural language processing as it helps capture semantic relationships, enabling the quantification of similarities and differences between words. Embeddings are widely used in applications such as search, recommendation systems, and anomaly detection. Advanced models, particularly those that use deep learning, enhance embeddings by learning deeper semantic relationships, thereby improving the performance and accuracy of tasks like sentiment analysis, word sense disambiguation, and contextual understanding. This makes embeddings a fundamental tool in AI applications, enabling machines to handle language with sophisticated nuance and precision.
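To make the idea of quantifying similarity concrete, here is a small, self-contained sketch using made-up 3-dimensional vectors (real models such as text-embedding-3-small produce 1536 dimensions); the vectors and their values are purely illustrative:

```java
public class CosineSimilarityDemo {

    // Cosine similarity: 1.0 means same direction, values near 0 mean unrelated.
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy "embeddings" -- invented values for illustration only
        double[] cat = {0.9, 0.1, 0.05};
        double[] kitten = {0.85, 0.15, 0.1};
        double[] car = {0.05, 0.9, 0.2};

        System.out.printf("cat vs kitten: %.3f%n", cosineSimilarity(cat, kitten));
        System.out.printf("cat vs car:    %.3f%n", cosineSimilarity(cat, car));
    }
}
```

Semantically related words end up close together in the vector space, so "cat" and "kitten" score higher than "cat" and "car"; this is exactly the property that search and recommendation systems exploit.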

How to Use Embeddings with OpenAI

In Java, the OpenAI API can be used to generate embeddings through the OpenAiEmbeddingClient, which converts textual data into vector representations. The process is straightforward: instantiate the client, send a request with your data, and receive the embeddings along with metadata about the request. This simplifies the transformation from text to vectors, letting developers focus on applying the embeddings to features such as search relevance or personalized recommendations.

Java Implementation Example

Below is a practical example demonstrating how to use the OpenAiEmbeddingClient to generate embeddings for textual data in Java:

final OpenAiEmbeddingClient openAiEmbeddingClient;

private void embeddingClientTest() {
    // Create a request for embeddings using specific texts and a predefined model
    EmbeddingResponse embeddingResponse = openAiEmbeddingClient.call(
            new EmbeddingRequest(List.of("Hello World", "World is big and salvation is near"),
                    OpenAiEmbeddingOptions.builder()
                            .withModel(OpenAiApi.EmbeddingModel.TEXT_EMBEDDING_3_SMALL.value)
                            .build()));

    // Print each embedding value
    embeddingResponse.getResult().getOutput().forEach(embedding -> {
        System.out.println(embedding);
    });

    // Print metadata for additional insights
    embeddingResponse.getMetadata().forEach((key, value) -> {
        System.out.println(key + " : " + value);
    });
}

In this snippet, the OpenAiEmbeddingClient is configured to send a request to the OpenAI API, specifying the TEXT_EMBEDDING_3_SMALL model for processing. The request includes a list of text strings, and the response comprises both the generated embeddings and associated metadata. The embeddings are arrays of floating-point numbers, each representing the corresponding input text in a multi-dimensional vector space. The metadata provides additional context such as the model used, the number of tokens in the input, and other pertinent details.

Output Explanation

When executed, the output will display the numerical vectors for the input phrases, effectively demonstrating how each phrase is transformed into a multi-dimensional vector space. Here’s a sample of what the output might look like:

-0.0069795838
0.02052451
0.015975665
.
.
.
-0.0023037316
-0.0057915696
-0.00749934

completion-tokens : null
total-tokens : 9
model : text-embedding-3-small
prompt-tokens : 9

The vector output from embeddingResponse.getResult().getOutput() can be used for tasks such as similarity comparisons, where vectors of different text inputs are compared to find how closely related they are. The metadata from embeddingResponse.getMetadata() offers insights such as the model used (model), and the total number of tokens processed (total-tokens), which are useful for understanding the scope and scale of the embedding process. This method facilitates a wide range of applications, from semantic search to personalized recommendations, by quantifying and leveraging the similarity of text-based data.

What are Vector Databases?

Vector databases are systems designed specifically to store and manage high-dimensional vector embeddings, which are essential in fields such as natural language processing and machine learning. They are optimized for fast similarity searches, making them ideal for applications that need to retrieve similar items quickly from large datasets, and they scale to vast amounts of data without compromising performance.

In our example, we utilize PostgreSQL equipped with the vector extension to facilitate vector operations. This configuration enables efficient storage, updating, and querying of vector data, integrating seamlessly into a relational database environment. The addition of the vector extension allows PostgreSQL to perform sophisticated vector operations, thereby providing a robust solution for managing embeddings across various AI applications.
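To illustrate what this looks like at the SQL level, the snippet below builds a top-K nearest-neighbour query using pgvector's cosine-distance operator <=>. The helper method and table layout are hypothetical, for illustration only; in practice Spring AI's PgVectorStore generates comparable SQL for you:

```java
public class PgVectorSqlDemo {

    // Builds a top-K nearest-neighbour query; pgvector's <=> operator
    // computes the cosine distance between two vectors.
    static String nearestNeighborSql(String table, int topK) {
        return "SELECT content FROM " + table
                + " ORDER BY embedding <=> ?::vector LIMIT " + topK;
    }

    public static void main(String[] args) {
        // The ? placeholder would be bound to the query embedding via JDBC
        System.out.println(nearestNeighborSql("vector_store", 3));
    }
}
```

Ordering by the distance between the stored embedding column and the query vector is the core operation a vector database accelerates, typically with a specialized index rather than a full scan.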

Preparing PostgreSQL with Docker

Setting up a PostgreSQL database with vector capabilities involves using Docker to run a PostgreSQL server with the necessary extensions. Here’s how you can use a compose.yaml file to configure and launch PostgreSQL with the vector extension and pgAdmin, a widely-used web-based administration tool for managing PostgreSQL databases.

Understanding the compose.yaml Configuration

Here’s the breakdown of the Docker Compose configuration needed to set up the services:

services:
  pgvector:
    image: 'ankane/pgvector'
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=vector_store
      - PGPASSWORD=postgres
    ports:
      - '5432:5432'
    healthcheck:
      test: "pg_isready -U postgres -d vector_store"
      interval: 2s
      timeout: 20s
      retries: 10
    logging:
      options:
        max-size: 10m
        max-file: "3"
  pgadmin:
    container_name: pgadmin_container
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: ${PGADMIN_DEFAULT_EMAIL:-pgadmin4@pgadmin.org}
      PGADMIN_DEFAULT_PASSWORD: ${PGADMIN_DEFAULT_PASSWORD:-admin}
    volumes:
      - ./servers.json:/pgadmin4/servers.json
    ports:
      - "${PGADMIN_PORT:-5050}:80"

The pgvector service uses the ankane/pgvector Docker image which integrates PostgreSQL with the pgvector extension. It sets environment variables for database credentials and configures port mapping, health checks, and logging. The pgadmin service runs pgAdmin, providing a user-friendly interface for database administration through specified email and password credentials, with port and volume configurations.

This setup ensures that you have a powerful, ready-to-use PostgreSQL database with vector capabilities and a convenient administrative interface, all running in a controlled Docker environment.

Starting the Services

To initiate the Docker Compose setup and start your services, use the following command in your terminal:

docker compose up

Executing this command will launch both the PostgreSQL server equipped with vector support and the pgAdmin interface. This operation initializes the services defined in your Docker Compose configuration, setting up the necessary environment for managing vector data.

Accessing pgAdmin and Connecting to PostgreSQL

Once the services are running, you can access and manage your PostgreSQL database through pgAdmin by following these steps:

  1. Access pgAdmin:

    • Open a web browser and navigate to http://localhost:5050/.
    • Use the credentials specified in the compose.yaml file to log in:
      • Username: pgadmin4@pgadmin.org (default)
      • Password: admin (default)
  2. Add a New Server in pgAdmin:

    • Step 1: Inside pgAdmin, right-click on ‘Servers’ in the left-hand menu and choose ‘Create’ -> ‘Server’.
    • Step 2: In the ‘Create - Server’ dialog box, navigate to the ‘General’ tab and enter a name for your server connection. This name will help you identify this connection in pgAdmin.
    • Step 3: Move to the ‘Connection’ tab:
      • For ‘Host name/address’, enter pgvector. This is the service name as defined in your Docker Compose, which resolves to your container running PostgreSQL.
      • Fill in the ‘Username’ and ‘Password’ fields with the values set in the Docker environment variables (postgres).
      • Specify the ‘Database’ name as set in your Docker Compose configuration (vector_store).
    • Step 4: Click ‘Save’ to finalize the server connection setup.

By completing these steps, you will establish a connection to your PostgreSQL database with vector capabilities, all running within Docker. This setup is perfect for developing and experimenting with applications that leverage vector data in a stable, reproducible environment. You can now manage your database through the web interface of pgAdmin, which provides tools for creating, modifying, and querying data.

Project Adjustments for Vector Database Integration

To facilitate the management of vector databases, we’ve added the org.springframework.ai:spring-ai-pgvector-store-spring-boot-starter package. Additionally, we’ve introduced the VectorDatabase class to handle operations specific to vector databases.
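For a Gradle build, the relevant dependency declarations look something like this (a sketch; adjust the configuration names and versions to your own build file):

```groovy
dependencies {
    // Auto-configures a PgVectorStore backed by the configured datasource
    implementation 'org.springframework.ai:spring-ai-pgvector-store-spring-boot-starter'

    // Starts the services in compose.yaml automatically during development
    developmentOnly 'org.springframework.boot:spring-boot-docker-compose'
}
```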

Configuration in application.yml

We have updated our application.yml with settings for the database connection and additional configurations for vector database operations as follows:

spring:
  application:
    name: sam-ai
  datasource:
    url: jdbc:postgresql://localhost:5432/vector_store
    username: postgres
    password: postgres
  ai:
    vectorstore:
      pgvector:
        index-type: HNSW
        distance-type: COSINE_DISTANCE
        dimensions: 1536

Adding developmentOnly 'org.springframework.boot:spring-boot-docker-compose' to the Gradle build simplifies development: Spring Boot then starts the PostgreSQL container from compose.yaml automatically whenever the application launches.

Explanation of Vector Database Parameters:

  1. Index-Type: HNSW

    • Parameter: index-type
    • Value: HNSW (Hierarchical Navigable Small World)
    • Description:
      • This index type is used for approximate nearest neighbor searches, ideal for large-scale datasets. HNSW provides a multi-level graph structure for searches, delivering superior query performance in terms of speed and recall compared to many other indexing methods.
  2. Distance-Type: COSINE_DISTANCE

    • Parameter: distance-type
    • Value: COSINE_DISTANCE
    • Description:
      • This distance function calculates the similarity between two vectors based on the cosine of the angle between them rather than their Euclidean distance. It is particularly effective when vectors are normalized to unit length, making it commonly used for measuring textual similarity.
  3. Dimension: 1536

    • Parameter: dimension
    • Value: 1536
    • Description:
      • This parameter defines the number of features in each vector. A higher dimension captures more information but may increase the complexity and storage requirements. When creating vector tables, it’s essential to specify this dimension to ensure all stored vectors conform to this specification.
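The practical difference between cosine and Euclidean distance can be seen in a small sketch (vector values invented for illustration): scaling a vector changes its Euclidean distance to other vectors but leaves its cosine distance untouched, which is why cosine distance suits text embeddings, where direction, not magnitude, carries the meaning.

```java
public class DistanceComparisonDemo {

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    // Cosine distance: 1 minus the cosine of the angle between the vectors
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    // Straight-line (Euclidean) distance
    static double euclideanDistance(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    public static void main(String[] args) {
        double[] v = {1, 2, 3};
        double[] scaled = {2, 4, 6}; // same direction, twice the magnitude

        System.out.printf("cosine distance:    %.4f%n", cosineDistance(v, scaled));    // ~0.0
        System.out.printf("euclidean distance: %.4f%n", euclideanDistance(v, scaled)); // ~3.7417
    }
}
```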

Example Code:

The following Java code snippet demonstrates how to use the PgVectorStore to add documents and perform a similarity search:

List<Document> documents = List.of(
        new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!",
                Map.of("meta1", "meta1")),
        new Document("The World is Big and Salvation Lurks Around the Corner"),
        new Document("You walk forward facing the past and you turn back toward the future.",
                Map.of("meta2", "meta2")));

VectorStore vectorStore = new PgVectorStore(jdbcTemplate, openAiEmbeddingClient);

// Add documents to PGVector
vectorStore.add(documents);

// Perform a similarity search
List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(1));

results.forEach(document -> {
    System.out.println(document);
});

Output:

Document{id='7a523baf-412d-4918-8404-4c8da38296d7', metadata={meta1=meta1, distance=0.18539967}, content='Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!'}

The result shows the document most similar to the query “Spring”, demonstrating how effectively the vector database handles and retrieves relevant data based on similarity.
