[SEO Subhead]
This Guidance shows how to create a product catalog with a similarity search capability by integrating AWS and artificial intelligence (AI) services with the pgvector extension. pgvector is an open-source extension for PostgreSQL that lets you store vector embeddings and search for the points most similar to a query vector, its "nearest neighbors." These nearest neighbor search capabilities let you use semantic meaning to power a variety of intelligent applications and data analysis within your PostgreSQL database. By integrating pgvector with AWS services, as shown here, you can conduct both image and text-to-image similarity searches to provide a more personalized, relevant, and efficient shopping experience for your consumers.
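To make the nearest neighbor idea concrete, the following minimal sketch (with toy three-dimensional vectors and placeholder connection details, both assumptions for illustration) stores a few embeddings in a pgvector column and retrieves the rows closest to a query vector:

# Toy illustration of pgvector nearest-neighbor search.
# Connection details, the table, and the 3-dimensional vectors are placeholders.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="postgres",
                        user="postgres", password="example")

with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("CREATE TABLE IF NOT EXISTS items (id serial PRIMARY KEY, embedding vector(3))")
    cur.execute("INSERT INTO items (embedding) VALUES ('[1,1,1]'), ('[2,2,2]'), ('[1,1,2]')")
    # "<->" is pgvector's Euclidean (L2) distance operator; the closest rows come first.
    cur.execute("SELECT id, embedding <-> '[1,1,1]' AS distance FROM items ORDER BY distance LIMIT 2")
    print(cur.fetchall())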
Please note: [Disclaimer]
Architecture Diagram
[Architecture diagram description]
Step 1
Deploy resources in your AWS account using AWS CloudFormation. This includes deploying instances of Amazon Relational Database Service (Amazon RDS) for PostgreSQL, Amazon SageMaker, AWS Cloud9, AWS Lambda, and a Custom Resource.
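If you prefer to launch the stack programmatically rather than through the console, a boto3 sketch along the following lines would work; the stack name and template URL are placeholders, not the actual template location for this Guidance.

# Sketch: launch the Guidance's CloudFormation stack with boto3.
# StackName and TemplateURL are placeholders.
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

cfn.create_stack(
    StackName="product-similarity-search",
    TemplateURL="https://example-bucket.s3.amazonaws.com/guidance-template.yaml",
    Capabilities=["CAPABILITY_NAMED_IAM"],  # required if the template creates IAM roles
)

# Wait until the stack (RDS, SageMaker, Cloud9, Lambda, Custom Resource) finishes creating.
cfn.get_waiter("stack_create_complete").wait(StackName="product-similarity-search")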
Step 2
RDS for PostgreSQL stores both the product catalog and the embeddings for its products, using the open-source pgvector extension to store and index these high-dimensional vectors.
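As a sketch of what that storage layer might look like, the following creates a product table with a vector column and an HNSW index; the column layout and the 384-dimension size are assumptions and must match the embedding model you deploy in Step 3.

# Sketch of the product catalog schema with a pgvector column and an HNSW index.
# Column names and the vector(384) size are assumptions.
import psycopg2

ddl = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS products (
    id           bigserial PRIMARY KEY,
    product_name text NOT NULL,
    description  text,
    image_url    text,
    embedding    vector(384)  -- one embedding per product
);

-- HNSW index using cosine distance; use vector_l2_ops for Euclidean distance instead.
CREATE INDEX IF NOT EXISTS products_embedding_hnsw
    ON products USING hnsw (embedding vector_cosine_ops);
"""

conn = psycopg2.connect(host="my-rds-endpoint.example.com", dbname="postgres",
                        user="postgres", password="example")  # placeholder connection details
with conn, conn.cursor() as cur:
    cur.execute(ddl)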
Step 3
SageMaker runs a pre-trained Hugging Face SentenceTransformer model for real-time inference. You also have the flexibility to select other models that best fit your needs. Amazon Bedrock offers an alternative way to run diverse foundation models and conduct inference to produce text embeddings.
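A sketch of calling such an endpoint to produce an embedding might look like the following; the endpoint name and the request and response formats are assumptions that depend on how the model container is deployed.

# Sketch: get a text embedding from a SageMaker real-time inference endpoint.
# The endpoint name and the payload/response shapes are assumptions.
import json
import boto3

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

def embed_text(text, endpoint_name="product-embedding-endpoint"):
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": text}),
    )
    # Assumes the container returns a JSON list of floats (some containers return
    # token-level vectors that you would pool into a single sentence embedding).
    return json.loads(response["Body"].read())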
Step 4
The AWS Cloud9 integrated development environment (IDE) hosts a sample Streamlit application that provides product information retrieved from RDS for PostgreSQL. You have the option to replace this sample application with one of your choice.
For alternative compute choices to run the application, consider deploying it on Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), AWS Fargate, or Amazon Elastic Compute Cloud (Amazon EC2).
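A minimal Streamlit sketch of such a page is shown below; embed_text and similarity_search stand in for the helpers sketched in Steps 3 and 8 and are assumptions, not part of a published API.

# Minimal Streamlit sketch for the product catalog search page.
import streamlit as st

from catalog_helpers import embed_text, similarity_search  # hypothetical module

st.title("Product Catalog Similarity Search")

query = st.text_input("Describe the product you are looking for")

if query:
    query_embedding = embed_text(query)           # SageMaker endpoint call (Step 7)
    results = similarity_search(query_embedding)  # pgvector HNSW query (Step 8)
    for name, description, distance in results:
        st.subheader(name)
        st.write(description)
        st.caption(f"Distance: {distance:.4f}")

In AWS Cloud9, you would typically start the app with streamlit run app.py --server.port 8080 and use the IDE's preview feature to view it.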
Step 5
Custom Resources in CloudFormation run user-defined provisioning logic during stack creation, update (if the Custom Resource is modified), or deletion. When a Custom Resource is backed by a Lambda function, CloudFormation invokes that function at each of these lifecycle events.
The Custom Resource from the CloudFormation stack invokes Lambda to bootstrap RDS for PostgreSQL. During this process, the system creates the pgvector extension, initializes the Product Catalog schema, and generates embeddings using SageMaker near real-time inference.
The system stores these embeddings along with product catalog metadata in RDS for PostgreSQL and indexes the embeddings using pgvector's Hierarchical Navigable Small World (HNSW) index type.
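A sketch of the Lambda handler behind that Custom Resource is shown below; bootstrap_database is a placeholder for the extension, schema, embedding, and indexing work described above.

# Sketch: Lambda handler for the CloudFormation Custom Resource that bootstraps the database.
import cfnresponse  # helper module available to inline (ZipFile) Lambda code created by CloudFormation

def bootstrap_database():
    # Placeholder: create the pgvector extension, load the product catalog schema,
    # generate embeddings through the SageMaker endpoint, and build the HNSW index.
    pass

def handler(event, context):
    try:
        if event["RequestType"] in ("Create", "Update"):
            bootstrap_database()
        # Nothing to clean up on Delete in this sketch; the stack removes the database itself.
        cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
    except Exception as exc:
        cfnresponse.send(event, context, cfnresponse.FAILED, {"Error": str(exc)})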
Step 6
Run the e-commerce product catalog application on AWS Cloud9 and preview the application while it's running.
Step 7
Perform a search on the product catalog in the application; the application generates an embedding for the search query using the SageMaker near real-time inference endpoint.
Step 8
The application connects to RDS for PostgreSQL, runs the similarity search query on embeddings using the pgvector HNSW index, and then displays the product catalog similarity search results on the application screen.
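A sketch of that query path is shown below. It reuses the embed_text helper from Step 3 for the query embedding, and the table and column names follow the assumptions in Step 2; the distance operator must match the operator class used when the HNSW index was created.

# Sketch: pgvector similarity search for the application (Step 8).
# Table and column names follow the Step 2 sketch; connection details are placeholders.
import psycopg2

def similarity_search(query_embedding, limit=5):
    conn = psycopg2.connect(host="my-rds-endpoint.example.com", dbname="postgres",
                            user="postgres", password="example")
    # pgvector accepts a bracketed literal such as '[0.12,-0.04,...]'.
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with conn, conn.cursor() as cur:
        # "<=>" is cosine distance, matching the vector_cosine_ops HNSW index in Step 2;
        # use "<->" with a vector_l2_ops index for Euclidean distance instead.
        cur.execute(
            """
            SELECT product_name, description, embedding <=> %s::vector AS distance
            FROM products
            ORDER BY distance
            LIMIT %s
            """,
            (vector_literal, limit),
        )
        return cur.fetchall()

The Streamlit page in Step 4 can then render the returned product names, descriptions, and distances directly.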
Get Started
Deploy this Guidance
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
SageMaker simplifies machine learning model lifecycle management, allowing you to quickly adapt to changing data and user demands. RDS for PostgreSQL with the pgvector extension offers robust data storage and efficient nearest neighbor search capabilities, so you can deliver accurate and timely search results to your consumers. Together, these services streamline the deployment, monitoring, and maintenance of your search experience.
Security
RDS for PostgreSQL safeguards your data with industry-standard encryption protocols, while SageMaker offers built-in security controls to manage model training and deployment processes securely.
We recommend you use AWS Identity and Access Management (IAM) to control access to your AWS resources, and use AWS Secrets Manager to protect sensitive credentials.
Reliability
RDS for PostgreSQL provides high availability and durability, with automatic backups, database snapshots, and Multi-AZ deployments across Availability Zones (AZs) for enhanced fault tolerance. SageMaker also allows you to configure multiple instances across AZs, so your machine learning operations remain highly available and recover quickly from failures.
Performance Efficiency
SageMaker supports near real-time inference and low-latency responses to user queries. RDS for PostgreSQL with the pgvector extension enables efficient management and querying of vector embeddings, significantly speeding up the similarity searches needed to match user queries with your product catalog.
We recommend you continuously monitor and optimize your system's performance by using AWS services like Amazon CloudWatch and AWS Auto Scaling so that the components in this Guidance remain responsive and cost-effective.
Cost Optimization
SageMaker helps reduce costs by providing a managed service with pay-as-you-go pricing and instance types optimized for specific workloads. Additionally, RDS for PostgreSQL offers cost efficiency through Reserved Instances and scaling options that adjust resources based on your database workload, minimizing unnecessary expenses. You can also implement cost monitoring and optimization using tools such as AWS Budgets and AWS Cost Explorer to continuously identify and address potential cost inefficiencies.
Sustainability
SageMaker and RDS for PostgreSQL are managed AWS services that optimize resource usage, reducing environmental impact by minimizing the computational resources your workloads require. By deploying this Guidance in the AWS Cloud, you also avoid the need to procure physical hardware, further enhancing the overall sustainability of your system. Additionally, you can use AWS services like AWS CloudTrail and AWS Config to monitor and enforce sustainable practices, such as resource utilization and energy efficiency.
Related Content
Building AI-powered search in PostgreSQL using Amazon SageMaker and pgvector
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.