Add Retrieval Augmented Generation (RAG) to your .NET applications with Amazon Bedrock Knowledge Bases

Introduction

A large language model (LLM) draws its knowledge from the data used during its training, which often comes mostly from public sources. As a result, the model’s knowledge may not reflect the most current information. This leads to three main issues: outdated information, no access to internal company data, and potential inaccuracies. An approach called Retrieval Augmented Generation (RAG) addresses these issues. RAG-based applications can retrieve and use private data sources outside the model’s original training data, which allows them to provide more accurate, up-to-date, and domain-specific responses tailored to the user’s needs.

This is Part 2 of a series that explores building .NET generative artificial intelligence (AI) applications using the foundation models (FMs) supported by Amazon Bedrock. We recommend that you first read Part 1, Integrating Amazon Bedrock in your .NET applications.

In this post, we’ll show you how to create a RAG virtual assistant application for .NET using Amazon Bedrock Knowledge Bases. The solution demonstrates how to design and implement a RAG pattern that combines the vector database capabilities of Amazon OpenSearch Serverless with Amazon Bedrock to build an AI assistant with an advanced search interface.

Before we jump into the solution, it’s important to understand vector embeddings and vector databases. Vector embeddings transform data into numerical representations that capture its semantic meaning and relationships. These embeddings enable semantic search: finding information based on its underlying meaning rather than on exact keyword matches. Vector databases are specialized storage systems for these vectors. Rather than storing information as rows or keyword indexes, they represent it as mathematical vectors in a multi-dimensional space, where items with similar meaning sit close together. Vector databases are used to index and search unstructured and semi-structured data, such as images or text.
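To make the idea of semantic search concrete, here is a minimal sketch of ranking documents by cosine similarity between embedding vectors. The three-dimensional vectors and document titles are made up for illustration; a real embeddings model produces vectors with hundreds or thousands of dimensions, and a vector database performs this comparison at scale for you.

```csharp
using System;
using System.Linq;

// Cosine similarity: dot(a, b) / (|a| * |b|). Values closer to 1 mean "more similar".
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical embeddings, chosen by hand for this example.
var query = new float[] { 0.9f, 0.1f, 0.0f };
var documents = new (string Text, float[] Vector)[]
{
    ("AWS revenue growth", new float[] { 0.8f, 0.2f, 0.1f }),
    ("Employee safety programs", new float[] { 0.1f, 0.9f, 0.2f }),
};

// Rank documents by similarity to the query, most similar first.
var ranked = documents
    .OrderByDescending(d => CosineSimilarity(query, d.Vector))
    .ToArray();

Console.WriteLine(ranked[0].Text); // the document whose vector points in nearly the same direction as the query
```

In this example the first document ranks highest because its vector is nearly parallel to the query vector, even though no keywords are compared.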

For this solution, we use Amazon Bedrock Knowledge Bases together with two FMs available through Amazon Bedrock: Amazon Titan Text Embeddings v2 as the embeddings model and Anthropic Claude 3 Haiku as the text generation model. Before you read further, we highly recommend reviewing this primer on Retrieval Augmented Generation, Embeddings, Vector Databases, and Knowledge bases. We use the Amazon letters to shareholders dataset to develop this solution.

Architecture

The solution presented in this post uses a virtual assistant application built using the following solution architecture:

Figure 1: Generate embeddings knowledge base pipeline

The ingestion workflow shown in Figure 1 includes the following steps:

A user uploads the Amazon letters to shareholders dataset to an Amazon Simple Storage Service (Amazon S3) bucket, which has been set up as the data source in Knowledge Bases for Amazon Bedrock.
The knowledge base splits the documents in the data source into manageable chunks for efficient retrieval.
The knowledge base is set up to use Amazon OpenSearch Serverless as its vector store and an Amazon Titan text embeddings model to create the embeddings. In this step, Knowledge Bases for Amazon Bedrock converts the chunks to embeddings and writes them to a vector index in the OpenSearch vector store, while maintaining a mapping to the original document.
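The knowledge base performs chunking for you, so you don't need to write this code. Purely as an illustration of what "splitting into chunks" means, the following sketch shows a simple fixed-size strategy with overlap. The function name, the character-based sizes, and the overlap value are assumptions for this example and do not reflect the service's internal implementation.

```csharp
using System;
using System.Collections.Generic;

// Splits text into chunks of at most chunkSize characters.
// Consecutive chunks share `overlap` characters so context is not lost at the boundaries.
static List<string> ChunkText(string text, int chunkSize, int overlap)
{
    if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
    if (overlap < 0 || overlap >= chunkSize) throw new ArgumentOutOfRangeException(nameof(overlap));

    var result = new List<string>();
    int step = chunkSize - overlap;
    for (int start = 0; start < text.Length; start += step)
    {
        int length = Math.Min(chunkSize, text.Length - start);
        result.Add(text.Substring(start, length));
        if (start + length >= text.Length) break; // this chunk reached the end of the text
    }
    return result;
}

var chunks = ChunkText("abcdefghij", chunkSize: 4, overlap: 1);
Console.WriteLine(string.Join(" | ", chunks)); // prints "abcd | defg | ghij"
```

Each chunk is then converted to an embedding and stored alongside a reference to its source document.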

Figure 2: Retrieve and generate response workflow using a knowledge base

The following describes the sequence of steps from Figure 2 for generating responses using the RAG pattern:

The application uses the AWS SDK for .NET to connect to and interact with the Amazon Bedrock Agent Runtime API.
The application takes the user’s query and invokes the RetrieveAndGenerate API.
The RetrieveAndGenerate API converts the query to an embedding and uses it to retrieve the most relevant documents from the knowledge base. It then combines the original user query with the retrieved information to create an “augmented query” that provides additional context and relevant details, and sends the augmented query to a large language model (LLM) to generate the final response.
The AWS SDK for .NET receives the generated response from the LLM, and the application delivers it back to the user, providing an informed and contextual answer to the original query.
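The RetrieveAndGenerate API builds the augmented query internally, so the sample application never constructs it. To illustrate the idea only, the following sketch combines retrieved passages with the user's question. The prompt wording and the placeholder passages are hypothetical and are not the template the service uses.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Builds an "augmented" prompt: numbered retrieved passages first, then the user's question.
static string BuildAugmentedPrompt(string question, IReadOnlyList<string> passages)
{
    var sb = new StringBuilder();
    sb.AppendLine("Answer the question using only the search results below.");
    for (int i = 0; i < passages.Count; i++)
    {
        sb.AppendLine($"[{i + 1}] {passages[i]}");
    }
    sb.Append($"Question: {question}");
    return sb.ToString();
}

// Placeholder passages standing in for chunks retrieved from the vector store.
var retrieved = new List<string>
{
    "First passage retrieved from the knowledge base (placeholder).",
    "Second passage retrieved from the knowledge base (placeholder).",
};

var prompt = BuildAugmentedPrompt("What is Amazon doing in the field of generative AI?", retrieved);
Console.WriteLine(prompt);
```

Grounding the model in retrieved passages this way is what lets it answer from private or recent data that was not part of its training.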

Now, let’s look at the step-by-step instructions to build the AWS infrastructure needed for the sample application.

Prerequisites

To set up this solution, install the following prerequisites:

.NET 8.0 Software Development Kit (SDK)
The latest AWS Command Line Interface (AWS CLI)
Visual Studio Code (or your preferred IDE for .NET development)
AWS credentials for your AWS account configured locally
Git
Access to Amazon Titan Text Embeddings v2 and Anthropic Claude 3 Haiku models in your AWS account. Instructions to provide access can be found in the Amazon Bedrock User Guide.

Clone the GitHub repository

The solution presented in this post is available in the aws-samples/dotnet-genai-samples GitHub repository. Clone the GitHub repository to your local machine. Open a terminal window and create a folder for the solution code. Navigate to the new folder and run the following command (this is a single git clone command):

git clone https://github.com/aws-samples/dotnet-genai-samples.git

The repository also contains example shareholder letters and their associated metadata JSON files.

Solution walkthrough

With the prerequisites satisfied, the next steps guide you through configuring the knowledge base.

Step 1: Configure Knowledge base

A knowledge base requires:

A data source that stores information used for RAG
A vector database to store the embeddings created from our data sources

Step 1A: Create a data source

For the knowledge base in our example, we use an Amazon S3 bucket as the data source. This bucket contains a collection of shareholder letters published by Amazon over the years. By storing these letters in an S3 bucket, we can easily access and analyze the information they contain to enhance the knowledge base. The shareholder letters are part of the solution repository on GitHub.

You can optionally attach metadata to the files in your data source. Metadata filtering in Knowledge Bases for Amazon Bedrock lets you define custom metadata attributes that can be used to filter the documents retrieved for a query. This improves retrieval accuracy and helps maintain data privacy and security. You specify custom metadata for each document by uploading a JSON file to the same Amazon S3 bucket, following some specific requirements. For example, if you have a file called datasource.csv, its metadata file must be named datasource.csv.metadata.json and contain the desired metadata fields.

Figure 3: Data source and the metadata for the documents in the data source

The following is the content of the AMZN-2019-Shareholder-Letter.pdf.metadata.json file. It gives you an idea of the metadata we used for the files in our data source. You can use this metadata as filters in your .NET code when querying your data sources to retrieve information that augments your prompts.

{
  "metadataAttributes": {
    "company": "Amazon",
    "year": 2019
  }
}
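If you have many documents, you may prefer to generate these sidecar files rather than write them by hand. The following is a minimal sketch using System.Text.Json. The helper names are hypothetical, and the attributes (company, year) simply mirror the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// The metadata file name is the document's file name plus ".metadata.json".
static string MetadataFileNameFor(string documentFileName) =>
    documentFileName + ".metadata.json";

// Produces the JSON body with the attributes nested under "metadataAttributes".
static string BuildMetadataJson(string company, int year)
{
    var payload = new Dictionary<string, object>
    {
        ["metadataAttributes"] = new Dictionary<string, object>
        {
            ["company"] = company,
            ["year"] = year // kept numeric so it can be compared numerically in filters
        }
    };
    return JsonSerializer.Serialize(payload, new JsonSerializerOptions { WriteIndented = true });
}

var metadataFileName = MetadataFileNameFor("AMZN-2019-Shareholder-Letter.pdf");
var metadataJson = BuildMetadataJson("Amazon", 2019);

Console.WriteLine(metadataFileName); // prints "AMZN-2019-Shareholder-Letter.pdf.metadata.json"
Console.WriteLine(metadataJson);
```

You would then upload each generated file to the same S3 bucket as the document it describes.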

Step 1B: Create an Amazon Bedrock Knowledge Base

Now, create an Amazon Bedrock Knowledge Base. From the Amazon Bedrock console, go to Orchestration → Knowledge bases from the left navigation menu. Select Create knowledge base. You will be presented with a page that looks similar to Figure 4:

Figure 4: Provide knowledge base details

Provide a name and optional description for your knowledge base. Knowledge bases need permissions to access the S3 bucket and other AWS services. Select the default options in the IAM Permissions section. This will create and use a new AWS Identity and Access Management (IAM) service role for the service to use. Next, we need to configure our data source.

Figure 5: Configure data source

Provide a data source name and point your data source to the Amazon S3 bucket we created earlier with our shareholder letters. You can create up to 5 data sources. Next, we need to configure the embeddings model to generate vector embeddings of our source data and the vector database to store them.

Figure 6: Select embeddings model and configure vector store

Select Amazon Titan Text Embeddings v2 as the embeddings model.

When building a knowledge base, you can choose from various vector database options such as Amazon OpenSearch Serverless, Amazon Aurora, MongoDB Atlas, Pinecone, or Redis Enterprise Cloud. For our example, we’ll proceed with the default option: Quick create a new vector store – Recommended. This will create the vector store using Amazon OpenSearch Serverless.

Amazon OpenSearch Serverless is a serverless option for Amazon OpenSearch Service. It provides a streamlined experience that removes the need for you to provision, configure, and tune clusters, and it scales compute capacity automatically based on your application’s needs.

Figure 7: Vector database configuration

Review the configuration and select Create knowledge base. Creating the knowledge base may take a few minutes. Amazon Bedrock will start ingesting the data by generating embeddings and storing them in the vector database. Once the status shows Ready, the knowledge base is created successfully!

Now that our vector database is ready and populated with embeddings from our data source, let’s turn our attention to the .NET sample application. It’s an ASP.NET Razor pages application that contains a page with a knowledge base playground, which demonstrates the virtual assistants we can build using Knowledge Bases for Amazon Bedrock.

Step 2: .NET application code

Browse to the location where you downloaded the sample code. Open the Amazon.GenAI.sln solution file using Visual Studio or your preferred integrated development environment (IDE). You can find the solution file in the main folder of the sample application. Select the project Amazon.GenAI. Inside the project, open the KnowledgeBasePlayground.razor file.

The following code snippet from KnowledgeBasePlayground.razor uses the AWS SDK for .NET to call the Amazon Bedrock Agent Runtime API, which retrieves relevant content and generates a response to a given question. It creates a RetrieveAndGenerateRequest object with the necessary parameters, including the user’s question, the knowledge base configuration, and the retrieval configuration. The request is then sent using an instance of the AmazonBedrockAgentRuntimeClient, and the response from the API is awaited and stored in the result variable.

// Requires: using Amazon.BedrockAgentRuntime.Model;
var request = new RetrieveAndGenerateRequest
{
    // The user's question
    Input = new RetrieveAndGenerateInput { Text = question },
    RetrieveAndGenerateConfiguration = new RetrieveAndGenerateConfiguration
    {
        Type = RetrieveAndGenerateType.KNOWLEDGE_BASE,
        KnowledgeBaseConfiguration = new KnowledgeBaseRetrieveAndGenerateConfiguration
        {
            // The knowledge base to search and the model that generates the answer
            KnowledgeBaseId = _knowledgeBase?.KnowledgeBaseId,
            ModelArn = _textModelId,
            RetrievalConfiguration = new KnowledgeBaseRetrievalConfiguration
            {
                VectorSearchConfiguration = new KnowledgeBaseVectorSearchConfiguration
                {
                    // Combine semantic (vector) search with keyword search
                    OverrideSearchType = "HYBRID",
                }
            }
        }
    }
};
// Retrieve the relevant chunks and generate the response in a single call
var result = await _agentClient!.RetrieveAndGenerateAsync(request);
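The metadata attached to the documents in Step 1A can be used to narrow retrieval. The following is a sketch only, not code from the sample repository: it assumes that the RetrievalFilter and FilterAttribute types and the Filter and NumberOfResults properties in your version of the AWSSDK.BedrockAgentRuntime package match the names shown, so verify them against the SDK reference before relying on this. It restricts retrieval to documents whose year attribute is 2019 or later.

```csharp
using System;
using Amazon.BedrockAgentRuntime.Model;
using Amazon.Runtime.Documents;

// Assumption: property and type names below match your installed AWSSDK.BedrockAgentRuntime version.
var filteredSearch = new KnowledgeBaseVectorSearchConfiguration
{
    // Return at most five chunks (an arbitrary value chosen for this example)
    NumberOfResults = 5,
    Filter = new RetrievalFilter
    {
        // Keep only chunks from documents whose "year" metadata attribute is >= 2019
        GreaterThanOrEquals = new FilterAttribute
        {
            Key = "year",
            Value = new Document(2019)
        }
    }
};

Console.WriteLine($"Filtering on metadata key: {filteredSearch.Filter.GreaterThanOrEquals.Key}");
```

You would assign this object to the VectorSearchConfiguration property of the KnowledgeBaseRetrievalConfiguration in the request. Because the year attribute was stored as a number in the metadata file, the filter value is numeric as well.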

The following table lists some sample questions and the corresponding knowledge base responses. Try some of these questions as prompts.

Prompt	Response
What is Amazon doing in the field of generative AI?	Amazon is investing heavily in Large Language Models (LLMs) and Generative AI, which they believe will transform and improve virtually every customer experience. Amazon has been working on their own LLMs for a while and is democratizing this technology so companies of all sizes can leverage Generative AI. AWS is offering the most price-performant machine learning chips in Trainium and Inferentia so small and large companies can afford to train and run their LLMs in production. AWS is also delivering applications like CodeWhisperer, which revolutionizes developer productivity by generating code suggestions in real time.
What is AWS year-over-year revenue in 2022?	According to the search results, AWS had a 29% year-over-year revenue growth in 2022 on a $62 billion revenue base.
Which was the first inference chip launched by AWS?	According to the search results, AWS launched its first inference chips, called Inferentia, in 2019. These chips have already saved companies like Amazon over a hundred million dollars in capital expense.
According to the context, in what year did Amazon’s annual revenue increase from $245B to $434B?	According to the search results, Amazon’s annual revenue increased from $245 billion in 2019 to $434 billion in 2022.

When the Amazon.GenAI.NET application is first launched, it will retrieve a list of all the knowledge bases configured in your AWS account. This list of knowledge bases will then be displayed in a dropdown menu for the user to select from, as shown in Figure 8.

Figure 8: Launching the application

The RetrieveAndGenerate API returns a sessionId, which is then passed by the .NET app along with the subsequent user prompt as an input to the RetrieveAndGenerate API to continue the conversation in the same session. The RetrieveAndGenerate API manages the short-term memory and uses the chat history as long as the same sessionId is passed as an input in the successive calls.

Figure 9: Knowledgebase Playground
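The session handling described above can be sketched as follows. This is an illustrative outline rather than the sample application's code: the helper names BuildTurn and AskAsync are hypothetical, and the usage lines are commented out because they need AWS credentials, a knowledge base ID, and a model ARN or ID from your own account.

```csharp
using System.Threading.Tasks;
using Amazon.BedrockAgentRuntime;
using Amazon.BedrockAgentRuntime.Model;

// Builds the request for one conversation turn. Pass null as sessionId on the first turn.
static RetrieveAndGenerateRequest BuildTurn(
    string knowledgeBaseId, string modelArn, string question, string? sessionId) =>
    new RetrieveAndGenerateRequest
    {
        SessionId = sessionId,
        Input = new RetrieveAndGenerateInput { Text = question },
        RetrieveAndGenerateConfiguration = new RetrieveAndGenerateConfiguration
        {
            Type = RetrieveAndGenerateType.KNOWLEDGE_BASE,
            KnowledgeBaseConfiguration = new KnowledgeBaseRetrieveAndGenerateConfiguration
            {
                KnowledgeBaseId = knowledgeBaseId,
                ModelArn = modelArn
            }
        }
    };

// Sends one turn and returns the answer text together with the session ID to reuse next time.
static async Task<(string Answer, string SessionId)> AskAsync(
    IAmazonBedrockAgentRuntime client, RetrieveAndGenerateRequest turn)
{
    var response = await client.RetrieveAndGenerateAsync(turn);
    return (response.Output.Text, response.SessionId);
}

// Usage (kbId and modelArn are placeholders you must supply):
// var client = new AmazonBedrockAgentRuntimeClient();
// var (first, sessionId) = await AskAsync(client, BuildTurn(kbId, modelArn, "Which was the first inference chip launched by AWS?", null));
// var (followUp, _) = await AskAsync(client, BuildTurn(kbId, modelArn, "In what year was it launched?", sessionId));
```

Passing the returned session ID into the second call is what lets the follow-up question rely on the context of the first one.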

Pricing

You are charged only for the models and vector databases you use while building RAG applications using Agents and Knowledge Bases for Amazon Bedrock. For more information on model pricing, refer to Amazon Bedrock Pricing.

Cleanup

If you followed along and tried our sample application in your own AWS account, it is important to clean up the resources that you created to stop incurring charges.

To delete the knowledge base that was created, follow the instructions at Delete an Amazon Bedrock knowledge base provided in the user guide. Based on the data deletion policy you selected for your vector data store while creating the knowledge base, you may need to delete the vector index containing the data embeddings separately.

Make sure the data source for the knowledge base, the Amazon S3 bucket, is deleted. If not, follow the instructions at Deleting a bucket in the S3 user guide to delete it.

Conclusion

In this post, you learned how to create a RAG virtual assistant application for .NET using Amazon Bedrock Knowledge Bases. Although the process behind training and creating large language models (LLMs) is complex, adopting them to create value in your business is not. Amazon Bedrock is a fully managed service that offers access to high-performing foundation models (FMs) with easy-to-use APIs that help you build generative AI applications. To further optimize and enhance the responses from these FMs, you can use Amazon Bedrock capabilities such as Knowledge Bases for Amazon Bedrock, which helps you build custom generative AI applications using authoritative private knowledge sources. You can further customize your RAG workflows, for example by using custom chunking strategies and vector stores. Learn more at Knowledge Bases now delivers fully managed RAG experience in Amazon Bedrock.

We encourage you to use the code and the guidelines provided in this post to get hands-on experience building generative AI RAG applications using .NET and Amazon Bedrock. There are development resources available to help you skill up on generative AI technology. If you want to connect with an AWS AI/ML specialist and learn how our AI/ML services can help your business, please contact us.

.NET on AWS Blog