AWS Database Blog

Use LangChain and vector search on Amazon DocumentDB to build a generative AI chatbot

Amazon DocumentDB (with MongoDB compatibility) offers benefits to customers building modern applications across multiple domains, including healthcare, gaming, and finance. As a fully managed document database, it can improve user experiences through flexibility, scalability, high performance, and advanced functionality. Enterprises that use the JSON data model supported by Amazon DocumentDB can develop applications faster and read semi-structured data more efficiently.

By some estimates, unstructured data can make up 80–90% of all new enterprise data, and it is growing many times faster than structured data. This trend is accelerating with new opportunities in generative AI: AWS customers are increasingly asking how they can take advantage of it and apply it to their wealth of data. Many are looking to use vector database engines to implement recommendation engines, search rich media, or retrieve documents more relevant to a customer's query by matching the query's context and semantics.

Until recent feature releases, such as support for vector search on Amazon DocumentDB, enterprises that wanted to integrate vector search into their workloads had to increase the cost and complexity of their architecture by moving data out of their managed database service and into a separate vector database engine or service. That architectural change also forces code changes, because the application must store and retrieve its embeddings in a different location than its documents.

Enabling semantic search within your existing Amazon DocumentDB clusters makes it straightforward to build modern machine learning (ML) and generative artificial intelligence (AI) applications. You can use vector search on Amazon DocumentDB with AWS ML services, such as Amazon Bedrock and Amazon SageMaker, and with third-party services, such as OpenAI, Hugging Face, and LangChain. You can now store your vectors alongside your original documents. This feature was further enhanced with support for Hierarchical Navigable Small World (HNSW) indexes, which let you run vector similarity searches with low latency and produce highly relevant results.

In this post, we show you how to get started with an example: after loading your embeddings into Amazon DocumentDB, we use LangChain to create a chatbot that queries a large language model (LLM).

Retrieval Augmented Generation with LLMs

You can use Retrieval Augmented Generation (RAG) to retrieve data from outside a foundation model and augment your prompts by adding the relevant retrieved data in context. This is useful for building chatbots that deliver a conversational experience to end users through intelligent agents. The approach provides an intuitive interface: it retrieves the information most relevant to the user's request from the enterprise knowledge base or content, bundles that information as context along with the user's request into a prompt, and sends the prompt to the LLM to get a generative AI response.

In this example, we use the LangChain PyPDFDirectoryLoader to ingest the PDF version of the Amazon DocumentDB Developer Guide. From it, we create a chatbot that we can interact with to ask questions about the service's features, usage, and best practices. You can find the full solution in the amazon-documentdb-samples GitHub repository.
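The loading and chunking steps look similar to the following minimal sketch; the directory path, chunk size, and overlap are illustrative assumptions rather than values from the sample repository:

from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load every PDF in the directory; each page becomes a LangChain Document
loader = PyPDFDirectoryLoader("data/")  # assumed local path to the Developer Guide PDF
documents = loader.load()

# Split the pages into overlapping chunks sized for the embedding model
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = text_splitter.split_documents(documents)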

Prior to loading the data into Amazon DocumentDB, we create an HNSW index on the collection:

collection.create_index(
    [("vectorContent", "vector")],  # index the field that stores the embeddings
    vectorOptions={
        "type": "hnsw",             # graph-based HNSW index
        "similarity": "euclidean",  # distance metric for similarity scoring
        "dimensions": 1536,         # must match the embedding model's output size
        "m": 16,                    # maximum connections per graph node
        "efConstruction": 64},      # candidate list size while building the index
    name="hnsw")

Unlike IVFFlat, HNSW has no training step, which allows the index to be created before an initial data load.
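For comparison, a minimal sketch of an equivalent IVFFlat index definition follows; the lists value is an illustrative assumption that you would tune to your data volume. Because IVFFlat builds its clusters from existing vectors, you would typically create this index after the collection is populated:

collection.create_index(
    [("vectorContent", "vector")],
    vectorOptions={
        "type": "ivfflat",          # inverted-file index; trained on existing data
        "similarity": "euclidean",
        "dimensions": 1536,
        "lists": 100},              # number of clusters; illustrative value
    name="ivfflat")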

After using LangChain’s RecursiveCharacterTextSplitter to break the Developer Guide into chunks, we use the Amazon Titan Embeddings G1 – Text model (amazon.titan-embed-text-v1) to create the embeddings:

from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.vectorstores.documentdb import DocumentDBVectorSearch

# bedrock_client is the boto3 Amazon Bedrock runtime client created earlier
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=bedrock_client)

INDEX_NAME = "hnsw"  # name of the HNSW index created earlier
vector_store = DocumentDBVectorSearch.from_documents(
    documents=docs,
    embedding=embeddings,
    collection=collection,
    index_name=INDEX_NAME,
)

Finally, we initialize Anthropic Claude 3 Sonnet on Amazon Bedrock as our reasoning agent to break down user-requested tasks into multiple steps:

from langchain_community.chat_models import BedrockChat

llm = BedrockChat(model_id="anthropic.claude-3-sonnet-20240229-v1:0", client=bedrock_client)

To test the solution, we create a prompt template and use the RetrievalQA chain to collect the documents most relevant to our question from the HNSW index. These documents are then used to generate an answer grounded in that context, as shown in the following screenshot:

[Screenshot: Jupyter notebook showing the RetrievalQA chain generating an answer from the retrieved documents]
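A minimal sketch of that chain, reusing the vector_store and llm objects from the previous steps, follows; the prompt wording, retriever settings, and sample question are illustrative assumptions:

from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

# Template that injects retrieved chunks as {context} alongside the user's {question}
prompt = PromptTemplate(
    template=(
        "Use the following context from the Amazon DocumentDB Developer Guide "
        "to answer the question.\n\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
    input_variables=["context", "question"],
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # stuff all retrieved chunks into a single prompt
    retriever=vector_store.as_retriever(search_kwargs={"k": 3}),
    chain_type_kwargs={"prompt": prompt},
)

response = qa.invoke({"query": "How do I create an HNSW index in Amazon DocumentDB?"})
print(response["result"])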

You can review the full notebook for an example of a RAG implementation that takes a query, chat history, and context as arguments, as well as for examples of using aggregation pipelines in Amazon DocumentDB to perform your vector search.
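As a sketch of the aggregation pipeline approach, assuming the default LangChain field names (vectorContent for embeddings, textContent for chunk text) and illustrative search parameters:

# Embed the question with the same Titan model used at ingestion time
query_embedding = embeddings.embed_query("How do I create an HNSW index?")

pipeline = [
    {
        "$search": {
            "vectorSearch": {
                "vector": query_embedding,    # 1,536-dimension query vector
                "path": "vectorContent",      # field holding the stored embeddings
                "similarity": "euclidean",    # must match the index definition
                "k": 5,                       # number of nearest neighbors to return
                "efSearch": 40,               # search-time candidate list size (HNSW)
            }
        }
    }
]

for doc in collection.aggregate(pipeline):
    print(doc["textContent"])  # assumed field containing the chunk text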

Summary

Vector search for Amazon DocumentDB combines the flexibility and rich query capability of a JSON-based document database with the power of vector search. Using this feature with frameworks like LangChain, text embedding models like Amazon Titan Text Embeddings, and foundation models like Anthropic's Claude 3 through Amazon Bedrock lets you build ML and generative AI solutions such as semantic search experiences, product recommendations, personalization, chatbots, fraud detection, and anomaly detection.

To get started using vector search with your existing workloads, visit the amazon-documentdb-samples GitHub repository to download the example discussed in this post.


About the Authors

Andrew Chen is an Edtech Solutions Architect with an interest in data analytics, machine learning, and infrastructure virtualization. Andrew has previous experience in management consulting, where he worked as a technical lead for various cloud migration projects. In his free time, Andrew enjoys fishing, hiking, kayaking, and keeping up with financial markets.

Cody Allen is a Principal DocumentDB Specialist Solutions Architect based out of Texas. He is passionate about working side by side with customers to solve complex problems, and about supporting teammates through mentorship and knowledge transfer. He has spent his career deploying and managing systems, software, and infrastructure for B2B SaaS providers, materiel and logistics suppliers, the U.S. Air Force, and other government agencies, both domestic and international.

Inderpreet Singh is a Senior Product Manager (Tech) at Amazon DocumentDB. With over 12 years of experience in business consulting, Inderpreet’s goal is to merge his extensive business background with cutting-edge technology to shape the future of databases. He holds an MBA from IE Business School and an MS from the Massachusetts Institute of Technology. In his free time, Inderpreet enjoys learning new languages, teaching, and stock market trading.