This Guidance helps game developers automate the creation of non-player characters (NPCs) that dynamically respond to player questions based on custom in-game knowledge and personality. This deployable template sets up API access to large language models (LLMs) in Amazon Bedrock, enabling a game client to connect and prompt the models with player input. Any of the available models in Amazon Bedrock, such as Amazon Nova, Claude, and Llama, can be used to generate dynamic responses from the NPC, enhancing scripted dialogue and creating unique player interactions. At the same time, it ensures that the NPC has secure access to a comprehensive knowledge base of game lore. This Guidance includes sample architecture code and an engine integration example for quick and effective deployment.
Architecture Diagram

Overview
This architecture diagram shows an overview of the workflow for hosting a generative AI NPC on AWS.
Step 1
Game clients interact with the NPC running on Unreal Engine.
Step 2
Requests for generated text responses from the NPC are sent to a Text API Amazon API Gateway endpoint. Requests that require game-specific context from the NPC are sent to a retrieval-augmented generation (RAG) API Gateway endpoint.
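For illustration, a game client might call these endpoints as in the following sketch. The URLs and the prompt field are placeholders for this example, not the deployed API's actual contract.

```python
import requests  # third-party HTTP client, used here only for illustration

# Placeholder URLs; the real values come from the deployed API Gateway stages.
TEXT_API = "https://example.execute-api.us-east-1.amazonaws.com/prod/text"
RAG_API = "https://example.execute-api.us-east-1.amazonaws.com/prod/rag"

def ask_npc(player_input: str, needs_lore: bool = False) -> dict:
    """Route lore-dependent questions to the RAG endpoint, others to the Text API."""
    url = RAG_API if needs_lore else TEXT_API
    response = requests.post(url, json={"prompt": player_input}, timeout=30)
    response.raise_for_status()
    return response.json()
```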
Step 3
AWS Lambda handles the NPC text requests and sends them to LLMs hosted on Amazon Bedrock.
Step 4
Base LLMs and LLMs customized through fine-tuning provide a generated text response.
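A minimal sketch of steps 3 and 4, assuming a Python Lambda handler behind the Text API and the Amazon Bedrock Converse API; the model ID and system prompt are illustrative choices:

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime")

def handler(event, context):
    """Text API Lambda: forward the player's prompt to a Bedrock-hosted LLM."""
    prompt = json.loads(event["body"])["prompt"]
    response = bedrock.converse(
        modelId="amazon.nova-lite-v1:0",  # any Bedrock text model can be used
        system=[{"text": "You are a village blacksmith NPC. Stay in character."}],
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    reply = response["output"]["message"]["content"][0]["text"]
    return {"statusCode": 200, "body": json.dumps({"reply": reply})}
```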
Step 5
The generated text response is sent to Amazon Polly, which returns an audio stream of the response. The audio is returned to the NPC and played back as synthesized speech.
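Within the same Lambda flow, step 5 could be handled along these lines; the voice and output format are example choices, not requirements of the Guidance:

```python
import boto3

polly = boto3.client("polly")

def synthesize(reply_text: str) -> bytes:
    """Turn the generated NPC reply into an audio stream for the game client."""
    response = polly.synthesize_speech(
        Text=reply_text,
        OutputFormat="ogg_vorbis",  # mp3 and pcm are also supported
        VoiceId="Matthew",          # example voice
        Engine="neural",
    )
    return response["AudioStream"].read()  # raw audio bytes sent back to the NPC
```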
Step 6
For RAG NPC requests, Lambda submits the request to Amazon Bedrock to generate a vectorized representation using the embeddings model. Lambda then searches for relevant information in an Amazon OpenSearch Service vector index.
Step 7
OpenSearch Service performs a similarity search, returning relevant context that augments the text generation request based on the vectorized representation received from Amazon Bedrock.
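A sketch of steps 6 and 7, assuming the Amazon Titan text embeddings model and an OpenSearch index with a knn_vector field named embedding; the endpoint and index name are placeholders, and authentication is omitted for brevity:

```python
import json

import boto3
from opensearchpy import OpenSearch  # from the opensearch-py package

bedrock = boto3.client("bedrock-runtime")
search = OpenSearch(  # placeholder endpoint; real code also needs SigV4 auth
    hosts=[{"host": "vectors.example.com", "port": 443}], use_ssl=True
)

def retrieve_context(question: str, k: int = 3) -> list[str]:
    # Step 6: vectorize the request with the Titan embeddings model.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": question}),
    )
    vector = json.loads(resp["body"].read())["embedding"]
    # Step 7: k-NN similarity search against the lore index.
    hits = search.search(
        index="game-lore",  # assumed index name
        body={"size": k, "query": {"knn": {"embedding": {"vector": vector, "k": k}}}},
    )
    return [hit["_source"]["text"] for hit in hits["hits"]["hits"]]
```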
Step 8
The relevant context and original text request are sent to LLMs hosted on Amazon Bedrock to provide a generated text response. Amazon Polly then delivers the response as synthesized speech to the NPC.
Step 9
Game narrative writers add game-specific training data to create custom models using the FMOps process or add game lore data to hydrate the vector database.
Step 10
Infrastructure and DevOps engineers manage the architecture as code using the AWS Cloud Development Kit (AWS CDK) and monitor the Guidance using Amazon CloudWatch.
LLMOps Pipeline
This architecture diagram shows the process of deploying an LLMOps pipeline on AWS.
Step 1
Infrastructure engineers build and test the codified infrastructure using AWS CDK.
Step 2
Updates to infrastructure code are committed to the Git repository, invoking the continuous integration and continuous deployment (CI/CD) pipeline within the Toolchain AWS account.
Step 3
Infrastructure assets, such as Docker containers and AWS CloudFormation templates, are compiled and stored in Amazon Elastic Container Registry (Amazon ECR) and Amazon Simple Storage Service (Amazon S3).
Step 4
The infrastructure is deployed to the quality assurance (QA) AWS account as a CloudFormation stack for integration and system testing.
Step 5
AWS CodeBuild initiates automated testing scripts that verify that the architecture is functional and ready for production deployment.
Step 6
Upon successful completion of all system tests, the infrastructure is automatically deployed as a CloudFormation stack into the Production (PROD) AWS account.
Step 7
The FMOps pipeline resources are also deployed as a CloudFormation stack into the PROD AWS account.
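As a rough sketch of how this pipeline could be codified with the AWS CDK in Python (the repository, connection ARN, stack contents, and test command are placeholders; the actual Guidance code will differ):

```python
from aws_cdk import Stack, Stage
from aws_cdk.pipelines import CodePipeline, CodePipelineSource, ShellStep
from constructs import Construct

class NpcBackendStack(Stack):
    # Placeholder: the Guidance's API Gateway, Lambda, and Bedrock resources
    # would be defined here.
    pass

class GuidanceStage(Stage):
    """Wraps the Guidance stacks so they deploy as one CloudFormation unit."""
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        NpcBackendStack(self, "NpcBackend")

class ToolchainStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        pipeline = CodePipeline(
            self, "Pipeline",
            synth=ShellStep(
                "Synth",
                # Repository and connection ARN are placeholders.
                input=CodePipelineSource.connection(
                    "my-org/npc-guidance", "main",
                    connection_arn="arn:aws:codestar-connections:region:account:connection/id",
                ),
                commands=["pip install -r requirements.txt", "npx cdk synth"],
            ),
        )
        qa = pipeline.add_stage(GuidanceStage(self, "QA"))
        # CodeBuild runs the system tests that gate promotion (step 5).
        qa.add_post(ShellStep("SystemTests", commands=["pytest tests/system"]))
        pipeline.add_stage(GuidanceStage(self, "Prod"))  # step 6
```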
FMOps Pipeline
This architecture diagram shows the process of tuning a generative AI model using FMOps.
Step 1
Game lore text documents are uploaded to an S3 bucket.
Step 2
The document object upload event invokes Amazon SageMaker Pipelines.
Step 3
The preprocessing step runs a SageMaker processing job to preprocess the text documents for model fine-tuning and model evaluation.
Step 4
The callback step allows SageMaker Pipelines to integrate with other AWS services by sending a message to an Amazon Simple Queue Service (Amazon SQS) queue. After sending the message, SageMaker Pipelines waits for a response from the queue.
Step 5
Amazon SQS manages the message queue that coordinates tasks between SageMaker Pipelines and the AWS Step Functions workflow.
Step 6
The Step Functions workflow orchestrates the process of fine-tuning the LLM. Once a model has been fine-tuned, Amazon SQS sends a success message back to the SageMaker Pipelines callback step.
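One plausible wiring for steps 4 through 6 uses two Lambda functions: one triggered by the SQS queue to start the Step Functions workflow, and one invoked as the workflow's final task to release the waiting callback step. The message fields and state machine ARN are assumptions for this sketch:

```python
import json

import boto3

sagemaker = boto3.client("sagemaker")
sfn = boto3.client("stepfunctions")

STATE_MACHINE_ARN = "arn:aws:states:us-east-1:111111111111:stateMachine:FineTuneLlm"  # assumed

def on_queue_message(event, context):
    """Triggered by the SQS queue that the callback step writes to (step 5)."""
    for record in event["Records"]:
        body = json.loads(record["body"])  # assumed payload: token + arguments
        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps({
                "callbackToken": body["token"],
                "arguments": body.get("arguments", {}),
            }),
        )

def on_workflow_complete(event, context):
    """Final Step Functions task (step 6): unblock the waiting callback step."""
    token = event["callbackToken"]
    if event.get("status") == "SUCCEEDED":
        sagemaker.send_pipeline_execution_step_success(
            CallbackToken=token,
            OutputParameters=[{"Name": "TunedModelArn", "Value": event["modelArn"]}],
        )
    else:
        sagemaker.send_pipeline_execution_step_failure(
            CallbackToken=token,
            FailureReason=event.get("error", "Fine-tuning workflow failed"),
        )
```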
Step 7
The model evaluation step runs a SageMaker processing job to evaluate the fine-tuned model’s performance. The tuned model is stored in the Amazon SageMaker Model Registry.
Step 8
Machine learning (ML) practitioners review the tuned model and approve it for production use.
Step 9
An AWS CodePipeline workflow is invoked to deploy the approved model into production.
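Steps 8 and 9 hinge on the model package approval status in the SageMaker Model Registry. A reviewer could approve the latest pending version as sketched below (the model package group name is assumed); changing the status to Approved is typically what triggers the CodePipeline deployment:

```python
import boto3

sagemaker = boto3.client("sagemaker")

# List candidate versions awaiting review; the group name is an assumption.
packages = sagemaker.list_model_packages(
    ModelPackageGroupName="npc-llm",
    ModelApprovalStatus="PendingManualApproval",
    SortBy="CreationTime",
    SortOrder="Descending",
)
latest = packages["ModelPackageSummaryList"][0]["ModelPackageArn"]

# Approving the package is the signal the deployment pipeline listens for.
sagemaker.update_model_package(
    ModelPackageArn=latest,
    ModelApprovalStatus="Approved",
    ApprovalDescription="Passed offline evaluation review",
)
```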
Database Hydration
This architecture diagram shows the process for database hydration by vectorizing and storing game lore for RAG.
Step 1
A data scientist uploads game lore text documents to an S3 bucket.
Step 2
The object upload invokes a Lambda function to launch a SageMaker processing job.
Step 3
A SageMaker processing job downloads the text document from Amazon S3 and splits the text into multiple chunks.
Step 4
The SageMaker processing job then submits each chunk of text to an Amazon Titan embeddings model hosted on Amazon Bedrock to create a vectorized representation of the text chunks.
Step 5
The SageMaker processing job then ingests both the text chunk and the vector representation into OpenSearch Service for RAG.
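A compact sketch of steps 3 through 5, assuming fixed-size character chunking, the Amazon Titan text embeddings model, and an OpenSearch index that already has a knn_vector mapping; the endpoint and index name are placeholders, and authentication is omitted:

```python
import json

import boto3
from opensearchpy import OpenSearch  # from the opensearch-py package

bedrock = boto3.client("bedrock-runtime")
s3 = boto3.client("s3")
search = OpenSearch(  # placeholder endpoint; real code also needs SigV4 auth
    hosts=[{"host": "vectors.example.com", "port": 443}], use_ssl=True
)

def embed(text: str) -> list[float]:
    """Step 4: create a vector representation with the Titan embeddings model."""
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def hydrate(bucket: str, key: str, chunk_size: int = 1000) -> None:
    """Steps 3-5: download, chunk, vectorize, and ingest a lore document."""
    text = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    for i, chunk in enumerate(chunks):
        search.index(
            index="game-lore",  # assumed index with a knn_vector mapping
            id=f"{key}-{i}",
            body={"text": chunk, "embedding": embed(chunk)},
        )
```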
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance uses Lambda, API Gateway, and CloudWatch to track all API requests for generated NPC dialogue between the Unreal Engine client and the Amazon Bedrock foundation model. This provides end-to-end visibility into the status of the Guidance, allowing you to granularly track each request and response from the game client so you can quickly identify issues and react accordingly. Additionally, this Guidance is codified as a CDK application using CodePipeline, so operations teams and developers can address faults and bugs through appropriate change control methodologies and quickly deploy these updates or fixes using the CI/CD pipeline.
Security
Amazon S3 provides encrypted protection for storing game lore documentation at rest, in addition to encrypted access for data in transit, while ingesting game lore documentation into the vector database or fine-tuning an Amazon Bedrock foundation model. API Gateway adds an additional layer of security between the Unreal Engine and the Amazon Bedrock foundation model by providing TLS-based encryption of all data between the NPC and the model. Lastly, Amazon Bedrock implements automated abuse detection mechanisms to further identify and mitigate violations of the AWS Acceptable Use Policy and the AWS Responsible AI Policy.
Reliability
API Gateway manages the automated scaling and throttling of requests by the NPC to the foundation model. Additionally, since the entire infrastructure is codified using CI/CD pipelines, you can provision resources across multiple AWS accounts and multiple AWS Regions in parallel. This enables multiple simultaneous infrastructure re-deployment scenarios to help you overcome AWS Region-level failures. As serverless infrastructure resources, API Gateway and Lambda allow you to focus on game development instead of manually managing resource allocation and usage patterns for API requests.
Performance Efficiency
Serverless resources, such as Lambda and API Gateway, contribute to the efficiency of the Guidance by providing both elasticity and scalability. This allows the Guidance to dynamically adapt to an increase or decrease in API calls from the NPC client. An elastic and scalable approach helps you right-size resources for optimal performance and to address unforeseen increases or decreases in API requests—without having to manually manage provisioned infrastructure resources.
Cost Optimization
Codifying the Guidance as a CDK application provides game developers with the ability to quickly prototype and deploy their NPC characters into production. Amazon Bedrock gives developers quick access to hosted foundation models through an API Gateway REST API without having to engineer, build, and pre-train them. Turning around quick prototypes helps reduce the time and operational costs associated with building foundation models from scratch.
Sustainability
Lambda provides a serverless, scalable, and event-driven approach without having to provision dedicated compute resources. Amazon S3 implements data lifecycle policies along with compression for all data across this Guidance, allowing for energy-efficient storage. Amazon Bedrock hosts foundation models on AWS silicon, offering better performance per watt than standard compute resources.
Implementation Resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.