Implement real-time personalized recommendations using Amazon Personalize

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more.

At a basic level, Machine Learning (ML) technology learns from data to make predictions. Businesses use their data with an ML-powered personalization service to elevate their customer experience. This approach allows businesses to use data to derive actionable insights and help grow their revenue and brand loyalty.

Amazon Personalize accelerates your digital transformation with ML, making it easier to integrate personalized recommendations into existing websites, applications, email marketing systems, and more. Amazon Personalize enables developers to quickly implement a customized personalization engine, without requiring ML expertise. Amazon Personalize provisions the necessary infrastructure and manages the entire machine learning (ML) pipeline, including processing the data, identifying features, using the most appropriate algorithms, and training, optimizing, and hosting the models. You receive results through an API and pay only for what you use, with no minimum fees or upfront commitments.

The post Architecting near real-time personalized recommendations with Amazon Personalize shows how to architect near real-time personalized recommendations using Amazon Personalize and AWS purpose-built data services. In this post, we walk you through a reference implementation of a real-time personalized recommendation system using Amazon Personalize.

Solution overview

The real-time personalized recommendations solution is implemented using Amazon Personalize, Amazon Simple Storage Service (Amazon S3), Amazon Kinesis Data Streams, AWS Lambda, and Amazon API Gateway.

The architecture is implemented as follows:

Data preparation – Start by creating a dataset group, schemas, and datasets representing your items, interactions, and user data.
Train the model – After importing your data, select the recipe matching your use case, and then create a solution to train a model by creating a solution version. When your solution version is ready, you can create a campaign for your solution version.
Get near real-time recommendations – When you have a campaign, you can integrate calls to the campaign in your application. This is where calls to the GetRecommendations or GetPersonalizedRanking APIs are made to request near real-time recommendations from Amazon Personalize.

For more information, refer to Architecting near real-time personalized recommendations with Amazon Personalize.

The following diagram illustrates the solution architecture.

Implementation

We demonstrate this implementation with a use case about making real-time movie recommendations to an end user based on their interactions with the movie database over time.

The solution is implemented using the following steps:

Prerequisite (Data preparation)
Setup your development environment
Deploy the solution
Create a solution version
Create a campaign
Create an event tracker
Get recommendations
Ingest real-time interactions
Validate real-time recommendations
Cleanup

Prerequisites

Before you get started, make sure you have the following prerequisites:

Prepare your training data – Prepare and upload the data to an S3 bucket using the instructions. For this particular use case, you will be uploading interactions data and items data. An interaction is an event that you record and then import as training data. Amazon Personalize generates recommendations primarily based on the interactions data you import into an Interactions dataset. You can record multiple event types, such as click, watch, or like. Although the model created by Amazon Personalize can suggest based on a user’s past interactions, the quality of these suggestions can be enhanced when the model possesses data about the associations among users or items . If a user has engaged with movies categorized as Drama in the item dataset, Amazon Personalize will suggest movies (items) with the same genre.
Setup your development environment – Install the AWS Command Line Interface (AWS CLI).
Configure CLI with your Amazon account – Configure the AWS CLI with your AWS account information.
Install and bootstrap AWS Cloud Development Kit (AWS CDK)

Deploy the solution

To deploy the solution, do the following:

Clone the repository to a new folder on your desktop.
Deploy the stack to your AWS environment.

Create a solution version

A solution refers to the combination of an Amazon Personalize recipe, customized parameters, and one or more solution versions (trained models). When you deploy the CDK project in the previous step, a solution with a User-Personalization recipe is created for you automatically. A solution version refers to a trained machine learning model. Create a solution version for the implementation.

Create a campaign

A campaign deploys a solution version (trained model) with a provisioned transaction capacity for generating real-time recommendations. Create a campaign for the implementation.

Create an event tracker

Amazon Personalize can make recommendations based on real-time event data only, historical event data only, or both. Record real-time events to build out your interactions data and allow Amazon Personalize to learn from your user’s most recent activity. This keeps your data fresh and improves the relevance of Amazon Personalize recommendations. Before you can record events, you must create an event tracker. An event tracker directs new event data to the Interactions dataset in your dataset group. Create and event tracker for the implementation.

Get recommendations

In this use case, the interaction dataset is composed of movie IDs. Consequently, the recommendations presented to the user will consist of movie IDs that align most closely with their personal preferences, determined from their historical interactions. You can use the getRecommendations API to retrieve personalized recommendations for a user by sending its associated userID, the number of results for recommendations that you need for the user as well as the campaign ARN. You can find the campaign ARN in the Amazon Personalize console menu.

For example, the following request will retrieve 5 recommendations for the user whose userId is 429:

curl --location 'https://{your-api-id}.execute-api.{your-region}.amazonaws.com/prod/getRecommendations?campaignArn={campaignArn}&userId=429&numResults=5'

The response from the request will be:

{
	"$metadata": {
		"httpStatusCode": 200,
		"requestId": "7159c128-4e16-45a4-9d7e-cf19aa2256e8",
		"attempts": 1,
		"totalRetryDelay": 0
	},
	"itemList": [
	{
		"itemId": "596",
		"score": 0.0243044
	},
	{
		"itemId": "153",
		"score": 0.0151695
	},
	{
		"itemId": "16",
		"score": 0.013694
	},
	{
		"itemId": "261",
		"score": 0.013524
	},
	{
		"itemId": "34",
		"score": 0.0122294
	}
	],
	"recommendationId": "RID-1d-40c1-8d20-dfffbd7b0ac7-CID-06b10f"
}

The items returned by the API call are the movies that Amazon Personalize recommends to the user based on their historical interactions.

The score values provided in this context represent floating-point numbers that range between zero and 1.0. These values correspond to the current campaign and the associated recipes for this use case. They are determined based on the collective scores assigned to all items present in your comprehensive dataset.

Ingest real-time interactions

In the previous example, recommendations were obtained for the user with an ID of 429 based on their historical interactions with the movie database. For real-time recommendations, the user interactions with the items must be ingested into Amazon Personalize in real-time. These interactions are ingested into the recommendation system through the Amazon Personalize Event Tracker. The type of interaction, also called EventType, is given by the column of the same name in the interaction data dataset (EVENT_TYPE). In this example, the events can be of type “watch” or “click”, but you can have your own types of events according to the needs of your application.

In this example, the exposed API that generates the events of the users with the items receives the “interactions” parameter that corresponds to the number of events (interactions) of a user (UserId) with a single element (itemId) right now. The trackingId parameter can be found in the Amazon Personalize console and in the response of the creation of Event Tracker request.

This example shows a putEvent request: Generate 1 interactions of click type, with an item id of ‘185’ for the user id ‘429’, using the current timestamp. Note that in production, the ‘sentAt’ should be set to the time of the user’s interaction. In the following example, we set this to the point in time in epoch time format when we wrote the API request for this post. The events are sent to Amazon Kinesis Data Streams through an API Gateway which is why you need to send the stream-name and PartitionKey parameters.

curl --location 'https://iyxhva3ll6.execute-api.us-west-2.amazonaws.com/prod/data' --header 'Content-Type: application/json' --data '{ "stream-name": "my-stream","Data": {"userId" : "429", "interactions": 1, "itemId": "185", "trackingId" : "c90ac6d7-3d89-4abc-8a70-9b09c295cbcd", "eventType": "click", "sentAt":"1698711110"},"PartitionKey":"userId"}'

You will receive a confirmation response similar to the following:

{
	"Message": "Event sent successfully",
	"data": {
		"EncryptionType": "KMS",
		"SequenceNumber": "49..........1901314",
		"ShardId": "shardId-xxxxxxx"
	}
}

Validate real-time recommendations

Because the interaction dataset has been updated, the recommendations will be automatically updated to consider the new interactions. To validate the recommendations updated in real-time, you can call the getRecommendations API again for the same user id 429, and the result should be different from the previous one. The following results show a new recommendation with an id of 594 and the recommendations with the id’s of 16, 596, 153and 261 changed their scores. These items brought in new movie genre (‘Animation|Children|Drama|Fantasy|Musical’) the top 5 recommendations.

Request:

curl --location 'https://{your-api-id}.execute-api.{your-region}.amazonaws.com/prod/getRecommendations?campaignArn={campaignArn} &userId=429&numResults=5'

Response:

{
	"$metadata": {
		"httpStatusCode": 200,
		"requestId": "680f2be8-2e64-47d7-96f7-1c4aa9b9ac9d",
		"attempts": 1,
		"totalRetryDelay": 0
	},
	"itemList": [
	{
		"itemId": "596",
		"score": 0.0288085
	},
	{
		"itemId": "16",
		"score": 0.0134173
	},
	{
		"itemId": "594",
		"score": 0.0129357
	},
	{
		"itemId": "153",
		"score": 0.0129337
	},
	{
		"itemId": "261",
		"score": 0.0123728
	}
	],
	"recommendationId": "RID-dc-44f8-a327-482fb9e54921-CID-06b10f"
}

The response shows that the recommendation provided by Amazon Personalize was updated in real-time.

Clean up

To avoid unnecessary charges, clean up the solution implementation by using Cleaning up resources.

Conclusion

In this post, we showed you how to implement a real-time personalized recommendations system using Amazon Personalize. The interactions with Amazon Personalize to ingest real-time interactions and get recommendations were executed through a command line tool called curl but these API calls can be integrated into a business application and derive the same outcome.

To choose a new recipe for your use case, refer to Real-time personalization. To measure the impact of the recommendations made by Amazon Personalize, refer to Measuring impact of recommendations.

About the Authors

Cristian Marquez is a Senior Cloud Application Architect. He has vast experience designing, building, and delivering enterprise-level software, high load and distributed systems and cloud native applications. He has experience in backend and frontend programming languages, as well as system design and implementation of DevOps practices. He actively assists customers build and secure innovative cloud solutions, solving their business problems and achieving their business goals.

Anand Komandooru is a Senior Cloud Architect at AWS. He joined AWS Professional Services organization in 2021 and helps customers build cloud-native applications on AWS cloud. He has over 20 years of experience building software and his favorite Amazon leadership principle is “Leaders are right a lot.“

AWS Machine Learning Blog