Implement an analytics pipeline for games


Written by: Gena Gizzi, Greg Cheng, and Dominic Mills

Games are generating more data than ever. So, it’s important to have access to the right data at the right time as you develop your games. This enables you to answer questions about how your games are performing and determine what changes you want to make to keep players engaged.

Studios like Metalhead Software run analytics on AWS to gain insights that help teams make fast, well-informed decisions for a variety of use cases.

Player engagement: Analytics highlight areas where game design could be improved, helping you create more engaging games. Instrumenting your game to emit game events enables you to analyze the event data and reveal how your games are being played. Then, you can use that information to help enhance your design.

Monetization: The game industry is increasingly adopting the games-as-a-service operating model. With this model, recurring revenue is frequently generated through in-app purchases, subscriptions, advertising, and other techniques. To understand the features players are willing to pay for, it’s helpful to know which elements of your game draw players in and keep them returning. With this information, you can encourage purchases, serve targeted ads, and offer rewarded videos.

Fraud and player investigation: Fraudulent behavior and in-game cheating can be disruptive to the game experience, making it less enjoyable for your players. Having a plan helps you react quickly—so you can avoid game disruptions and reduce negative impacts on your game. With analytics, you can detect and prevent cheating, investigate fraud, and understand player complaints.

Performance and error reporting: Identifying peak usage times using metrics, like CPU and memory utilization, enables you to scale infrastructure accordingly. You can also analyze error trends using log analytics to help detect and troubleshoot errors.

While having access to analytics is important, there are some challenges unique to the game industry. Because games generate so much data, it’s important to understand what data to collect and how to collect it. Also, traditional analytics solutions are complex to manage and scale. Depending on the size and maturity of your game studio, you may not have dedicated resources to manage this. Another challenge is that many managed service options keep different sources of data completely separate. This makes it harder to get insights across all of your data. Managed services also often lack flexibility. For example, your unique game data may not be able to fit into a standard message format or match specific tags or message types.

However, we offer a solution that addresses these challenges—the AWS Game Analytics Pipeline solution. The Game Analytics Pipeline solution helps game developers launch a scalable analytics pipeline to ingest, store, analyze, and visualize telemetry data generated from games and services. This serverless solution gives you time to focus on getting insights and expanding solution functionality instead of managing analytics infrastructure. The solution is an infrastructure as code (IaC) tool that you can quickly deploy with AWS CloudFormation. Once it’s deployed, you can ingest and durably store data at a massive scale. Then, you can analyze that data with the flexibility to choose any analysis tools you want.

The following is an architecture diagram of the solution:


Figure 1: Architecture diagram of the AWS Game Analytics Pipeline solution

The provided AWS CloudFormation template deploys AWS resources that ingest game data from your data producers, which include game clients, game servers, and other applications. The streaming data is ingested into Amazon Simple Storage Service (Amazon S3) for data lake integration and interactive analytics. Streaming analytics allows you to process real-time events and generate metrics. Data consumers analyze metrics in Amazon CloudWatch and raw events in Amazon S3. The template deploys the following essential building blocks for this solution:

Solution API and configuration data: Amazon API Gateway provides REST API endpoints for registering game applications with the solution, ingesting game telemetry data, and sending events to Amazon Kinesis Data Streams (KDS). Amazon DynamoDB stores game application configurations and API keys to use when sending events to the solution API.
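As a rough illustration of how a data producer might talk to the solution API, the following Python sketch wraps game data in a small event envelope and builds an HTTP request for a batch of events. The envelope field names, the URL route, and the use of an Authorization header for the API key are assumptions made for this example; check the deployed API and the implementation guide for the actual contract.

```python
import json
import time
import uuid
import urllib.request


def make_event(event_type, event_data, app_version="1.0.0"):
    """Wrap game-specific data in an envelope (field names are illustrative)."""
    return {
        "event_id": str(uuid.uuid4()),
        "event_type": event_type,
        "event_timestamp": int(time.time()),
        "app_version": app_version,
        "event_data": event_data,
    }


def build_request(api_base_url, application_id, api_key, events):
    """Build, but do not send, a POST request carrying a batch of events."""
    # Hypothetical route and auth header; the real API may differ.
    url = f"{api_base_url}/applications/{application_id}/events"
    body = json.dumps({"events": events}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": api_key},
    )


event = make_event("level_completed", {"level_id": "forest_01", "duration_s": 84})
request = build_request(
    "https://example.execute-api.us-east-1.amazonaws.com/live",  # placeholder URL
    "my-app-id",
    "my-api-key",
    [event],
)
# Passing `request` to urllib.request.urlopen would send the batch.
```

Sending events in batches rather than one request per event keeps the number of API calls down, which matters for game clients on mobile networks.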

Event streaming: KDS captures streaming data from your game and enables real-time data processing by Amazon Kinesis Data Firehose and Amazon Kinesis Data Analytics.

Streaming analytics: Kinesis Data Analytics analyzes streaming event data from KDS to generate custom metrics. The custom metrics outputs are processed using AWS Lambda and published to Amazon CloudWatch.
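The Lambda step described above can be sketched as follows. This is not the solution's own function; it is a minimal example that assumes each output record from Kinesis Data Analytics is base64-encoded JSON with hypothetical fields metric_name, metric_value, and dimensions. The namespace is also a placeholder. The CloudWatch client is passed in as a parameter (in a real function it would typically be boto3.client("cloudwatch")) so that the logic can be exercised without AWS access.

```python
import base64
import json

NAMESPACE = "GameAnalytics/Example"  # hypothetical namespace


def to_metric_datum(metric):
    """Convert one decoded metric record into a CloudWatch MetricData entry."""
    dimensions = metric.get("dimensions", {})
    return {
        "MetricName": metric["metric_name"],
        "Value": float(metric["metric_value"]),
        "Unit": "Count",
        "Dimensions": [
            {"Name": name, "Value": str(value)}
            for name, value in sorted(dimensions.items())
        ],
    }


def handle(event, cloudwatch):
    """Publish each output record as a custom metric and report per-record status."""
    results = []
    for record in event["records"]:
        try:
            metric = json.loads(base64.b64decode(record["data"]))
            cloudwatch.put_metric_data(
                Namespace=NAMESPACE, MetricData=[to_metric_datum(metric)]
            )
            results.append({"recordId": record["recordId"], "result": "Ok"})
        except Exception:
            # Report the failure so the record is not silently treated as delivered.
            results.append({"recordId": record["recordId"], "result": "DeliveryFailed"})
    return {"records": results}
```

Publishing one metric per call keeps the sketch short; a production function would more likely group several entries into each put_metric_data call.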

Metrics and notifications: CloudWatch monitors, logs, and generates alarms for your AWS resources and creates an operational dashboard. It also provides metrics storage for custom metrics generated by Kinesis Data Analytics. Amazon Simple Notification Service (Amazon SNS) delivers notifications to solution administrators and other data consumers when CloudWatch alarms are breached.

Streaming ingestion: Kinesis Data Firehose consumes data from KDS. Kinesis Data Firehose then invokes AWS Lambda with batches of events for serverless data processing and transformation before the data is delivered to Amazon S3.
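To make the transformation step more concrete, here is a minimal sketch of a transformation function for Kinesis Data Firehose. It follows the documented record contract (each incoming record has a recordId and base64-encoded data, and each returned record carries a result of Ok, Dropped, or ProcessingFailed). The validation rule and the event_type field are assumptions chosen for illustration, not the logic the solution ships with.

```python
import base64
import json


def transform(event):
    """Validate and normalize a batch of Firehose records."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
        except ValueError:
            # Malformed base64 or JSON: flag the record as failed.
            output.append(
                {"recordId": record_id, "result": "ProcessingFailed", "data": record["data"]}
            )
            continue

        if not isinstance(payload, dict) or "event_type" not in payload:
            # Hypothetical rule: drop events that lack an event_type field.
            output.append(
                {"recordId": record_id, "result": "Dropped", "data": record["data"]}
            )
            continue

        payload["event_type"] = str(payload["event_type"]).lower()
        # Newline-delimited JSON keeps one event per line in the delivered object.
        line = json.dumps(payload) + "\n"
        output.append(
            {
                "recordId": record_id,
                "result": "Ok",
                "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
            }
        )
    return {"records": output}


def lambda_handler(event, context):
    return transform(event)
```

Every incoming recordId appears exactly once in the response, which is what Kinesis Data Firehose expects from a transformation function.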

Data lake integration and ETL: Amazon S3 provides storage for raw and processed data. AWS Glue provides extract, transform, and load (ETL) processing workflows and metadata storage in the AWS Glue Data Catalog, which provides the basis for a data lake for integration with flexible analytics tools.

Interactive analytics: Amazon Athena sample queries are deployed to provide analysis of game events. You can easily integrate these queries with Amazon QuickSight for reporting and visualization insights.

When planning the integration and deployment of the Game Analytics Pipeline solution, there are several things you should consider:

Streaming analytics: You can choose to disable streaming analytics to simplify the solution and reduce costs. You may choose to do this if you don’t need to react to player behavior in real time and only need batch processing and reporting for historical data. This setting can be toggled during the deployment of the AWS CloudFormation template.

Kinesis shard count: Depending on the expected throughput of data ingested into the pipeline, you can modify the number of KDS shards that are initially deployed with the solution. If your requirements change after the deployment, you can modify the shard count in the Kinesis Data Streams console or via the AWS Command Line Interface (CLI).
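One way to arrive at a starting shard count is a back-of-the-envelope estimate from expected throughput. The sketch below assumes the per-shard write limits of roughly 1 MB per second and 1,000 records per second; the 25% headroom factor and the function name are choices made for this example, not recommendations from the solution.

```python
import math

# Per-shard write limits for Kinesis Data Streams (byte limit treated conservatively).
MAX_BYTES_PER_SHARD_PER_SEC = 1_000_000
MAX_RECORDS_PER_SHARD_PER_SEC = 1_000


def estimate_shards(events_per_second, avg_event_bytes, headroom=1.25):
    """Estimate the shards needed for a given peak write load.

    headroom is an illustrative safety factor for traffic spikes.
    """
    by_bytes = events_per_second * avg_event_bytes / MAX_BYTES_PER_SHARD_PER_SEC
    by_records = events_per_second / MAX_RECORDS_PER_SHARD_PER_SEC
    return max(1, math.ceil(max(by_bytes, by_records) * headroom))


# 5,000 events/s at 600 bytes each is limited by record count, not bytes.
print(estimate_shards(5000, 600))  # 7
```

Small game events tend to hit the record-count limit before the byte limit, so both limits are checked and the larger requirement wins. If the estimate changes after deployment, the Kinesis UpdateShardCount operation is what resizes the stream.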

Data ingestion method: The solution ingests game event data either by submitting data directly to KDS or by sending requests to the solution API, which forwards the events to KDS. The REST API is the point of entry for applications that may require custom REST API proxy integration. KDS is well suited for the majority of use cases because it provides several integration options for data producers to publish data directly to streams. For more information, see Writing Data to Amazon Kinesis Data Streams in the Amazon Kinesis Data Streams Developer Guide.
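For producers that write directly to KDS, a backend service might batch events along these lines. This is a sketch under assumptions: events are dictionaries with an event_id field (a hypothetical name) that is reused as the partition key, and the kinesis argument is any object exposing a boto3-style put_records method, such as boto3.client("kinesis"). The example respects the PutRecords limit of 500 records per call but does not retry failed records or check the per-request size limit.

```python
import json

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_entries(events, partition_field="event_id"):
    """Convert event dictionaries into PutRecords entries."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),
            # A high-cardinality partition key spreads records across shards.
            "PartitionKey": str(event[partition_field]),
        }
        for event in events
    ]


def put_events(kinesis, stream_name, events):
    """Send events in batches and return how many records the service rejected."""
    entries = to_entries(events)
    failed = 0
    for start in range(0, len(entries), MAX_RECORDS_PER_CALL):
        batch = entries[start:start + MAX_RECORDS_PER_CALL]
        response = kinesis.put_records(StreamName=stream_name, Records=batch)
        failed += response.get("FailedRecordCount", 0)
    return failed
```

A nonzero return value means some records were not stored, for example because of throttling, so a real producer would resend those records with backoff.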

Regional deployment: This solution uses Kinesis Data Analytics, Kinesis Data Firehose, AWS Glue, Amazon Athena, and Amazon QuickSight, which are currently available in specific AWS Regions only. Therefore, you must launch this solution in an AWS Region where these services are available. See the AWS Regional Table for the most current service availability by AWS Region.

To learn more about analytics for games and building games on AWS, check out the Game Tech Learning Path. You can also access our session on making faster and smarter decisions with serverless data analytics, which is available in the on-demand 2020 Digital Download online event. For interactive, hands-on training, try the Serverless Analytics for Games workshop. Then, once you’re ready to deploy and integrate your own Game Analytics Pipeline into a game, follow along with the AWS Game Analytics Pipeline implementation guide.

About the Authors

Gena Gizzi is a Games Solutions Architect for AWS located in Southern California. She helps games customers build, launch, and scale their games and businesses on AWS. She has a focus on analytics for games and helps customers gain insights from their data. Some of Gena’s favorite games include Breath of the Wild, Pokémon, and Minecraft.

Greg Cheng is a Solutions Architect at AWS where he helps customers navigate the AWS platform throughout their cloud journey by providing well-architected best practices and guidance. He is focused on analytics and serverless technologies, and his goal is to enable games customers to provide seamless multiplayer experiences with low latency and no downtime. In his spare time, he enjoys playing Super Smash Bros Melee, competitive online multiplayer games, boardgames, cooking, and reading.

Dominic Mills is a Games Solutions Architect at AWS. He helps video game developers of all sizes utilize AWS to develop, build and deploy games, with a particular focus on analytics and game production in the cloud. He has a deep obsession with dungeon crawlers and any game with roguelite elements.

AWS for Games Blog