Build a Scalable and Efficient Data Analytics Engine for Your Game

“Data is a new form of Gold”

Published on 30th November 2021 by Rushabh Sudame

Introduction

Over the last several years, enterprises have accumulated massive amounts of data. Data volumes have increased at an unprecedented rate, exploding from terabytes to petabytes and sometimes exabytes. Increasingly, enterprises are building highly scalable, available, secure, and flexible data lakes on AWS that can handle extremely large datasets. Once a data lake is productionised, enterprise data teams need tools to extract operational insights from it, so that they can measure its efficacy and communicate gaps or accomplishments to the business groups.

The same applies to gaming data. Since Covid-19, the gaming industry has been booming in terms of Daily Active Users, and games are generating more data than ever, with volumes scaling from terabytes to petabytes. Hence, it’s important to have access to the right data at the right time as you develop your games. This enables you to answer questions about how your games are performing and determine what changes you want to make to keep players engaged.

Some of the key benefits of using data analytics for gaming are as follows:

Player engagement: Analytics highlight areas where game design could be improved, helping you create more engaging games. Instrumenting your game to emit game events enables you to analyse the event data and reveal how your games are being played. Then, you can use that information to help enhance your design.

Monetization: The game industry is increasingly adopting the games-as-a-service operating model. With this model, recurring revenue is frequently generated through in-app purchases, subscriptions, advertising, and other techniques. To understand the features players are willing to pay for, it is helpful to know which elements of your game draw players in and keep them returning. With this information, you can encourage purchases, serve targeted ads, and offer rewarded videos.

While having access to analytics is important, there are some challenges unique to the gaming industry. Because games generate so much data, it’s important to understand what data to collect and how to collect it.

For one of our customers in the gaming industry, Flentas created an end-to-end gaming data analytics solution using AWS Services.

Problem Statement

The customer already had a data analytics solution deployed on Google’s Firebase platform for all analytical requirements. Since Firebase is a standard product, there were obvious limitations with respect to customisation. For complex analytical requirements, some of which are not supported on Firebase, the analytics team had to manually download reports, correlate them with other metrics and data points, and fulfil their requirements using Microsoft Excel formulas. The customer needed a custom analytics solution that could incorporate their ongoing custom requirements, handle a huge amount of data, cater to multiple data sources, scale horizontally to handle data from multiple games in future, and remain cost effective at the same time.

Architecture Diagram

Figure 1: Gaming Data analytics architecture diagram

Solution Summary

Authentication

  1. Mobile devices first authenticate themselves by sending a JWT token to a Lambda function. The function queries the game backend authentication server to check the validity of the token received from the mobile device.
  2. The Lambda function then calculates a hash based on the device ID and retrieves the access tokens stored in AWS Secrets Manager. The access tokens are sent back to the mobile device, which uses them to send data to Kinesis Data Firehose.
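
The authentication flow above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the customer's actual Lambda code: the event fields (jwt, device_id), the secret name, the choice of SHA-256, and the way the device hash is returned are all hypothetical. The JWT check and the secret lookup are passed in as plain functions; in a real Lambda they would wrap a call to the game backend authentication server and the Secrets Manager GetSecretValue API respectively.

```python
import hashlib
from typing import Callable, Dict

# Hypothetical secret name; the real name is not given in this article.
ACCESS_TOKEN_SECRET = "game/firehose-access-tokens"


def device_hash(device_id: str) -> str:
    """Return a hex SHA-256 digest of the device ID (hash algorithm is an assumption)."""
    return hashlib.sha256(device_id.encode("utf-8")).hexdigest()


def issue_access_tokens(
    event: Dict[str, str],
    validate_jwt: Callable[[str], bool],
    fetch_secret: Callable[[str], str],
) -> Dict[str, object]:
    """Validate the caller's JWT, then return the stored access tokens.

    validate_jwt stands in for the query to the game backend auth server.
    fetch_secret stands in for the AWS Secrets Manager lookup.
    """
    token = event.get("jwt", "")
    device_id = event.get("device_id", "")
    if not token or not device_id or not validate_jwt(token):
        # Reject requests with a missing or invalid token.
        return {"statusCode": 401, "body": "invalid token"}

    return {
        "statusCode": 200,
        "body": {
            # Assumption: the hash is returned so the device can be correlated later.
            "device_key": device_hash(device_id),
            "tokens": fetch_secret(ACCESS_TOKEN_SECRET),
        },
    }
```

Injecting the two lookups keeps the handler logic easy to exercise locally with stand-in functions before it is wired to the real services.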

Data Ingestion

  1. A mobile SDK was developed to send the game event data to AWS.
  2. Kinesis Data Firehose receives the streaming game events, buffers them up to 128 MB or 15 minutes (whichever comes first), and stores the data in an S3 bucket. The data is partitioned by date and time of ingestion. A data catalog table (the AWS Glue Data Catalog, which Athena also uses) is attached to Kinesis Data Firehose so that the streaming JSON data is converted to Parquet format before it is stored in the S3 bucket.
  3. To further reduce the cost of data ingestion, the mobile SDK buffers events on the device. Kinesis Data Firehose bills each record in 5 KB increments, while an individual event generated by the game is no larger than 1 KB, so the SDK is configured to buffer data on the device until it is close to 5 KB. This reduced the data ingestion cost by up to 5x.
  4. Data from third-party sources also needs to be pulled into the S3 bucket. A Lambda function runs on a schedule and pulls the data from third-party sources such as AdMob, Unity, and AppsFlyer.
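
The on-device buffering described in step 3 can be sketched as below. This is a minimal Python illustration, not the SDK's actual code (which would be written for the mobile platform). It assumes that 5 KB means 5 * 1024 bytes and that events are packed into one record as newline-delimited JSON; the class and function names are hypothetical, and whatever parses the records downstream would need to agree on the packing format.

```python
import json
import math
from typing import Dict, List

# Assumption: the 5 KB billing increment is taken as 5 * 1024 bytes.
BILLING_INCREMENT = 5 * 1024


def billed_bytes(record_size: int) -> int:
    """Round a record size up to the next 5 KB billing increment."""
    return math.ceil(record_size / BILLING_INCREMENT) * BILLING_INCREMENT


class EventBuffer:
    """Accumulate small events and emit records that stay within the limit."""

    def __init__(self, limit: int = BILLING_INCREMENT) -> None:
        self.limit = limit
        self._lines: List[bytes] = []
        self._size = 0

    def add(self, event: Dict[str, object]) -> List[bytes]:
        """Buffer one event.

        Returns a list holding one packed record if adding this event would
        have pushed the buffer past the limit, otherwise an empty list.
        """
        line = json.dumps(event, separators=(",", ":")).encode("utf-8") + b"\n"
        ready: List[bytes] = []
        if self._lines and self._size + len(line) > self.limit:
            ready.append(self.flush())
        self._lines.append(line)
        self._size += len(line)
        return ready

    def flush(self) -> bytes:
        """Return everything buffered so far as one record and reset the buffer."""
        record = b"".join(self._lines)
        self._lines, self._size = [], 0
        return record
```

Under these assumptions, five 1 KB events sent as separate records are billed as 5 x 5 KB = 25 KB, while the same events packed into one record are billed as 5 KB, which is where the "up to 5x" saving comes from.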

Data Lake & ETL Processing

  1. Data is first stored in the raw_events folder of the S3 bucket, where it still needs cleansing and processing.
  2. Data modelling is very important when the data size will grow to petabytes within a short span of time.
  3. The use case was to derive KPIs from event data, and each KPI depends on a particular game event. It is therefore important to partition the data further by game event, which reduces both the cost of querying the data and the time needed to scan it.
  4. AWS Glue is used to clean the data and partition it by game event name.
  5. AWS Glue is used for all forms of ETL processing, from partitioning the data to deriving KPI data.
  6. The AWS Glue Data Catalog was created using an AWS Glue crawler and is later updated on the go inside the Glue jobs.
  7. The AWS Glue Data Catalog can be used to query the data from Amazon Athena. Athena sits on top of Amazon S3 and queries the data directly from S3 objects.
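
To illustrate why partitioning by game event reduces scan cost, here is a small Python sketch of a Hive-style partition layout. It is not the Glue job itself: the folder name partitioned_events, the field names event_name and event_timestamp, and the dt partition key are assumptions made for the example. A query that filters on a single event name then only needs to read objects under that event's prefix.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, List

Event = Dict[str, object]


def partition_prefix(event: Event, root: str = "partitioned_events") -> str:
    """Build a Hive-style S3 prefix from the event name and its UTC date.

    Assumes event_timestamp holds epoch seconds.
    """
    ts = datetime.fromtimestamp(float(event["event_timestamp"]), tz=timezone.utc)
    return f"{root}/event_name={event['event_name']}/dt={ts:%Y-%m-%d}/"


def group_by_partition(events: Iterable[Event]) -> Dict[str, List[Event]]:
    """Group events by the prefix they would be written under."""
    groups: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        groups[partition_prefix(event)].append(event)
    return dict(groups)
```

For example, an event named level_complete with a timestamp on 30 November 2021 (UTC) would land under partitioned_events/event_name=level_complete/dt=2021-11-30/.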

Interactive Analysis

  1. The Amazon QuickSight BI service enables businesses to create and analyse data visualizations and extract easy-to-understand insights to inform business decision-making. These interactive dashboards can be embedded into applications, portals and websites.
  2. KPIs derived from the AWS Glue ETL jobs are then plotted in Amazon QuickSight for interactive analysis.
  3. Amazon Athena can also be used to query the data if there are any custom requirements.
  4. The Amazon QuickSight dashboard is then shared with the customer’s data analytics team so that they can gain insights from the interactive widgets.

A few of the important KPIs are listed below:

  1. DAU (Daily Active Users)
  2. MAU (Monthly Active Users)
  3. User Retention
  4. IAP Revenue
  5. IAP Repeat Purchase
  6. ARPDAU
  7. Stickiness
  8. Total Installs
  9. Total Uninstalls, etc.
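
As a rough illustration of how a few of these KPIs relate to each other, the Python sketch below computes DAU, MAU, stickiness and ARPDAU from a list of (user ID, activity date) pairs. The definitions used are the commonly quoted ones (stickiness as DAU divided by MAU, ARPDAU as daily revenue divided by DAU); the exact definitions used for the customer, and the input shape, are assumptions.

```python
from datetime import date
from typing import Iterable, Tuple

Activity = Tuple[str, date]  # (user_id, date on which the user was active)


def daily_active_users(activity: Iterable[Activity], day: date) -> int:
    """Count distinct users active on the given day."""
    return len({user for user, d in activity if d == day})


def monthly_active_users(activity: Iterable[Activity], year: int, month: int) -> int:
    """Count distinct users active at least once in the given calendar month."""
    return len({user for user, d in activity if d.year == year and d.month == month})


def stickiness(dau: int, mau: int) -> float:
    """DAU / MAU, or 0.0 when there were no monthly active users."""
    return dau / mau if mau else 0.0


def arpdau(daily_revenue: float, dau: int) -> float:
    """Average revenue per daily active user, or 0.0 when nobody was active."""
    return daily_revenue / dau if dau else 0.0
```

In the solution described here, the equivalent aggregations run inside the AWS Glue ETL jobs over the partitioned event data rather than over in-memory lists.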

Conclusion

In this blog, we have covered how AWS services can be used to create a scalable data analytics solution. If you would like to know more about how we can help you with a data analytics solution on AWS for your game, reach out to us at sales@flentas.com or simply fill in your details in the form below.
