AWS Machine Learning Blog
Enable intelligent decision-making with Amazon SageMaker Canvas and Amazon QuickSight
Every company, regardless of its size, wants to deliver the best products and services to its customers. To achieve this, companies want to understand industry trends and customer behavior, and optimize internal processes and data analyses on a routine basis. This is a crucial component of a company’s success.
A very prominent part of the analyst role includes business metrics visualization (like sales revenue) and prediction of future events (like increase in demand) to make data-driven business decisions. To approach this first challenge, you can use Amazon QuickSight, a cloud-scale business intelligence (BI) service that provides easy-to-understand insights and gives decision-makers the opportunity to explore and interpret information in an interactive visual environment. For the second task, you can use Amazon SageMaker Canvas, a cloud service that expands access to machine learning (ML) by providing business analysts with a visual point-and-click interface that allows you to generate accurate ML predictions on your own.
When looking at these metrics, business analysts often identify patterns in customer behavior, in order to determine whether the company risks losing the customer. This problem is called customer churn, and ML models have a proven track record of predicting such customers with high accuracy (for an example, see Elula’s AI Solutions Help Banks Improve Customer Retention).
Building ML models can be a tricky process because it requires an expert team to manage the data preparation and ML model training. However, with Canvas, you can do that without any special knowledge and with zero lines of code. For more information, check out Predict customer churn with no-code machine learning using Amazon SageMaker Canvas.
In this post, we show you how to visualize the predictions generated from Canvas in a QuickSight dashboard, enabling intelligent decision-making via ML.
Overview of solution
In the post Predict customer churn with no-code machine learning using Amazon SageMaker Canvas, we assumed the role of a business analyst in the marketing department of a mobile phone operator, and we successfully created an ML model to identify customers with potential risk of churn. Using the predictions generated by our model, we now want to analyze the potential financial outcome so that we can make data-driven business decisions about promotions for these clients and regions.
The architecture that will help us achieve this is shown in the following diagram.
The workflow steps are as follows:
- Upload a new dataset with the current customer population into Canvas.
- Run a batch prediction and download the results.
- Upload the files into QuickSight to create or update visualizations.
You can perform these steps in Canvas without writing a single line of code. For the full list of supported data sources, refer to Importing data in Amazon SageMaker Canvas.
Prerequisites
For this walkthrough, make sure that the following prerequisites are met:
- Run through the solution in Predict customer churn with no-code machine learning using Amazon SageMaker Canvas. Make sure that, during the model building phase, you don’t drop the State and Phone attributes. You use these later.
- Sign up for a QuickSight subscription. For this post, we only use QuickSight features included in the Standard subscription.
Use the customer churn model
After you complete the prerequisites, you should have a model trained on historical data in Canvas, ready to be used with new customer data to predict customer churn, which you can then use in QuickSight.
- Create a new file churn-no-labels.csv by randomly selecting 1,500 lines from the original dataset churn.csv and removing the Churn? column.
We use this new dataset to generate predictions.
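If you would rather script this file preparation than do it by hand in a spreadsheet, the following is a minimal sketch using only the Python standard library. It assumes churn.csv has a header row containing a Churn? column; the function name make_unlabeled_sample and the seed parameter are hypothetical and not part of the original walkthrough.

```python
import csv
import random


def make_unlabeled_sample(src_path, dst_path, n_rows=1500, label="Churn?", seed=None):
    """Write n_rows randomly chosen rows of src_path to dst_path without the label column."""
    with open(src_path, newline="") as src:
        reader = csv.DictReader(src)
        # Keep every column except the label the model is meant to predict.
        fields = [name for name in reader.fieldnames if name != label]
        rows = list(reader)

    # random.sample draws without replacement, so no row is selected twice.
    rng = random.Random(seed)
    sample = rng.sample(rows, min(n_rows, len(rows)))

    with open(dst_path, "w", newline="") as dst:
        # extrasaction="ignore" silently drops the label value still present in each row.
        writer = csv.DictWriter(dst, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(sample)
    return len(sample)
```

Calling make_unlabeled_sample("churn.csv", "churn-no-labels.csv") should produce a file in the shape this step describes. Passing a fixed seed makes the selection repeatable.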
We complete the next steps in Canvas. You can open Canvas via the AWS Management Console, or via the SSO application provided by your cloud administrator. If you’re not sure how to access Canvas, refer to Getting started with using Amazon SageMaker Canvas.
- On the Canvas console, choose Datasets in the navigation pane.
- Choose Import.
- Choose Upload and choose the churn-no-labels.csv file that you created.
- Choose Import data.
The data import process time depends on the size of the file. In our case, it should be around 10 seconds. When it’s complete, we can see the dataset is in Ready status.
- To preview the first 100 rows of the dataset, choose the options menu (three dots) and choose Preview.
- Choose Models in the navigation pane, then choose the churn model you created as part of the prerequisites.
- On the Predict tab, choose Select dataset.
- Select the churn-no-labels.csv dataset, then choose Generate predictions.
Inference time depends on model complexity and dataset size; in our case, it takes around 10 seconds. When the job is finished, it changes its status to Ready and we can download the results.
- Choose the options menu (three dots), Download, and Download all values.
Optionally, we can take a quick look at the results by choosing Preview. The first two columns are predictions from the model.
We have successfully used our model to predict churn risk for our current customer population. Now we’re ready to visualize business metrics based on our predictions.
Import data to QuickSight
As we discussed previously, business analysts require predictions to be visualized together with business metrics in order to make data-driven business decisions. To do that, we use QuickSight, which provides easy-to-understand insights and gives decision-makers the opportunity to explore and interpret information in an interactive visual environment. With QuickSight, we can build visualizations like graphs and charts in seconds with a simple drag-and-drop interface. In this post, we build several visualizations to better understand business risks and how we could manage them, such as where we should launch new marketing campaigns.
To get started, complete the following steps:
- On the QuickSight console, choose Datasets in the navigation pane.
- Choose New dataset.
QuickSight supports many data sources. In this post, we use a local file, the one we previously generated in Canvas, as our source data.
- Choose Upload a file.
- Choose the recently downloaded file with predictions.
QuickSight uploads and analyzes the file.
- Check that everything is as expected in the preview, then choose Next.
- Choose Visualize.
The data is now successfully imported and we’re ready to analyze it.
Create a dashboard with business metrics of churn predictions
It’s time to analyze our data and make a clear and easy-to-use dashboard that recaps all the information necessary for data-driven business decisions. This type of dashboard is an important tool in the arsenal of a business analyst.
The following is an example dashboard that can help identify and act on the risk of customer churn.
On this dashboard, we visualize several important business metrics:
- Customers likely to churn – The left donut chart represents the number and percentage of users with over 50% risk of churning. This chart helps us quickly understand the size of a potential problem.
- Potential revenue loss – The top middle donut chart represents the amount of revenue loss from users with over 50% risk of churning. This chart helps us quickly understand the size of potential revenue loss from churn. The chart also suggests that several of the customers we could lose are above-average spenders, because the percentage of potential revenue lost is bigger than the percentage of users at risk of churning.
- Potential revenue loss by state – The top right horizontal bar chart represents the size of revenue lost versus revenue from customers not at risk of churning. This visual could help us understand which state is the most important for us from a marketing campaign perspective.
- Details about customers at risk of churning – The bottom left table contains details about all our customers. This table could be helpful if we want to quickly look at the details of several customers with and without churn risk.
Customers likely to churn
We start by building a chart with customers at risk of churning.
- Under Fields list, choose the Churn? attribute.
QuickSight automatically builds a visualization.
Although the bar plot is a common visualization to understand data distribution, we prefer to use a donut chart. We can change this visual by changing its properties.
- Choose the donut chart icon under Visual types.
- Choose the current name (double-click) and change it to Customers likely to churn.
- To customize other visual effects (remove legend, add values, change font size), choose the pencil icon and make your changes.
As shown in the following screenshot, we increased the area of the donut, as well as added some extra information in the labels.
Potential revenue loss
Another important metric to consider when calculating the business impact of customer churn is potential revenue loss. This metric matters because it shows how much revenue is tied to customers at risk of churning compared with customers who are not. In the telecom industry, for example, we could have many inactive clients who have a high risk of churn but zero revenue. This chart can help us understand whether we’re in such a situation. To add this metric to our dashboard, we create a custom calculated field by providing the mathematical formula for computing potential revenue loss, then visualize it as another donut chart.
- On the Add menu, choose Add calculated field.
- Name the field Total charges.
- Enter the formula {Day Charge}+{Eve Charge}+{Intl Charge}+{Night Charge}.
- Choose Save.
- On the Add menu, choose Add visual.
- Under Visual types, choose the donut chart icon.
- Under Fields list, drag Churn? to Group/Color.
- Drag Total charges to Value.
- On the Value menu, choose Show as and choose Currency.
- Choose the pencil icon to customize other visual effects (remove legend, add values, change font size).
At this moment, our dashboard has two visualizations.
We can already observe that in total we could lose 18% (270) of our customers, which equals 24% ($6,280) of revenue. Let’s explore further by analyzing potential revenue loss at the state level.
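To sanity-check the two donut charts outside QuickSight, the same arithmetic can be sketched in a few lines of Python. This is an illustration under assumptions, not part of the original walkthrough: it assumes the downloaded predictions file keeps the four charge columns named in the formula above and a Churn? column holding the predicted label. The helper names total_charges and churn_summary are hypothetical.

```python
from collections import defaultdict

# Columns summed by the Total charges calculated field.
CHARGE_COLUMNS = ("Day Charge", "Eve Charge", "Intl Charge", "Night Charge")


def total_charges(row):
    """Same sum as {Day Charge}+{Eve Charge}+{Intl Charge}+{Night Charge}."""
    return sum(float(row[col]) for col in CHARGE_COLUMNS)


def churn_summary(rows, label="Churn?"):
    """Return {predicted label: (share of customers, share of revenue)}."""
    counts = defaultdict(int)
    revenue = defaultdict(float)
    for row in rows:
        counts[row[label]] += 1
        revenue[row[label]] += total_charges(row)

    n_customers = sum(counts.values())
    all_revenue = sum(revenue.values())
    return {
        key: (
            counts[key] / n_customers,
            revenue[key] / all_revenue if all_revenue else 0.0,
        )
        for key in counts
    }
```

The rows argument can be any iterable of dictionaries, for example the output of csv.DictReader over the downloaded file. If the revenue share for the churn label is larger than the customer share, the customers at risk spend more than average, which is what the second donut chart highlights.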
Potential revenue loss by state
To visualize potential revenue loss by state, let’s add a horizontal bar graph.
- On the Add menu, choose Add visual.
- Under Visual types, choose the horizontal bar chart icon.
- Under Fields list, drag Churn? to Group/Color.
- Drag Total charges to Value.
- On the Value menu, choose Show as and Currency.
- Drag State to Y axis.
- Choose the pencil icon to customize other visual effects (remove legend, add values, change font size).
- We can also sort our new visual by choosing Total charges at the bottom and choosing Descending.
This visual could help us understand which state is the most important from a marketing campaign perspective. For example, in Hawaii, we could potentially lose half our revenue ($253,000) while in Washington, this value is less than 10% ($52,000). We can also see that in Arizona, we risk losing almost every customer.
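The aggregation behind this bar chart can be sketched as follows, assuming the predictions file contains State, Churn?, and the four charge columns. The function name revenue_by_state is hypothetical, and churn_label must be whatever value your model emits for customers predicted to churn, which depends on how the label appears in your training data.

```python
from collections import defaultdict

CHARGE_COLUMNS = ("Day Charge", "Eve Charge", "Intl Charge", "Night Charge")


def revenue_by_state(rows, churn_label, label_col="Churn?", state_col="State"):
    """Return [(state, at-risk revenue, total revenue)], largest total revenue first."""
    at_risk = defaultdict(float)
    total = defaultdict(float)
    for row in rows:
        charges = sum(float(row[col]) for col in CHARGE_COLUMNS)
        total[row[state_col]] += charges
        # Only rows predicted to churn count toward potential revenue loss.
        if row[label_col] == churn_label:
            at_risk[row[state_col]] += charges

    # Mirrors sorting the visual by Total charges in descending order.
    return sorted(
        ((state, at_risk[state], total[state]) for state in total),
        key=lambda item: item[2],
        reverse=True,
    )
```

Dividing the second element of each tuple by the third gives the share of a state's revenue that is at risk, which is the comparison made above between states.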
Details about customers at risk of churning
Let’s build a table with details about customers at risk of churning.
- On the Add menu, choose Add visual.
- Under Visual types, choose the table icon.
- Under Fields list, drag Phone, State, Int’l Plan, Vmail Plan, Churn?, and Account Length to Group by.
- Drag probability to Value.
- On the Value menu, choose Show as and Percent.
Customize your dashboard
QuickSight offers several options to customize your dashboard, such as the following.
- To add a name, on the Add menu, choose Add title.
- Enter a title (for this post, we rename our dashboard Churn analysis).
- To resize your visuals, choose the bottom right corner of the chart and drag to the desired size.
- To move a visual, choose the top center of the chart and drag it to a new location.
- To change the theme, choose Themes in the navigation pane.
- Choose your new theme (for example, Midnight), and choose Apply.
Publish your dashboard
A dashboard is a read-only snapshot of an analysis that you can share with other QuickSight users for reporting purposes. Your dashboard preserves the configuration of the analysis at the time you publish it, including such things as filtering, parameters, controls, and sort order. The data used for the analysis isn’t captured as part of the dashboard. When you view the dashboard, it reflects the current data in the datasets used by the analysis.
To publish your dashboard, complete the following steps:
- On the Share menu, choose Publish dashboard.
- Enter a name for your dashboard.
- Choose Publish dashboard.
Congratulations, you have successfully created a churn analysis dashboard.
Update your dashboard with a new prediction
As the model evolves and we generate new data from the business, we might need to update this dashboard with new information. Complete the following steps:
- Create a new file churn-no-labels-updated.csv by randomly selecting another 1,500 lines from the original dataset churn.csv and removing the Churn? column.
We use this new dataset to generate new predictions.
- Repeat the steps from the Use the customer churn model section of this post to get predictions for the new dataset, and download the new file.
- On the QuickSight console, choose Datasets in the navigation pane.
- Choose the dataset we created.
- Choose Edit dataset.
- On the drop-down menu, choose Update file.
- Choose Upload file.
- Choose the recently downloaded file with the predictions.
- Review the preview, then choose Confirm file update.
After the “File updated successfully” message appears, we can see that the file name has also changed.
- Choose Save & publish.
- When the “Saved and published successfully” message appears, you can go back to the main menu by choosing the QuickSight logo in the upper left corner.
- Choose Dashboards in the navigation pane and choose the dashboard we created before.
You should see your dashboard with the updated values.
We have just updated our QuickSight dashboard with the most recent predictions from Canvas.
Clean up
To avoid future charges, log out from Canvas.
Conclusion
In this post, we used an ML model from Canvas to predict customers at risk of churning and built a dashboard with insightful visualizations to help us make data-driven business decisions. We did so without writing a single line of code thanks to user-friendly interfaces and clear visualizations. This enables business analysts to be agile in building ML models, and perform analyses and extract insights in complete autonomy from data science teams.
To learn more about using Canvas, see Build, Share, Deploy: how business analysts and data scientists achieve faster time-to-market using no-code ML and Amazon SageMaker Canvas. For more information about creating ML models with a no-code solution, see Announcing Amazon SageMaker Canvas – a Visual, No Code Machine Learning Capability for Business Analysts. To learn more about the latest QuickSight features and best practices, see AWS Big Data Blog.
About the Authors
Aleksandr Patrushev is an AI/ML Specialist Solutions Architect at AWS, based in Luxembourg. He is passionate about the cloud and machine learning, and the way they could change the world. Outside work, he enjoys hiking, sports, and spending time with his family.
Davide Gallitelli is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in Brussels and works closely with customers throughout Benelux. He has been a developer since he was very young, starting to code at the age of 7. He started learning AI/ML at university, and has fallen in love with it since then.