AWS Database Blog
Use VPC endpoints with Amazon Timestream
Time series data is a sequence of data points recorded over a time interval. This type of data is used for measuring events that change over time, such as stock prices, temperature measurements, or CPU utilization of an Amazon Elastic Compute Cloud (Amazon EC2) instance.
With time series data, each data point consists of a timestamp, one or more attributes, and the event that changes over time. You can use this data to derive insights into the performance and health of an application, detect anomalies, and identify optimization opportunities. For example, DevOps engineers might want to view data that measures changes in infrastructure performance metrics. Manufacturers might want to track internet of things (IoT) sensor data that measures changes in equipment across a facility. Online marketers might want to analyze clickstream data that captures how a user navigates a website. Because time series data is generated from multiple sources in extremely high volumes, it needs to be cost-effectively collected in near-real time, and therefore requires efficient storage that helps organize and analyze the data.
Amazon Timestream is a fast, scalable, fully managed, purpose-built time series database that makes it easy to store and analyze trillions of time series data points per day. Timestream saves you time and cost in managing the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost-optimized storage tier based upon user-defined policies. The purpose-built query engine of Timestream lets you access and analyze recent and historical data together, without having to specify its location. Timestream has built-in time series analytics functions, helping you identify trends and patterns in your data in near-real time. Timestream is serverless and automatically scales up or down to adjust capacity and performance. Because you don’t need to manage the underlying infrastructure, you can focus on optimizing and building your applications.
This post shows how to use virtual private cloud (VPC) endpoints to write data to and read data from Timestream from services running in your Amazon Virtual Private Cloud (Amazon VPC). We also demonstrate how to configure Grafana (with the Timestream plugin) on an EC2 instance to read and visualize your time series data.
A VPC endpoint (powered by AWS PrivateLink) allows you to privately connect to AWS services (like Timestream) without using an internet gateway, network address translation (NAT) device, virtual private network (VPN) connection, or AWS Direct Connect connection. Traffic between your Amazon VPC and the connected service doesn’t leave the Amazon network.
Overview of solution
To implement this solution, you complete the following high-level steps:
- Create a key pair.
- Deploy an AWS CloudFormation template, which performs the following actions:
  - Creates an Amazon VPC with VPC endpoints to use for connecting to Timestream.
  - Installs and configures Grafana on an EC2 instance.
  - Creates a Timestream database and table.
  - Configures Grafana with the Timestream plugin.
  - Downloads the sample tool for continuous ingestion.
- Create an EC2 instance and install and configure the Grafana runtime and application on the instance.
- Create a Timestream table and verify that your EC2 instance is communicating over the private link.
- Run a script from the EC2 instance to load data into your Timestream table.
- Launch the Grafana application and visualize your time series data.
VPC endpoints provide an extra layer of security for organizations that want to keep communication with their Amazon VPC services and non-VPC services private. For more information about the benefits of pairing VPC endpoints to services like Timestream, see Reduce Cost and Increase Security with Amazon VPC Endpoints.
Overview of architecture
The following diagram shows the architecture being demonstrated in this post. The Amazon VPC is set up to run in one Availability Zone, which has a public and a private subnet. Two EC2 instances are configured in the public subnet. One has Grafana (with the Timestream plugin) installed, and the second instance has a Python script to load data into the Timestream database. The Grafana instance is publicly reachable over HTTP on port 3000, which is how you access the Grafana web console. VPC endpoints are configured and associated with the private subnet. These connect to Timestream running in the Region where the VPC is deployed.
Prerequisites
Before you begin, you must have the following prerequisites:
- An AWS account that provides access to Amazon EC2, Timestream (for instructions, see Accessing Timestream), and AWS PrivateLink (your VPC endpoint)
- An SSH client on your computer
Create a key pair
To create your key pair, complete the following steps:
- On the Amazon EC2 console, in the navigation pane, under Network & Security, choose Key Pairs.
- Choose Create key pair.
- For Name, enter ts-vpce-demo.
- For File format, choose the format that matches your SSH client (pem for OpenSSH or command line SSH, or ppk for use with PuTTY).
- Choose Create key pair.
When you create your key pair, you automatically download your key file.
Deploy your CloudFormation template
The provided CloudFormation template provisions your resources.
You can open the template to review it. It has two resources of interest: IngestVpcEndpoint and QueryVpcEndpoint. These are the VPC endpoint resource definitions that allow private communication from your VPC private subnet to your Timestream database.
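As a rough illustration of what these two resources amount to, the following Python sketch builds the interface endpoint service names for a given Region and cell and creates one endpoint per service with boto3. It assumes the service names follow the pattern com.amazonaws.<region>.timestream.ingest-<cell> and com.amazonaws.<region>.timestream.query-<cell>; the function names and the VPC, subnet, and security group IDs are hypothetical placeholders, and this is not the template's own definition.

```python
def timestream_endpoint_service_names(region: str, cell: str) -> dict:
    """Return the assumed interface endpoint service names for one Timestream cell."""
    prefix = f"com.amazonaws.{region}.timestream"
    return {
        "ingest": f"{prefix}.ingest-{cell}",  # used by the Write SDK
        "query": f"{prefix}.query-{cell}",    # used by the Query SDK
    }


def create_timestream_endpoints(vpc_id, subnet_id, security_group_id, region, cell):
    """Create one interface endpoint per service name (hypothetical helper)."""
    import boto3  # imported lazily so the helper above has no dependencies

    ec2 = boto3.client("ec2", region_name=region)
    endpoint_ids = {}
    for kind, service in timestream_endpoint_service_names(region, cell).items():
        response = ec2.create_vpc_endpoint(
            VpcEndpointType="Interface",
            VpcId=vpc_id,
            ServiceName=service,
            SubnetIds=[subnet_id],
            SecurityGroupIds=[security_group_id],
            PrivateDnsEnabled=True,  # keep the public hostname, resolve it privately
        )
        endpoint_ids[kind] = response["VpcEndpoint"]["VpcEndpointId"]
    return endpoint_ids
```

With private DNS enabled, clients keep using the regular cell hostname and the VPC resolver returns the endpoint's private addresses.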
To deploy your template, complete the following steps:
- On the AWS CloudFormation console, choose Create stack.
- For Prepare template, select Template is ready.
- For Template source, select Upload a template file.
- Upload the template you downloaded.
- For Stack name, enter TsPrivateLink.
- For Cell, enter cell1.
- For Grafana IP Range, enter your IP address range:
  - Go to https://checkip.amazonaws.com/ and copy the IP address.
  - Enter the address and add /32 at the end, for example 111.111.111.111/32.
- For InstanceType, enter t2.medium (this is the EC2 instance type that Grafana runs on).
- For KeyName, enter ts-vpce-demo (the name of your key pair).
- For PrivateSubnet01Block, keep the default or specify your own CIDR range.
- For PublicSubnet01Block, keep the default or specify your own CIDR range.
- For SSHIPRange, enter your IP address range:
  - Go to https://checkip.amazonaws.com/ and copy the IP address.
  - Enter the address and add /32 at the end, for example 111.111.111.111/32.
- For VpcIdCidrBlock, keep the default or specify your own CIDR range.
- Choose Next.
- In the Configure stack options section, keep the settings at their defaults.
- Choose Next.
- In the Review TsPrivateLink section, keep the settings at their defaults.
- Select I acknowledge that AWS CloudFormation might create IAM resources.
- Choose Create stack.
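The suffix you add to your IP address for the Grafana IP Range and SSHIPRange parameters controls how many addresses the range covers. The following sketch uses Python's standard ipaddress module to show the difference: a /32 suffix covers exactly one address, while a /0 suffix covers the entire IPv4 space. The helper names are hypothetical.

```python
import ipaddress


def to_single_host_cidr(ip: str) -> str:
    """Turn a single IPv4 address into a CIDR range that covers only that address."""
    return str(ipaddress.ip_network(f"{ip}/32"))


def address_count(cidr: str) -> int:
    """Count the addresses in a CIDR range (host bits are ignored, not rejected)."""
    return ipaddress.ip_network(cidr, strict=False).num_addresses
```

For example, address_count("111.111.111.111/32") is 1, whereas address_count("111.111.111.111/0") is 2**32, which would open the port to every IPv4 address.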
Verify the VPC endpoint from Amazon EC2
To verify your endpoint, complete the following steps:
- On the Amazon EC2 console, in the navigation pane, choose Instances.
- Choose the instance IngestorInstance.
- In the Details section, note the public IP address.
SSH to IngestorInstance
In this step, you SSH into IngestorInstance using the key file you created earlier (ts-vpce-demo).
- Open a terminal in the directory with your ts-vpce-demo.pem file.
- Run the following command:
Make sure that you have the latest AWS Command Line Interface (AWS CLI) running on your instance. For instructions, see Update to the AWS CLI version 2 on Linux.
- Run the following AWS CLI commands:
- To verify that the endpoint DNS resolves to a private IP space, run the following commands:
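One way to perform the DNS check from Python on the instance is sketched below. It resolves a hostname and reports whether every returned address is in private IP space. The hostname in the usage note assumes the cell1 and us-west-2 values used elsewhere in this post; the function names are hypothetical.

```python
import ipaddress
import socket


def resolve_ipv4(hostname: str) -> list:
    """Return the IPv4 addresses that the hostname currently resolves to."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def all_private(addresses) -> bool:
    """True only if at least one address was returned and all of them are private.

    is_private covers the RFC 1918 ranges as well as other reserved ranges
    such as loopback and link-local.
    """
    addresses = list(addresses)
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )
```

Calling all_private(resolve_ipv4("ingest-cell1.timestream.us-west-2.amazonaws.com")) from the instance should return True when the VPC endpoint with private DNS is in effect, and False when the name resolves to public addresses.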
Load data into the Timestream table
To populate your table, complete the following steps:
- Start the ingestor script to populate the Timestream table by running the following command:
At any time, you can press CTRL+C to end the ingestor.
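To give a sense of what an ingestor does, the following is a minimal sketch (not the sample tool itself) that builds one record and writes it with the boto3 timestream-write client. The database, table, and measure names come from later steps in this post; the hostname dimension, the function names, and the sample value are assumptions made for illustration.

```python
import time


def build_record(host: str, measure_name: str, value: float, time_ms: int) -> dict:
    """Build one Timestream record; the 'hostname' dimension is an assumed name."""
    return {
        "Dimensions": [{"Name": "hostname", "Value": host}],
        "MeasureName": measure_name,
        "MeasureValue": str(value),       # values are sent as strings
        "MeasureValueType": "DOUBLE",
        "Time": str(time_ms),
        "TimeUnit": "MILLISECONDS",
    }


def write_sample(region="us-west-2", endpoint_url=None):
    """Write a single sample record (hypothetical helper, requires boto3)."""
    import boto3  # imported lazily so build_record has no dependencies

    client = boto3.client(
        "timestream-write", region_name=region, endpoint_url=endpoint_url
    )
    record = build_record("host-1", "cpu_hi", 42.5, int(time.time() * 1000))
    return client.write_records(
        DatabaseName="TsPrivateLinkGrafanaDb",
        TableName="TsPrivateLinkGrafanaTb",
        Records=[record],
    )
```

Passing endpoint_url (for example, the ingest cell endpoint) pins the client to that endpoint; leaving it as None lets the SDK resolve the endpoint itself.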
Now you can verify the data is ingesting into your Timestream table.
- On the Timestream console, in the navigation pane, choose Query editor.
- In the query editor, enter the following query:
- Choose Run.
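The same kind of check can also be run programmatically. The sketch below builds a simple row-count query over a recent time window and runs it with the boto3 timestream-query client. The query text, the 15-minute window, and the helper names are assumptions for illustration, not the query from the original walkthrough.

```python
def build_count_query(database: str, table: str, window: str = "15m") -> str:
    """Build a query that counts rows ingested within the given window."""
    return (
        f'SELECT COUNT(*) AS row_count FROM "{database}"."{table}" '
        f"WHERE time > ago({window})"
    )


def rows_to_dicts(response: dict) -> list:
    """Flatten a Query API response into a list of {column name: scalar value}."""
    names = [column["Name"] for column in response["ColumnInfo"]]
    return [
        {name: datum.get("ScalarValue") for name, datum in zip(names, row["Data"])}
        for row in response["Rows"]
    ]


def run_query(query: str, region="us-west-2", endpoint_url=None) -> list:
    """Run the query and return its first page of rows (requires boto3)."""
    import boto3  # imported lazily so the helpers above have no dependencies

    client = boto3.client(
        "timestream-query", region_name=region, endpoint_url=endpoint_url
    )
    return rows_to_dicts(client.query(QueryString=query))
```

A nonzero row_count for build_count_query("TsPrivateLinkGrafanaDb", "TsPrivateLinkGrafanaTb") indicates that the ingestor is writing data.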
Visualize the data in Timestream
To visualize your data, complete the following steps:
- On the Amazon EC2 console, choose the instance GrafanaServerInstance.
- In the Details section, note the public IPv4 address.
- Open a browser tab and navigate to http://<your Amazon EC2 public IP>:3000.
- Log in with the Grafana default user (admin) and password (admin).
- In the navigation pane, under Configuration, choose Data sources.
- Choose Add data source.
- Choose Amazon Timestream as your data source.
- For Default Region, enter us-west-2.
- For Endpoint, enter https://query-cell1.timestream.us-west-2.amazonaws.com.
- Choose Save & Test.
- In the database TsPrivateLinkGrafanaDb, choose the table TsPrivateLinkGrafanaTb.
- Choose a default measure, such as cpu_hi.
. - On the Dashboards tab, choose Import and Sample (DevOps).
- In the navigation pane, navigate to Dashboards and choose Sample (DevOps).
The following screenshot shows your dashboard.
Clean up
To avoid incurring future charges, delete the resources. To do this, on the AWS CloudFormation console, select your TsPrivateLink CloudFormation stack and choose Delete.
Conclusion
In this post, you learned how to use VPC endpoints to write data to and read data from Timestream from services running in your VPC. You also learned how to configure Grafana (with the Timestream plugin) on an EC2 instance to read and visualize your time series data.
For further reading and hands-on experience, see Getting Started with Amazon Timestream and the Amazon Timestream Tools and Samples GitHub repo. If you have comments or questions about this solution, please submit them in the comments section.
Appendix
Timestream has a cellular architecture, so for the lifetime of your account, you're mapped to a single cell in a given Region. The Timestream SDK hides this architecture from you and automatically resolves the cell-based endpoint; you can cache the resolved endpoint for 24 hours.
To use interface VPC endpoints to directly connect to Timestream from within your VPC, you first need to find the cell endpoint. You can run the DescribeEndpoints API from the Timestream Write and Query SDKs and find your current cell mapping.
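As a sketch of that lookup, the following Python calls DescribeEndpoints through the boto3 timestream-write client and extracts the cell name from the returned address. It assumes the address has the form shown in this post (for example, ingest-cell1.timestream.us-west-2.amazonaws.com); the function names are hypothetical.

```python
def cell_from_address(address: str) -> str:
    """Extract the cell name from an address such as
    ingest-cell1.timestream.us-west-2.amazonaws.com (assumed format)."""
    first_label = address.split(".", 1)[0]   # for example "ingest-cell1"
    _, _, cell = first_label.partition("-")  # for example "cell1"
    if not cell:
        raise ValueError(f"no cell found in address: {address!r}")
    return cell


def discover_cell(region: str = "us-west-2"):
    """Return (cell name, cache period in minutes) for the Write API (requires boto3)."""
    import boto3  # imported lazily so cell_from_address has no dependencies

    client = boto3.client("timestream-write", region_name=region)
    endpoint = client.describe_endpoints()["Endpoints"][0]
    return cell_from_address(endpoint["Address"]), endpoint["CachePeriodInMinutes"]
```

The cell name returned here is the value you would supply for the Cell parameter of the CloudFormation template.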
Timestream supports rerouting to cell-based endpoints via Discovery. When you create a VPC endpoint for any of the Timestream SDKs (such as Write or Query), any requests to a Timestream cellular endpoint within the Region (for example, ingest-cell1.timestream.us-west-2.amazonaws.com) are routed to a private Timestream Write endpoint within the Amazon network. The endpoint name remains the same, but the route to Timestream stays entirely within the Amazon network and doesn't traverse the public internet.
About the authors
Andrew Timpone is a Sr. Solutions Architect at Amazon Web Services. He works with AWS customers to provide guidance and technical assistance on AWS architectures.
Manali Shah is a software development engineer at AWS. She is passionate about building the right solution for the customer.