AWS DevOps & Developer Productivity Blog

Monitor AWS resources created by Terraform in Amazon DevOps Guru using tfdevops

This post was written in collaboration with Kapil Thangavelu, CTO at Stacklet


Amazon DevOps Guru is a machine learning (ML) powered service that helps developers and operators automatically detect anomalies and improve application availability. DevOps Guru utilizes machine learning models, informed by years of Amazon.com and AWS operational excellence to identify anomalous application behavior (e.g., increased latency, error rates, resource constraints) and surface critical issues that could cause potential outages or service disruptions. DevOps Guru’s anomaly detectors can also proactively detect anomalous behavior even before it occurs, helping you address issues before they happen; insights provide recommendations to mitigate anomalous behavior.

When you enable DevOps Guru, you can configure its coverage to determine which AWS resources you want to analyze. As an option, you can define the coverage boundary by selecting specific AWS CloudFormation stacks. For each stack you choose, DevOps Guru analyzes operational data from the supported resources to detect anomalous behavior. See Working with AWS CloudFormation stacks in DevOps Guru for more details.

For Terraform users, Stacklet developed an open-source tool called tfdevops, which converts Terraform state to an importable CloudFormation stack, which allows DevOps Guru to start monitoring the encapsulated AWS resources. Note that tfdevops is not a tool to convert Terraform into CloudFormation. Instead, it creates the CloudFormation stack containing the imported resources that are specified in the Terraform module and enables DevOps Guru to monitor the resources in that CloudFormation stack.

In this blog post, we will explain how you can configure and use tfdevops, to easily enable DevOps Guru for your existing AWS resources created by Terraform.

Solution overview

tfdevops performs the following steps to import resources into Amazon DevOps Guru:

  • It translates terraform state into an AWS CloudFormation template with a retain deletion policy
  • It creates an AWS CloudFormation stack with imported resources
  • It enrolls the stack into Amazon DevOps Guru

For illustration purposes, we will use a sample serverless application that includes some of the components DevOps Guru and tfdevops supports. This application consists of an Amazon Simple Queue Service (SQS) queue, and an AWS Lambda function that processes messages in the SQS queue. It also includes an Amazon DynamoDB table that the Lambda function uses to persist or to read data, and an Amazon Simple Notification Service (SNS) topic to where the Lambda function publishes the results of its processing. The following diagram depicts our sample application:

The architecture diagram shows a sample application containing an Amazon SQS queue, an AWS Lambda function, an Amazon SNS topic and an Amazon DynamoDB table.

Prerequisites

Before getting started, make sure you have these prerequisites:

Walkthrough

Follow these steps to monitor your AWS resources created with Terraform templates by using tfdevops:

  1. Install tfdevops following the instructions on GitHub
  2. Create a Terraform module with the resources supported by tfdevops
  3. Deploy the Terraform to your AWS account to create the resources in your account

Below is a sample Terraform module to create a sample AWS Lambda function, an Amazon DynamoDB table, an Amazon SNS topic and an Amazon SQS queue.

# IAM role for the lambda function
resource "aws_iam_role" "lambda_role" {
 name   = "iam_role_lambda_function"
 assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

# IAM policy for logging from the lambda function
resource "aws_iam_policy" "lambda_logging" {

  name         = "iam_policy_lambda_logging_function"
  path         = "/"
  description  = "IAM policy for logging from a lambda"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*",
      "Effect": "Allow"
    }
  ]
}
EOF
}

# Policy attachment for the role
resource "aws_iam_role_policy_attachment" "policy_attach" {
  role        = aws_iam_role.lambda_role.name
  policy_arn  = aws_iam_policy.lambda_logging.arn
}

# Generates an archive from the source
data "archive_file" "default" {
  type        = "zip"
  source_dir  = "${path.module}/src/"
  output_path = "${path.module}/myzip/python.zip"
}

# Create a lambda function
resource "aws_lambda_function" "basic_lambda_function" {
  filename                       = "${path.module}/myzip/python.zip"
  function_name                  = "basic_lambda_function"
  role                           = aws_iam_role.lambda_role.arn
  handler                        = "index.lambda_handler"
  runtime                        = "python3.8"
  depends_on                     = [aws_iam_role_policy_attachment.policy_attach]
}

# Create a DynamoDB table
resource "aws_dynamodb_table" "sample_dynamodb_table" {
  name           = "sample_dynamodb_table"
  hash_key       = "sampleHashKey"
  billing_mode   = "PAY_PER_REQUEST"

  attribute {
    name = "sampleHashKey"
    type = "S"
  }
}

# Create an SQS queue
resource "aws_sqs_queue" "sample_sqs_queue" {
  name          = "sample_sqs_queue"
}

# Create an SNS topic
resource "aws_sns_topic" "sample_sns_topic" {
  name = "sample_sns_topic"
}
  1. Run tfdevops to convert to CloudFormation template, deploy the stack and enable DevOps Guru

The following command generates a CloudFormation template locally from a Terraform state file:

tfdevops cfn -d ~/path/to/terraform/module --template mycfn.json --resources importable-ids.json

The following command deploys the CloudFormation template, creates a CloudFormation stack, imports resources, and activates DevOps Guru on the stack:

tfdevops deploy --template mycfn.json --resources importable-ids.json
  1. After tfdevopsfinishes the deployment, you can already see the stack in the CloudFormation dashboard.

CloudFormation dashboard showing the stack, GuruStack, created by tfdevops

tfdevops imports the existing resources in the Terraform module into AWS CloudFormation. Note, that these are not new resources and would have no additional cost implications for the resources itself. See Bringing existing resources into CloudFormation management to learn more about importing resources into CloudFormation.

Resources view for GuruStack listing the imported resources in GuruStack

  1. Your stack also appears at the DevOps Guru dashboard, indicating that DevOps Guru is monitoring your resources, and will alarm in case it detects anomalous behavior. Insights are co-related sequence of events and trails, grouped together to provide you with prescriptive guidance and recommendations to root-cause and resolve issues more quickly. See Working with insights in DevOps Guru to learn more about DevOps Guru insights.

Amazon DevOps Guru Dashboard displays the system health summary and system health overview of each CloudFormation stack. GuruStack is marked as healthy with 0 reactive insights and 0 proactive insights.

Note that when you use the tfdevops tool, it automatically enables DevOps Guru on the imported stack.

Amazon DevOps Guru Analyze resources displays the analysis coverage option selected. GuruStack is the selected stack for analysis

  1. Clean up – delete the stack

CloudFormation Stacks menu showing GuruStack as selected. The stack can be deleted by pressing the Delete button.

Conclusion

This blog post demonstrated how to enable DevOps Guru to monitor your AWS resources created by Terraform. Using the Stacklet’s tfdevops tool, you can create a CloudFormation stack from your Terraform state, and use that to define the coverage boundary for DevOps Guru. With that, if your resources have unexpected or unusual behavior, DevOps Guru will notify you and provide prescriptive recommendations to help you quickly fix the issue.

If you want to experiment DevOps Guru, AWS offers a free tier for the first three months that includes 7,200 AWS resource hours per month for free on each resource group A and B. Also, you can Estimate Amazon DevOps Guru resource analysis costs from the AWS Management Console. This feature scans selected resources to automatically generate a monthly cost estimate. Furthermore, refer to Gaining operational insights with AIOps using Amazon DevOps Guru to learn more about how DevOps Guru helps you increase your applications’ availability, and check out this workshop for a hands-on walkthrough of DevOps Guru’s main features and capabilities. To learn more about proactive insights, see Generating DevOps Guru Proactive Insights for Amazon ECS. To learn more about anomaly detection, see Anomaly Detection in AWS Lambda using Amazon DevOps Guru’s ML-powered insights.

About the authors

Harish Vaswani

Harish Vaswani is a Senior Cloud Application Architect at Amazon Web Services. He specializes in architecting and building cloud native applications and enables customers with best practices in their cloud journey. He is a DevOps and Machine Learning enthusiast. Harish lives in New Jersey and enjoys spending time with this family, filmmaking and music production.

Rafael Ramos

Rafael is a Solutions Architect at AWS, where he helps ISVs on their journey to the cloud. He spent over 13 years working as a software developer, and is passionate about DevOps and serverless. Outside of work, he enjoys playing tabletop RPG, cooking and running marathons.