AWS Cloud Operations Blog
Controlling your AWS costs by deleting unused Amazon EBS volumes
Customers across industries and verticals consider cost management as one of their top priorities. Limited visibility into a volume’s lifecycle can result in costs for unutilized resources. AWS builds cost-management products to access, organize, understand, control, and optimize costs on AWS.
Unused and overlooked Amazon EBS volumes contribute to AWS costs. The lifecycle of EBS volumes can be independent of Amazon EC2 compute instances. Therefore, even if the EC2 instance associated with the EBS volumes is terminated, the EBS volumes tend to persist unless you select the Delete on Termination option at launch. Also, instances spun up and down as part of development and testing cycles may leave orphaned EBS volumes if there are no workflows in use to delete them automatically. These orphaned EBS volumes accrue charges while unattached.
In this post, we walk you through an automated process using OpsCenter, a recently launched feature of AWS Systems Manager, to identify and manage EBS volumes that are unused and detached from an EC2 instance.
Overview
First, we use Amazon CloudWatch Events to invoke an AWS Lambda function periodically. This Lambda function parses AWS CloudTrail for EBS events and creates operational work items (OpsItems) for EBS volumes that are in the available state and have not been attached to an EC2 instance for a user-definable time period.
Next, we demonstrate how to associate and run automation steps to snapshot your volumes. This increased visibility into your unused volumes enables you to take actions on these volumes using AWS Systems Manager Automation, simplifying common maintenance and deployment tasks of EC2 instances and other AWS resources.
Systems Manager gives you visibility into and control of your infrastructure on AWS, providing a single pane of glass view of your operational data from multiple AWS services and allowing you to automate operational tasks across your AWS resources. OpsCenter is designed to reduce mean time to resolution (MTTR) for issues impacting your resources, whether in the cloud or in your datacenter. It also provides a central location where operations engineers and IT professionals can view, investigate, and resolve OpsItems related to your resources. The process workflow is shown in the following diagram.
Walkthrough
To run a Lambda function periodically and examine CloudTrail for entries related to EBS volumes, we use a scheduled CloudWatch Events rule. The Lambda function, hosted on GitHub, examines actions in CloudTrail to determine whether an EBS volume has been detached and in the available state for a user-defined length of time. CloudTrail logs are stored for 90 days, allowing you to review events and identify volumes based on activity within this timeframe.
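The age check described above can be sketched as follows. This is a minimal illustration, not the code from the GitHub repository: the function name `find_aged_volumes` and its input shapes are assumptions. In a real function, the detach times would come from CloudTrail detach events and the set of available volumes from the EC2 API.

```python
from datetime import datetime, timedelta, timezone


def find_aged_volumes(last_detached, available_ids, ignore_window_days, now=None):
    """Return sorted IDs of available volumes detached at least
    `ignore_window_days` ago.

    last_detached: dict mapping volume ID -> timezone-aware detach time
                   (hypothetical shape, e.g. built from CloudTrail events).
    available_ids: set of volume IDs currently in the available state.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=ignore_window_days)
    # Volumes with no detach event in `last_detached` are not considered
    # by this sketch.
    return sorted(
        vol_id
        for vol_id, detached_at in last_detached.items()
        if vol_id in available_ids and detached_at <= cutoff
    )
```

A volume that was detached recently, or that has since been reattached (and so is no longer available), is left out of the result.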
Once the Lambda function identifies a volume as unused, it leverages OpsCenter APIs to create OpsItems within OpsCenter. OpsItems contain information about findings, such as:
- Out-of-compliance EC2 instances
- Operating system patch levels
- Orphaned EBS volumes (as demonstrated in this post)
OpsItems can include up to 100 related resources batched together. For example, if you have 250 EBS volumes identified as unattached and available, and the Lambda function is configured to include 100 related resources per OpsItem, the Lambda function will publish three OpsItems to Systems Manager: two with 100 related resources and one with 50 related resources.
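The batching arithmetic in this example can be reproduced with a short sketch. The helper name `batch_volumes` is hypothetical; only the limit of 100 related resources per OpsItem comes from the text above.

```python
def batch_volumes(volume_ids, batch_size=100):
    """Split a list of volume IDs into consecutive batches of at most batch_size."""
    if not 1 <= batch_size <= 100:
        raise ValueError("batch_size must be between 1 and 100")
    return [
        volume_ids[start:start + batch_size]
        for start in range(0, len(volume_ids), batch_size)
    ]


volumes = [f"vol-{i:04d}" for i in range(250)]
batches = batch_volumes(volumes, batch_size=100)
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Each batch would then become one OpsItem.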
Batching related resources is useful if you have a large number of volumes. Also, a notification is triggered from the Lambda function to an Amazon SNS topic and its subscribers for each OpsItem. Once these OpsItems are published, the findings are available to review. You can click through to the specific resources flagged in the OpsItem’s list of related resources and use Automation actions to resolve the issue.
In this post, we will associate a default Automation document, AWS-CreateSnapshot, to the OpsItems to demonstrate the functionality of taking automated actions. You can either create your own Automation document or use one of the preexisting ones created by AWS to take additional steps on your orphaned volumes (such as snapshotting and then deleting them) to save costs and eliminate orphaned EBS volumes.
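As a rough sketch of how an OpsItem with related resources and a default Automation document might be assembled, the following builds the keyword arguments for the Systems Manager `create_ops_item` API call. The helper name, title, description, and source strings are hypothetical, and the exact value format of the `/aws/automations` entry (in particular the `automationType` string) is an assumption that you should verify against the OpsCenter documentation. This is not the code from the GitHub repository.

```python
import json


def build_ops_item_request(volume_arns, automation_id, sns_arn, ignore_window_days):
    """Build keyword arguments for the SSM create_ops_item call (illustrative)."""
    return {
        # Title, Description, and Source are hypothetical example values.
        "Title": f"Unattached EBS volumes older than {ignore_window_days} days",
        "Description": "EBS volumes in the available state that exceeded the ignore window.",
        "Source": "opsCenterAgedEBSVolumeFinder",
        # SNS topic that receives a notification for this OpsItem.
        "Notifications": [{"Arn": sns_arn}],
        "OperationalData": {
            # Related resources: a JSON-encoded list of resource ARNs.
            "/aws/resources": {
                "Value": json.dumps([{"arn": arn} for arn in volume_arns]),
                "Type": "SearchableString",
            },
            # Default Automation document (value format is an assumption).
            "/aws/automations": {
                "Value": json.dumps(
                    [{"automationType": "AWS:SSM:Automation",
                      "automationId": automation_id}]
                ),
                "Type": "SearchableString",
            },
        },
    }
```

With boto3 available, the result could be passed along as `boto3.client("ssm").create_ops_item(**request)`.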
Solution
To get started with the process of setting up this solution, follow these steps:
Create an Amazon SNS topic:
- In the AWS Management Console, navigate to Amazon SNS in your Region of choice and create a new topic. We use us-west-2 to demonstrate the process. Provide a name and accept all the default settings. Note the topic’s ARN, which you need in subsequent steps. It has this format:
arn:aws:sns:us-west-2:123456789012:MyTopic
For details on creating an SNS topic, see the Quick Start Tutorial.
- In the SNS console, create an email subscription for the SNS topic created in step 1. For more information, see Subscribing an Endpoint to an Amazon SNS Topic.
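Because the topic ARN is reused in later steps, it can help to check it before pasting it into the Lambda configuration. The following sketch splits an ARN of the format shown above into its parts. The helper name `parse_sns_topic_arn` is hypothetical and the validation is deliberately simple.

```python
def parse_sns_topic_arn(arn):
    """Split an SNS topic ARN into its parts; raise ValueError if it is malformed."""
    parts = arn.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sns":
        raise ValueError(f"Not an SNS topic ARN: {arn!r}")
    _, partition, _, region, account_id, topic_name = parts
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError(f"Unexpected account ID in ARN: {arn!r}")
    if not region or not topic_name:
        raise ValueError(f"Missing Region or topic name in ARN: {arn!r}")
    return {
        "partition": partition,
        "region": region,
        "account_id": account_id,
        "topic_name": topic_name,
    }


topic = parse_sns_topic_arn("arn:aws:sns:us-west-2:123456789012:MyTopic")
print(topic["region"], topic["topic_name"])  # us-west-2 MyTopic
```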
Create an AWS Lambda function:
- Navigate to the IAM console and create a role for Lambda function execution with the provided JSON policy. This policy grants the Lambda function the basic Lambda execution permissions, the ability to read CloudTrail logs, and access to EC2 resources (such as EBS volumes) and Systems Manager. For more information, see Create and Attach Your First Customer Managed Policy.
- Navigate to the Lambda console and create a function with the following basic details, as shown in the following screenshot:
- For Function name, enter opsCenterAgedEBSVolumeFinder.
- For Runtime, from the dropdown list, select Python 3.6 or Python 3.7.
- Under Permissions:
- For Execution role, from the dropdown list, select Use an existing role.
- For Existing role, select the role created in the previous step for the Lambda function.
- Choose Create Function.
- Next, in the Lambda console of the newly created function, scroll down to the Environment variables section and enter the following five key-value parameters, also shown in the following screenshot:
- For BATCH_SIZE, enter 20. This parameter is the number of EBS volumes that the Lambda function batches together into one OpsItem published to Systems Manager OpsCenter. The maximum batch size is 100.
- For DETAILED_NOTIFICATIONS, enter True to send a detailed notification to the SNS topic. Alternatively, you can send a brief notification by entering False.
- For IGNORE_WINDOW, enter a value between 0 and 90. We use 15 in our example below. This parameter specifies how long a volume can be available since being detached before being considered orphaned. CloudTrail logs are retained for a maximum of 90 days.
- For SNS_ARN, provide the ARN of the SNS topic you created in step 1.
- For SSM_AUTOMATION_ID, enter AWS-CreateSnapshot. This parameter is the default Automation document that you want to associate with the OpsItems that the Lambda function writes to OpsCenter. In this blog post, we use a preexisting Automation document to create a snapshot of the EBS volume. Using the Systems Manager Automation Documents Reference, you can create a custom document, upload it to Systems Manager, and associate it with your OpsItems as the default Automation action.
- In the Lambda console, scroll down to the Function Code section, and under Handler, update to opsCenterAgedEBSVolumeFinder.lambda_handler, as shown in the following diagram.
- In the Lambda console, under Basic settings, update Memory to 256 MB and Timeout to 3 minutes. Choose Save to update the function.
- In a new directory on your laptop or EC2 instance, download the script for the Lambda function that examines your CloudTrail logs to identify the unused EBS volumes, then follow these steps:
- Navigate to the folder where you saved the Python file and create a requirements.txt file with the following botocore, boto3, and awscli versions to run the function:
boto3==1.9.170
botocore==1.12.171
awscli==1.16.181
Save this file in the same directory as the Lambda function. The requirements.txt file is necessary to ensure that the Lambda execution environment has the necessary versions of the libraries packaged together for the function to execute correctly. For a quick rundown on packaging boto3 and botocore, see Automate your DynamoDB backups with Serverless in less than 5 minutes.
Note
You can package the Lambda function with any versions of boto3, botocore, and awscli that are equal to or later than the ones specified here.
- From the directory where you placed the Lambda function and the requirements.txt file, run the following command to create local copies of the boto3, botocore, and awscli packages (include the “.” at the end):
MacOS/Linux: pip install --upgrade -r requirements.txt -t .
Windows: pip install -r requirements.txt -t .
- Recursively zip the contents of the directory where the Lambda function resides to create a deployment package using the following command, which is run from the Lambda function’s directory (include the “.” at the end):
MacOS/Linux: zip -r9 ../opsCenterAgedEBSVolumeFinder.zip .
Windows: 7z.exe a -r c:\code\opsCenterAgedEBSVolumeFinder.zip .
- Upload the package to Lambda by updating the existing function (replace your function name, Region, and zip file name if needed):
aws lambda update-function-code --region us-west-2 --function-name opsCenterAgedEBSVolumeFinder --zip-file fileb://opsCenterAgedEBSVolumeFinder.zip
- Note: This command assumes that it is run from the directory where the zip file exists; alternatively, you can provide the absolute path to the file.
- After the package successfully uploads, navigate to the console and test the Lambda function.
- To manually invoke your Lambda function, choose Test.
- Choose Create a new test event and leave the default template in place.
- Enter an Event name, then choose Create.
- Choose Test to invoke the Lambda function.
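To illustrate how the five environment variables configured earlier might be read and validated inside the function, here is a minimal sketch. The `Settings` class, the `load_settings` helper, and the fallback defaults are assumptions for illustration. The actual script on GitHub may handle these values differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    batch_size: int
    detailed_notifications: bool
    ignore_window_days: int
    sns_arn: str
    ssm_automation_id: str


def load_settings(env=os.environ):
    """Read and validate the function's configuration from environment variables."""
    # Fallback defaults below mirror the values used in this post (assumed).
    batch_size = int(env.get("BATCH_SIZE", "20"))
    if not 1 <= batch_size <= 100:
        raise ValueError("BATCH_SIZE must be between 1 and 100")
    ignore_window = int(env.get("IGNORE_WINDOW", "15"))
    if not 0 <= ignore_window <= 90:
        raise ValueError("IGNORE_WINDOW must be between 0 and 90")
    detailed = env.get("DETAILED_NOTIFICATIONS", "False").strip().lower() == "true"
    return Settings(
        batch_size=batch_size,
        detailed_notifications=detailed,
        ignore_window_days=ignore_window,
        sns_arn=env["SNS_ARN"],  # required; raises KeyError if missing
        ssm_automation_id=env.get("SSM_AUTOMATION_ID", "AWS-CreateSnapshot"),
    )
```

Validating the values when the function starts turns a mistyped setting into a clear error in the test invocation rather than an unexpected OpsItem batch.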
Review the OpsItems and run an Automation task on a non-compliant EBS volume:
- Now that the Lambda function has run to identify any EBS volumes which are non-compliant, we can review the findings. Navigate to the Systems Manager console and, from the left panel, select OpsCenter. You should be able to see several OpsItems on the dashboard.
- In the OpsItem summary status area, choose the number indicating the quantity of Open and in progress OpsItems.
- You will see a list of OpsItems batches. Select the batch ID to see the associated resources.
- On the next screen, you should see a list of Resource IDs of unattached EBS volumes that are older than the number of days specified in the IGNORE_WINDOW environment variable of the Lambda function.
- Select the radio button for one of the OpsItem’s related resources and choose Run Automation to see the associated Automation document, AWS-CreateSnapshot.
- You can see that the VolumeId field has been auto-populated for the automation to be run. Provide a description (optional) and AutomationAssumeRole (optional), and then choose Execute to start the Automation task.
- A status message at the top of the screen shows that the automation is executing. Check the status by choosing View automation status on the upper right.
A new tab opens, where you can see the steps executed for the automation, the snapshot ID that resulted, and additional details such as start and end times.
Note
You can create your own custom Automation document for snapshotting and deleting a volume or simply deleting a volume as needed, and associate it to the OpsItems that are published by the Lambda function.
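The same Automation task can also be started programmatically. The sketch below builds the keyword arguments for the Systems Manager `start_automation_execution` API call, using the three AWS-CreateSnapshot inputs shown in the console steps above (VolumeId, Description, and AutomationAssumeRole). The helper name is hypothetical.

```python
def build_snapshot_automation_request(volume_id, description=None, assume_role_arn=None):
    """Build keyword arguments for the SSM start_automation_execution call."""
    # Automation parameters are passed as lists of strings.
    parameters = {"VolumeId": [volume_id]}
    if description:
        parameters["Description"] = [description]
    if assume_role_arn:
        parameters["AutomationAssumeRole"] = [assume_role_arn]
    return {"DocumentName": "AWS-CreateSnapshot", "Parameters": parameters}


request = build_snapshot_automation_request(
    "vol-0123456789abcdef0",  # placeholder volume ID
    description="Snapshot before cleanup",
)
print(request["Parameters"]["VolumeId"])  # ['vol-0123456789abcdef0']
```

With boto3 available, the request could be submitted as `boto3.client("ssm").start_automation_execution(**request)`.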
Schedule the Lambda Function:
You can now schedule the Lambda function to be automatically invoked at your preferred frequency using a CloudWatch Events rule. Use the schedule option to specify a CRON style expression to define a specific time and schedule to trigger the execution of the Lambda function automatically.
For this post, we set the CRON expression to invoke the Lambda function once daily at 23:00 UTC. To set this schedule:
- Navigate to the CloudWatch service console and in the left panel, under the Events option, select Rules. Choose Create Rule. For more information, see Creating a CloudWatch Events Rule That Triggers on a Schedule.
- As shown in the following screenshot, select the Schedule option and specify a CRON expression. You can use the following expression to trigger every day at 23:00 UTC:
0 23 * * ? *
- In the Targets section, select the Lambda function created earlier for identifying unused EBS volumes, then choose Configure details.
- Under Rule definition, specify a name and description, make sure the State is enabled, and then choose Create rule.
The Lambda function will now automatically get triggered every day to publish OpsItem batches of unused EBS volumes to Systems Manager OpsCenter and publish a notification to the SNS topic.
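If you prefer to script the schedule rather than use the console, the rule and its target can be expressed as parameters for the CloudWatch Events `put_rule` and `put_targets` API calls. The following is a sketch under assumptions: the helper name, rule description, and target ID are hypothetical, and the Lambda function also needs a resource-based permission that allows events.amazonaws.com to invoke it.

```python
def build_schedule_requests(rule_name, lambda_arn, cron_fields="0 23 * * ? *"):
    """Build keyword arguments for the put_rule and put_targets calls."""
    # CloudWatch Events cron expressions have six fields:
    # minutes, hours, day-of-month, month, day-of-week, year.
    if len(cron_fields.split()) != 6:
        raise ValueError("cron expression must have six fields")
    put_rule = {
        "Name": rule_name,
        "ScheduleExpression": f"cron({cron_fields})",
        "State": "ENABLED",
        "Description": "Daily scan for unused EBS volumes",  # hypothetical text
    }
    put_targets = {
        "Rule": rule_name,
        "Targets": [{"Id": "1", "Arn": lambda_arn}],  # target ID is arbitrary
    }
    return put_rule, put_targets


rule, targets = build_schedule_requests(
    "DailyAgedEBSVolumeScan",  # hypothetical rule name
    "arn:aws:lambda:us-west-2:123456789012:function:opsCenterAgedEBSVolumeFinder",
)
print(rule["ScheduleExpression"])  # cron(0 23 * * ? *)
```

With boto3 available, these could be passed to `boto3.client("events").put_rule(**rule)` and `put_targets(**targets)`.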
Conclusion
In this post, we demonstrated using OpsCenter APIs within a Lambda function to create OpsItems identifying orphaned EBS volumes, and showed how AWS Systems Manager Automation can be used to save the volume data through snapshots. You can also use Automation to snapshot and then delete the EBS volumes, managing your costs on AWS. You can now get started with the new Systems Manager OpsCenter feature to centrally identify and manage various AWS resources.
We welcome your suggestions and feedback.
Authors
Josh Zeiser is a Senior Technical Account Manager at AWS. He lives in the San Francisco Bay area and helps customers architect and optimize applications on AWS. In his spare time, he enjoys cooking and traveling with his family.
Sona Rajamani is a Senior Solutions Architect at AWS. She lives in the San Francisco Bay area and helps customers architect and optimize applications on AWS. In her spare time, she enjoys hiking and traveling.
Ballu Singh is a Principal Solutions Architect at AWS. He lives in the San Francisco Bay area and helps customers architect and optimize applications on AWS. In his spare time, he enjoys reading and spending time with his family.