Restart Amazon ECS tasks with AWS Lambda and AWS CloudFormation custom resources

Long-running tasks such as web applications in Amazon Elastic Container Service (Amazon ECS) are often configured to read an AWS Secrets Manager secret value at startup. When that secret is rotated in Secrets Manager, all Amazon ECS tasks that use the secret must be restarted to read the new value.

You can restart the tasks of an Amazon ECS service by making an UpdateService API call with the forceNewDeployment option, either from the AWS Management Console or from the AWS Command Line Interface (AWS CLI). However, this option is not available in application environments where changes are allowed only through pipeline deployments. In these situations, you must rebuild and redeploy the container, which can pose operational challenges to organizations with a large number of Amazon ECS deployments.

In this post, I present an approach that programmatically recycles tasks under an Amazon ECS service by using a combination of an AWS Lambda function and an AWS CloudFormation custom resource. My solution is designed to integrate with a pipeline so that Amazon ECS tasks restart whenever the pipeline is deployed.

About this blog post
Time to read: ~10 minutes
Time to complete: ~20 minutes
Cost to complete: ~$1 (see the AWS service documentation for details)
Learning level: Intermediate (200)
AWS services: AWS Lambda, AWS CloudFormation, AWS Identity and Access Management (IAM), Amazon Elastic Container Service (Amazon ECS)

Overview

My solution deploys the following architecture.

Figure 1. Recycling Amazon ECS tasks with a CloudFormation custom resource

  1. A user creates or updates a CloudFormation custom resource through a pipeline deployment. The Amazon ECS cluster and service names are resource properties of the custom resource.
  2. CloudFormation invokes the Lambda function, which is specified as the custom resource's service token, to initiate the process.
  3. The Lambda function extracts the Amazon ECS cluster and service names from the invocation event, and makes an UpdateService API call with the forceNewDeployment option on the service.
  4. The Amazon ECS service recycles all of its tasks.
  5. The Lambda function sends a response back to the custom resource.

In this configuration, the Lambda function is invoked whenever you create, update, or delete the custom resource. You can also pass arguments to the Lambda function by configuring them as properties of the custom resource. This lets you recycle any Amazon ECS service by passing in the corresponding cluster and service names. You can also integrate the solution into a deployment pipeline to restart Amazon ECS tasks in restricted environments.
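To make the invocation event more concrete, here is a minimal sketch of a custom resource request and of how the cluster and service names might be read from it. The field names follow the CloudFormation custom resource request format. The sample values (ARNs, URL, cluster and service names) and the helper name extract_target are placeholders chosen for illustration.

```python
# Illustrative custom resource request. All values are made-up placeholders.
SAMPLE_EVENT = {
    "RequestType": "Create",  # one of Create, Update, or Delete
    "ResponseURL": "https://example.com/presigned-response-url",
    "StackId": "arn:aws:cloudformation:us-east-1:111122223333:stack/demo/abc",
    "RequestId": "11111111-2222-3333-4444-555555555555",
    "LogicalResourceId": "LambdaTrigger",
    "ResourceType": "Custom::LambdaTrigger",
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:us-east-1:111122223333:function:demo",
        "ECSCluster": "demo-cluster",
        "ECSService": "demo-service",
        "ReRunParam": "xxx",
    },
}


def extract_target(event):
    """Return (cluster, service) for Create/Update requests, or None for Delete."""
    if event["RequestType"] == "Delete":
        return None
    props = event["ResourceProperties"]
    return props["ECSCluster"], props["ECSService"]


print(extract_target(SAMPLE_EVENT))
```

Because every property of the custom resource arrives under ResourceProperties, adding another argument only requires a new property in the template and a matching lookup in the function.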

Important: My solution does not apply to Amazon ECS standalone tasks, because they do not run under a service. Also, if you deploy with Terraform, note that it already provides an option to force a new deployment.

Prerequisites

Walkthrough

To get started, sign in to the AWS Management Console.

Step 1: Create an execution role with permissions for the Lambda function

Task 1.1: Create an execution role in the IAM console

  1. Open the IAM console.
  2. In the navigation pane of the IAM console, choose Roles, Create role.
  3. Under Trusted entity type, choose AWS Service.
  4. Under Use case, choose Lambda.
  5. Choose Next.
  6. Under Permission Policies, select AWSLambdaVPCAccessExecutionRole.
  7. Enter a name for the role, and then choose Create role.

Task 1.2: Add permissions to the execution role

  1. Still in the IAM console, navigate to the role that you just created in the previous step.
  2. Under Add permissions, choose Create inline policy.
  3. Under Select a service, choose Elastic Container Service.
  4. Under List, select ListServices.
  5. Under Write, select UpdateService.
  6. Under Resources, select Any in this account.
  7. Choose Next.
  8. Enter the policy name, and choose Create Policy.

For additional details, see Create a role to delegate permissions to an AWS service.
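For readers who prefer to see the role as policy documents, the following sketch approximates what the console steps above produce. It is an assumption-laden simplification: the variable names are mine, and the inline policy uses "Resource": "*" for brevity, whereas the console's "Any in this account" choice may generate narrower ARNs.

```python
import json

# Trust policy: lets the Lambda service assume the execution role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Inline policy: the two ECS actions selected in Task 1.2.
# "Resource": "*" is a simplification made for this sketch.
ECS_INLINE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ecs:ListServices", "ecs:UpdateService"],
            "Resource": "*",
        }
    ],
}

print(json.dumps(ECS_INLINE_POLICY, indent=2))
```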

Step 2: Create the Lambda function

  1. Open the Lambda console.
  2. Choose Create function.
  3. Choose Author from scratch.
  4. Enter the function name and description.
  5. Under Runtime, select Python 3.12.
  6. Under Change default execution role, select Use an existing role.
  7. Under Existing role, select the role that you created in step 1.
  8. Choose Create function.
  9. Paste the following code example in the lambda_function.py tab, and choose Deploy.
import json
import boto3
import botocore
import urllib3

SUCCESS = "SUCCESS"
FAILED = "FAILED"

http = urllib3.PoolManager()

def send(event, context, response_status, response_data, physical_resource_id=None):
    """Report the outcome to CloudFormation through the pre-signed response URL."""
    print('Sending response back to the custom resource ...')

    response_url = event['ResponseURL']
    response_body = {
        'Status' : response_status,
        'Reason' : "See the details in CloudWatch Log Stream: {}".format(context.log_stream_name),
        'PhysicalResourceId' : physical_resource_id or context.log_stream_name,
        'StackId' : event['StackId'],
        'RequestId' : event['RequestId'],
        'LogicalResourceId' : event['LogicalResourceId'],
        'NoEcho' : False,
        'Data' : response_data
    }

    json_response_body = json.dumps(response_body)
    # The content-type header is intentionally left empty for the pre-signed URL.
    headers = {
        'content-type' : '',
        'content-length' : str(len(json_response_body))
    }

    try:
        response = http.request('PUT', response_url, headers=headers, body=json_response_body)
        print('Response sent. HTTP status code: {}'.format(response.status))
    except urllib3.exceptions.HTTPError as e:
        print('Failed to send the response')
        print(e)

def update_ecs_service(cluster_name, service_name, response_data):
    """Force a new deployment of the service and return SUCCESS or FAILED."""
    print('Updating ECS Service {} in Cluster {} ... '.format(service_name, cluster_name))
    try:
        ecs = boto3.client('ecs')
        ecs.update_service(
            cluster=cluster_name,
            service=service_name,
            forceNewDeployment=True
        )
    except botocore.exceptions.ClientError as e:
        # Record the failure so that CloudFormation is told the update failed.
        print('Failed')
        print(e)
        response_data['status'] = FAILED
        return FAILED

    response_data['status'] = SUCCESS
    print('Done')
    return SUCCESS

def lambda_handler(event, context):
    response_data = {}
    status = SUCCESS

    if event['RequestType'] == 'Delete':
        # Nothing to clean up; report success so the stack deletion can proceed.
        print('Stack Delete request. No action taken')
    else:
        ecs_cluster = event['ResourceProperties']['ECSCluster']
        ecs_service = event['ResourceProperties']['ECSService']
        print("Update service request")
        status = update_ecs_service(ecs_cluster, ecs_service, response_data)
    send(event, context, status, response_data)

    return {
        'statusCode': 200,
        'body': status
    }

For additional details, see Create your first Lambda function.
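One thing to keep in mind with custom resources: if the function exits without sending a response, CloudFormation keeps waiting until the custom resource times out. The following is a hedged sketch of one way to guard against that. The names run_with_guaranteed_response, do_work, and send_response are hypothetical; do_work stands in for the ECS update and send_response for whatever function PUTs the result to the response URL.

```python
SUCCESS = "SUCCESS"
FAILED = "FAILED"


def run_with_guaranteed_response(event, do_work, send_response):
    """Run do_work(event) and always report a status through send_response.

    do_work and send_response are hypothetical callables supplied by the
    caller, which keeps this sketch runnable without AWS access.
    """
    status = SUCCESS
    data = {}
    try:
        # Delete requests need no ECS action in this solution.
        if event.get("RequestType") != "Delete":
            data = do_work(event) or {}
    except Exception as exc:
        # Report any failure (for example, a missing property) instead of
        # leaving the stack operation waiting for a response.
        status = FAILED
        data = {"error": str(exc)}
    send_response(status, data)
    return status


# Example: a request that is missing its properties still gets a response.
sent = []
run_with_guaranteed_response(
    {"RequestType": "Create"},
    lambda event: event["ResourceProperties"]["ECSCluster"],
    lambda status, data: sent.append(status),
)
print(sent)
```

The broad except clause is deliberate in this sketch: for a custom resource, reporting FAILED quickly is usually preferable to an unhandled exception that leaves the stack in progress.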

Step 3: Create the CloudFormation template

Perform these steps:

  1. Copy and paste the following code example into a .yaml file on your local machine. This file is the CloudFormation template.
    AWSTemplateFormatVersion: 2010-09-09
    
    Description: Creates a custom resource to trigger a Lambda when deployed
    
    Parameters:
      ECSCluster:
        Description: Name of ECS cluster
        Type: String
      ECSService:
        Description: Name of ECS service to force new deployment
        Type: String
      ReRunParam:
        Type: String
        Description: |
          A dummy param to allow updating the stack when other params remain same
        Default: "xxx"
    
    Resources:
      LambdaTrigger:
        Type: Custom::LambdaTrigger
        Properties:
          ServiceToken: <arn of Lambda function>
          ServiceTimeout: "120"
          ECSCluster: !Ref ECSCluster
          ECSService: !Ref ECSService
          ReRunParam: !Ref ReRunParam
    
  2. Replace the <arn of Lambda function> placeholder with the Amazon Resource Name (ARN) of the Lambda function.
  3. Save the .yaml file to your local machine. You will upload this template file in the next step.

Step 4: Create a CloudFormation custom resource

Perform these steps:

  1. Open the CloudFormation console.
  2. Under Create stack, select With new resources (standard).
  3. Under Specify template, select Upload a template file.
  4. Under Upload a template file, choose Choose file.
  5. Navigate to and choose the .yaml file that you created in the previous step.
  6. Choose Next.
  7. Enter the stack name.
  8. Under Parameters, enter values for the ECSCluster and ECSService parameters.
  9. Choose Next and then Submit.
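In a pipeline, the same stack would typically be created through the API rather than the console. The following is a rough sketch built around the boto3 CloudFormation client's create_stack call. The helper names, the stack name, and the sample values are assumptions for illustration, and cfn_client is passed in so that the helpers can be exercised without AWS access.

```python
def build_parameters(cluster, service, rerun_value):
    """Return stack parameters in the list format that create_stack expects."""
    values = {
        "ECSCluster": cluster,
        "ECSService": service,
        "ReRunParam": rerun_value,
    }
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in values.items()
    ]


def create_restart_stack(cfn_client, stack_name, template_body, parameters):
    """Create the stack; cfn_client is expected to be boto3.client('cloudformation')."""
    return cfn_client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
    )


# Placeholder names; replace them with your own cluster and service.
print(build_parameters("demo-cluster", "demo-service", "run-1"))
```

With a real client, you would read the template file from Step 3 into a string and call create_restart_stack(boto3.client("cloudformation"), "restart-ecs-tasks", template_body, parameters), where "restart-ecs-tasks" is a hypothetical stack name.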

Step 5: Validate the solution

Perform these steps:

  1. Open the Amazon ECS console.
  2. Confirm that the service starts as many new tasks as were running before you created the custom resource.
  3. Confirm that the old tasks are stopped after the new tasks start and pass their health checks.
  4. To restart the same tasks again, update the stack in the CloudFormation console and, in the Parameters section, enter a different value for the ReRunParam parameter.
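The checks and the rerun described in these steps can also be scripted. The sketch below assumes that you already have one entry from the services list returned by the ECS describe_services call, and it treats a service as settled when a single deployment remains and the running count matches the desired count. The function names and sample values are hypothetical.

```python
def is_service_stable(service):
    """Return True when one deployment remains and all desired tasks are running.

    `service` is one entry from the 'services' list returned by the ECS
    describe_services call.
    """
    return (
        len(service["deployments"]) == 1
        and service["runningCount"] == service["desiredCount"]
    )


def rerun_parameters(new_value):
    """Parameters for update_stack: keep cluster and service, change ReRunParam."""
    return [
        {"ParameterKey": "ECSCluster", "UsePreviousValue": True},
        {"ParameterKey": "ECSService", "UsePreviousValue": True},
        {"ParameterKey": "ReRunParam", "ParameterValue": new_value},
    ]


# Made-up describe_services entries for illustration.
rolling = {
    "deployments": [{"status": "PRIMARY"}, {"status": "ACTIVE"}],
    "runningCount": 4,
    "desiredCount": 2,
}
settled = {
    "deployments": [{"status": "PRIMARY"}],
    "runningCount": 2,
    "desiredCount": 2,
}

print(is_service_stable(rolling))
print(is_service_stable(settled))
```

To trigger another restart from a script, the list from rerun_parameters could be passed to the CloudFormation update_stack call together with UsePreviousTemplate=True. Any value that differs from the previous one works, such as a build number or timestamp.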

Cleanup

To avoid incurring future charges, delete the resources you created for this solution in the following sequence:

  1. Delete the CloudFormation stack. For instructions, see Deleting a stack on the AWS CloudFormation console.
  2. Delete the Lambda function. In the Lambda console, select the Lambda function and choose Actions, Delete.
  3. Delete the IAM role. For instructions, see Deleting roles or instance profiles.
  4. Delete the Amazon ECS cluster. For instructions, see Deleting an Amazon ECS cluster.

Conclusion

In this post, I’ve provided a solution for restarting Amazon ECS tasks by using a combination of a Lambda function and a CloudFormation custom resource. You can use this solution for situations such as secrets rotation or launching a new Docker image with the existing task definition in your production environments through a pipeline deployment.

If you have a comment or feedback about this blog post, use the Comments section on this page.

Umesh Salian is a Senior Security Consultant with AWS Professional Services. He enjoys providing design and automation to customers for solving their security concerns. Outside of work, he enjoys watching sports, DIY projects around the house, and traveling to new places.