AWS Cloud Operations Blog

Maintenance Windows: Support for New Task Types Using Amazon EC2 Systems Manager

In Amazon EC2 Systems Manager, the Maintenance Windows service allows you to define a set of tasks, along with the instances where those tasks should be run and a run schedule. In this post, I talk about a new feature for Maintenance Windows: support for new task types.

Maintenance Windows now supports Systems Manager Automation documents, AWS Step Functions tasks, and AWS Lambda functions as tasks, including support for Parameter Store (when using Step Functions and Lambda). This allows you to perform complex workflows on your instances, such as patching a server running SQL Server using an Automation document.

In this post, I show you how to run this example and walk through the required configuration steps one by one.

Walkthrough

In this walkthrough, you learn how to create a maintenance window with an Automation task type. This task stops an instance, creates snapshots of attached EBS volumes using a Lambda function, waits for the snapshots, checks that the snapshots have completed using another Lambda function, restarts the instance, and installs missing Windows updates. The walkthrough includes the following steps:

  • Set up IAM users and roles.
  • Create Lambda functions.
  • Create an Automation document.
  • Create an EC2 instance.
  • Create a maintenance window.
  • Register an Automation task with the maintenance window.

Set up IAM users and roles

Because maintenance windows run on a schedule without a user taking specific actions, you need to create a role that grants the maintenance window the appropriate permissions to run the Automation document you’re creating. Similarly, a role needs to be created for the Automation document that grants the permissions to perform the actions in the document. Finally, create a role for the Lambda function so the function can take EBS snapshots.

  1. Create a user with Systems Manager full access as defined in Create a User Account for Systems Manager.
  2. Create an instance role for Systems Manager as defined in Create a Role for Systems Manager Managed Instances.
  3. Create a role for Systems Manager Automation to perform actions as defined in Create an IAM Role for Automation and Add a Trust Relationship for Automation.
  4. Create a service role for Lambda that allows the Lambda functions to perform actions, as defined in Create an IAM Role for AWS Lambda.
    • Attach a policy: AmazonEC2FullAccess.
    • Add a lambda.amazonaws.com trust relationship (similar to what was done in Add a Trust Relationship for Automation but replacing ssm.amazonaws.com with lambda.amazonaws.com).
  5. Create a role for the Systems Manager maintenance window to perform actions as defined in Controlling Access to Maintenance Windows Using the AWS Console.
    • Attach the iam:PassRole policy to this role for passing the Automation role created earlier (similar to what was defined for the Automation role in Attach the iam:PassRole Policy).
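
The trust relationships and the iam:PassRole policy described in the steps above can be sketched as plain JSON policy documents. This is a minimal illustration rather than the exact policies from the linked guides: the helper names trust_policy and pass_role_policy are hypothetical, and the role ARN reuses the sample account ID and role name from the CLI example later in this post.

```python
import json


def trust_policy(service_principal):
    """Return a trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def pass_role_policy(role_arn):
    """Return a policy that allows iam:PassRole on a single role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "iam:PassRole", "Resource": role_arn}
        ],
    }


# Sample ARN only; substitute the ARN of your own Automation role.
AUTOMATION_ROLE_ARN = "arn:aws:iam::111111111111:role/AutomationRole"

# Trust relationship for the Automation role (step 3).
automation_trust = trust_policy("ssm.amazonaws.com")
# Trust relationship for the Lambda role (step 4).
lambda_trust = trust_policy("lambda.amazonaws.com")
# iam:PassRole policy for the maintenance window role (step 5).
window_pass_role = pass_role_policy(AUTOMATION_ROLE_ARN)

print(json.dumps(lambda_trust, indent=2))
```

The printed JSON can be pasted into the trust relationship editor in the IAM console. The only difference between the two trust policies is the service principal.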

Create Lambda functions

Create two Lambda functions:

  • One to initiate the creation of an EBS snapshot.
  • One to check the status of the snapshot creation so you can wait for the snapshot to be created.

Here are the steps to create these functions.

  1. Open the AWS Lambda console at https://console.aws.amazon.com/lambda/.
  2. Choose Create a Lambda function.
  3. On the Select blueprint page, choose Author from scratch.
  4. On the Configure triggers page, choose Next.
  5. On the Configure function page, for Name, type Automation-CreateSnapshots (the name must match the FunctionName used in the Automation document later in this post), and type an optional description.
  6. For Runtime, choose Python 2.7.
  7. In the Lambda function code section, delete the pre-populated code and paste the code in Table 1.
  8. In the Lambda function handler and role section, for Role, choose the service role for Lambda that you created earlier.
  9. Choose Next, Create function.

Table 1.

from __future__ import print_function

import json
import boto3

print('Loading function')

#Expects instanceIds
def lambda_handler(event, context):
    
    print("Received event: " + json.dumps(event, indent=2))

    # get EC2 client
    ec2 = boto3.client('ec2')

    # find volumes for given instances
    response = ec2.describe_volumes(
        Filters=[
            {
                'Name': 'attachment.instance-id',
                'Values': [ event['instanceIds'] ],
            },
        ]
    )

    # hold list of snapshot ids
    snapshotIdList = []

    # create snapshot of each volume id
    for volume in response['Volumes']:
        print('Creating snapshot for volumeId : %s' % volume['VolumeId'])

        # create snapshot (stored under its own name so the describe_volumes
        # response being iterated is not overwritten)
        snapshot = ec2.create_snapshot(
            Description='New snapshot for test',
            VolumeId=volume['VolumeId'],
            DryRun=False
        )

        print('Created snapshotId : %s' % snapshot['SnapshotId'])

        # add snapshot id to be returned
        snapshotIdList.append(snapshot['SnapshotId'])

    # return the snapshot ids as one comma-separated string
    returnString = ",".join(snapshotIdList)

    return returnString

Then create the second function, which checks the snapshot status:

  1. Repeat steps 1–4 from the first Lambda function.
  2. On the Configure function page, for Name, type Automation-CheckSnapshots (the name must match the FunctionName used in the Automation document later in this post), and type an optional description.
  3. For Runtime, choose Python 2.7.
  4. In the Lambda function code section, delete the pre-populated code and paste the code in Table 2.
  5. In the Lambda function handler and role section, for Role, choose the service role for Lambda that you created earlier.
  6. Choose Next, Create function.

Table 2.

from __future__ import print_function

import json
import boto3

print('Loading function')

#Expects snapshotIds
def lambda_handler(event, context):

    print("Received event: " + json.dumps(event, indent=2))
    
    # get EC2 client
    ec2 = boto3.client('ec2')
    
    # get the snapshotIds passed
    snapshotIds = event['snapshotIds'].split(',')
    
    # check the state of each snapshot
    for snapshotId in snapshotIds:
        response = ec2.describe_snapshots(
            SnapshotIds=[
                snapshotId,
            ],
            DryRun=False
        )

        # if the state is not completed then it can't continue, so throw an error
        for snapshot in response['Snapshots']:
            print('SnapshotId ' + snapshotId + ' in state : %s' % snapshot['State'])

            if snapshot['State'] != 'completed':
                errorString = 'Unable to proceed, snapshot in ' + snapshot['State'] + ' state for: ' + snapshotId
                raise Exception(errorString)
    
    return "Snapshots completed."
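
The hand-off between the two functions (a comma-separated string of snapshot IDs) can be tried locally before deploying anything. The following is a minimal sketch, not the deployed code: StubEC2 is a hypothetical stand-in for boto3.client('ec2') that returns canned data, and create_snapshots and check_snapshots are hypothetical helpers that repeat the logic of Tables 1 and 2 with the client passed in as an argument.

```python
class StubEC2(object):
    """Hypothetical stand-in for boto3.client('ec2'); returns canned data."""

    def __init__(self, volume_ids, snapshot_state):
        self.volume_ids = volume_ids
        self.snapshot_state = snapshot_state

    def describe_volumes(self, Filters):
        return {"Volumes": [{"VolumeId": v} for v in self.volume_ids]}

    def create_snapshot(self, Description, VolumeId, DryRun=False):
        # Derive a fake snapshot ID from the volume ID.
        return {"SnapshotId": VolumeId.replace("vol-", "snap-")}

    def describe_snapshots(self, SnapshotIds, DryRun=False):
        return {"Snapshots": [{"SnapshotId": s, "State": self.snapshot_state}
                              for s in SnapshotIds]}


def create_snapshots(ec2, instance_id):
    """Same logic as Table 1, with the EC2 client passed in."""
    volumes = ec2.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )["Volumes"]
    snapshot_ids = [
        ec2.create_snapshot(Description="New snapshot for test",
                            VolumeId=v["VolumeId"], DryRun=False)["SnapshotId"]
        for v in volumes
    ]
    return ",".join(snapshot_ids)


def check_snapshots(ec2, snapshot_ids_csv):
    """Same logic as Table 2: raise unless every snapshot is completed."""
    for snapshot_id in snapshot_ids_csv.split(","):
        result = ec2.describe_snapshots(SnapshotIds=[snapshot_id], DryRun=False)
        for snapshot in result["Snapshots"]:
            if snapshot["State"] != "completed":
                raise Exception("Unable to proceed, snapshot in %s state for: %s"
                                % (snapshot["State"], snapshot_id))
    return "Snapshots completed."


# Two fake volumes whose snapshots are already completed.
stub = StubEC2(["vol-1", "vol-2"], "completed")
handoff = create_snapshots(stub, "i-000a0a0ca4caf9861")
result = check_snapshots(stub, handoff)
print(handoff)
print(result)
```

Constructing the stub with a state such as "pending" makes check_snapshots raise, which is the behavior the Automation document relies on to abort the workflow.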

Create an Automation document

An Automation document allows you to create your own custom workflow using a series of steps. Automation has several predefined actions that when stitched together can create complex and robust workflows.

In this example, you use a few of these actions (aws:changeInstanceState, aws:invokeLambdaFunction, aws:sleep, and aws:runCommand) to build a custom workflow. The steps in this Automation document perform the following actions:

  • Stop an EC2 instance
  • Initiate the creation of volume snapshots with a Lambda function
  • Wait a specified time while the snapshots are created
  • Use another Lambda function to check if the snapshots have been successfully created
  • Restart the EC2 instance
  • Use a Run Command document to install updates on the EC2 instance

Here are the steps to create the document.

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/v2/home.
  2. In the navigation pane, choose Documents, Create Document.
  3. For Name, type CreateVolumeSnapshots.
  4. For Document Type, choose Automation.
  5. Delete the brackets in the Content box, and then paste the following JSON document.
  6. Choose Create Document.
{
  "schemaVersion":"0.3",
  "description":"Create EBS snapshots and update a Windows instance.",
  "assumeRole":"{{AutomationAssumeRole}}",
  "parameters":{
    "instanceId":{
      "type":"String",
      "description":"(Required) Id of the instance"
    },
    "AutomationAssumeRole":{
      "type":"String",
      "description":"(Required) Role under which to execute this automation."
    },
    "SnapshotTimeout":{
      "type":"String",
      "description":"(Required) Timeout for checking for snapshot completion.",
      "default": "PT20M"
    }
  },
  "mainSteps":[
    {
      "name":"stopInstance",
      "action":"aws:changeInstanceState",
      "maxAttempts":1,
      "timeoutSeconds":300,
      "onFailure":"Abort",
      "inputs":{
        "InstanceIds":[
          "{{ instanceId }}"
        ],
        "DesiredState":"stopped"
      }
    },
    {
      "name":"createVolumeSnapshots",
      "action":"aws:invokeLambdaFunction",
      "timeoutSeconds":120,
      "maxAttempts":1,
      "onFailure":"Abort",
      "inputs":{
        "FunctionName":"Automation-CreateSnapshots",
        "Payload":"{\"instanceIds\":\"{{instanceId}}\"}"
      }
    },
    {
      "name":"waitForSnapshotsToCreate",
      "action":"aws:sleep",
      "inputs":{
        "Duration":"{{ SnapshotTimeout }}"
      }
    },
    {
      "name":"checkVolumeSnapshots",
      "action":"aws:invokeLambdaFunction",
      "timeoutSeconds":120,
      "maxAttempts":1,
      "onFailure":"Abort",
      "inputs":{
        "FunctionName":"Automation-CheckSnapshots",
        "Payload":"{\"snapshotIds\":\"{{createVolumeSnapshots.Payload}}\"}"
      }
    },
    {
      "name":"startInstance",
      "action":"aws:changeInstanceState",
      "maxAttempts":1,
      "timeoutSeconds":600,
      "onFailure":"Abort",
      "inputs":{
        "InstanceIds":[
          "{{ instanceId }}"
        ],
        "DesiredState":"running"
      }
    },
    {
      "name":"installMissingWindowsUpdates",
      "action":"aws:runCommand",
      "maxAttempts":1,
      "timeoutSeconds":14400,
      "onFailure":"Continue",
      "inputs":{
         "DocumentName":"AWS-InstallWindowsUpdates",
         "InstanceIds":[
            "{{ instanceId }}"
         ],
         "Parameters":{}
      }
    }
  ]
}
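
Before scheduling the document, it can help to run it once on demand. The following is a minimal sketch of how that might look with the SDK for Python (Boto), assuming boto3 is installed and credentials are configured. The helper names automation_parameters and run_document_once are hypothetical. The parameter names come from the document above, and each value is wrapped in a list because Automation expects a list of strings per parameter.

```python
def automation_parameters(instance_id, automation_role_arn,
                          snapshot_timeout="PT20M"):
    """Build the Parameters map for the CreateVolumeSnapshots document."""
    return {
        "instanceId": [instance_id],
        "AutomationAssumeRole": [automation_role_arn],
        "SnapshotTimeout": [snapshot_timeout],
    }


def run_document_once(ssm, instance_id, automation_role_arn):
    """Start one execution of the document and return its execution ID.

    `ssm` is expected to be boto3.client('ssm') or a compatible object.
    """
    response = ssm.start_automation_execution(
        DocumentName="CreateVolumeSnapshots",
        Parameters=automation_parameters(instance_id, automation_role_arn),
    )
    return response["AutomationExecutionId"]


# Sample values taken from the CLI examples in this post.
params = automation_parameters(
    "i-000a0a0ca4caf9861",
    "arn:aws:iam::111111111111:role/AutomationRole",
)
print(params)
```

With real credentials, calling run_document_once(boto3.client("ssm"), instance_id, role_arn) would start the workflow immediately, so use it only against an instance that can safely be stopped.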

Create an EC2 instance

To try out the Automation document, you need to have an instance running and managed by Systems Manager.  For this example, use a Windows instance.

Create an EC2 instance that uses the Systems Manager instance role created earlier. For more information, see Create an Amazon EC2 Instance that Uses the Systems Manager Role.

Create a maintenance window

Maintenance Windows let customers set up recurring schedules to perform defined actions on their instances.  Each maintenance window has a schedule, duration, set of registered targets, and set of registered tasks to be performed against the targets. In this example, you create a maintenance window so it can be the initiator of the Automation task created earlier.

Create a maintenance window and assign the new instance as a target. For more information, see Maintenance Window Console Walkthrough.
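
As an alternative to the console, the same two actions can be sketched with the SDK for Python (Boto). This is an illustration under stated assumptions rather than a required step: the helper names are hypothetical, the cron schedule (Sundays at 02:00 UTC) is an arbitrary example, and the name, duration, and cutoff reuse the sample values shown in the CLI output later in this post.

```python
def maintenance_window_request(name, schedule, duration_hours, cutoff_hours):
    """Build keyword arguments for ssm.create_maintenance_window."""
    return {
        "Name": name,
        "Schedule": schedule,
        "Duration": duration_hours,   # window length, in hours
        "Cutoff": cutoff_hours,       # stop starting new tasks this many hours before the end
        "AllowUnassociatedTargets": False,
    }


def target_request(window_id, instance_id):
    """Build keyword arguments for ssm.register_target_with_maintenance_window."""
    return {
        "WindowId": window_id,
        "ResourceType": "INSTANCE",
        "Targets": [{"Key": "InstanceIds", "Values": [instance_id]}],
    }


def create_window_with_target(ssm, instance_id):
    """Create the window, register one instance, and return both IDs.

    `ssm` is expected to be boto3.client('ssm') or a compatible object.
    """
    window = ssm.create_maintenance_window(
        **maintenance_window_request("MW1", "cron(0 2 ? * SUN *)", 2, 1)
    )
    target = ssm.register_target_with_maintenance_window(
        **target_request(window["WindowId"], instance_id)
    )
    return window["WindowId"], target["WindowTargetId"]


window_kwargs = maintenance_window_request("MW1", "cron(0 2 ? * SUN *)", 2, 1)
target_kwargs = target_request("mw-abc1234e3ddc9e286", "i-000a0a0ca4caf9861")
print(window_kwargs)
print(target_kwargs)
```

Keeping the request builders separate from the client call makes it possible to inspect the arguments before anything is created in the account.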

Register an Automation task with the maintenance window

With the maintenance window created, you can now register an Automation task to run the Automation document and pass it the necessary parameters.

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/v2/home.
  2. In the navigation pane, choose Maintenance Windows and select the new maintenance window. Enter the following values:
    • For Actions, choose Register automation task.
    • For Document, choose the CreateVolumeSnapshots document.
    • For Task Priority, specify a priority for this task. 1 is the highest priority. Tasks in a maintenance window are scheduled in priority order with tasks that have the same priority scheduled in parallel.
    • In the Target by section, choose Selecting registered target groups and select the target that you created earlier.
    • In the Parameters section, for Execute on, specify 1. For Stop after, specify 1.
    • For Role, specify the maintenance window role ARN created earlier.
    • In the Input Parameters section, for AutomationAssumeRole, specify the Automation role ARN. For SnapshotTimeout, enter how long to wait for the snapshots to complete. For instanceId, type the instance ID of the instance created earlier.
  3. Choose Register automation task.

Most of the steps for registering a task in the AWS Management Console can be executed using AWS CLI commands instead. These steps could just as easily be executed using the AWS SDK for Java, the SDK for Python (Boto), or any of the other supported languages. This gives you many options when working with an AWS service. The following code examples use the AWS CLI to find the maintenance window and its registered target, and then register an Automation task that uses the Automation document created earlier.

  1. If needed, follow the steps to get started on creating a maintenance window with the AWS CLI in Maintenance Window CLI Walkthrough.
  2. Run the following CLI command and find the WindowId value for the maintenance window created earlier.
    aws ssm describe-maintenance-windows
    This command returns output like the following. The values for WindowId, Name, Description, and other fields are fictional examples:

    {
        "WindowIdentities": [
            {
                "WindowId": "mw-abc1234e3ddc9e286",
                "Name": "MW1",
                "Description": "MW1 description",
                "Enabled": true,
                "Duration": 2,
                "Cutoff": 1
            }
        ]
    }
  3. Run the following CLI command and find the WindowTargetId value for the instance that was assigned when you created the maintenance window.

    aws ssm describe-maintenance-window-targets --window-id "mw-abc1234e3ddc9e286"

This command returns the following:

{
    "Targets": [
        {
            "WindowId": "mw-abc1234e3ddc9e286",
            "WindowTargetId": "2ecce06f-130c-41a3-870c-d36deff6cbba",
            "ResourceType": "INSTANCE",
            "Targets": [
                {
                    "Key": "InstanceIds",
                    "Values": [
                        "i-000a0a0ca4caf9861"
                    ]
                }
            ],
            "OwnerInformation": "test",
            "Name": "test",
            "Description": "test description"
        }
    ]
}

  4. Run the following CLI command to register the Automation task with the maintenance window.

aws ssm register-task-with-maintenance-window --window-id "mw-abc1234e3ddc9e286" --targets "Key=WindowTargetIds,Values=2ecce06f-130c-41a3-870c-d36deff6cbba" --task-arn "CreateVolumeSnapshots" --service-role-arn "arn:aws:iam::111111111111:role/MaintenanceWindowRole" --task-type "AUTOMATION" --task-invocation-parameters "Automation={Parameters={instanceId=i-000a0a0ca4caf9861,AutomationAssumeRole=arn:aws:iam::111111111111:role/AutomationRole,SnapshotTimeout=PT20M}}" --priority 1 --max-concurrency 1 --max-errors 1 --name "Automation_Task1" --description "Automation_Task1 description"

This command returns the following:

{
  "WindowTaskId": "563e10e1-7b9c-4285-8c0c-cb94912b2839"
}
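
The same registration can be sketched with the SDK for Python (Boto). This is a minimal illustration that mirrors the CLI commands above, not a tested deployment script. The helper names find_window_target_id and automation_task_request are hypothetical, and the sample response is a trimmed copy of the fictional CLI output shown earlier.

```python
def find_window_target_id(targets_response, instance_id):
    """Return the WindowTargetId registered for instance_id, or None."""
    for entry in targets_response["Targets"]:
        for target in entry["Targets"]:
            if target["Key"] == "InstanceIds" and instance_id in target["Values"]:
                return entry["WindowTargetId"]
    return None


def automation_task_request(window_id, window_target_id, instance_id,
                            window_role_arn, automation_role_arn):
    """Build keyword arguments for ssm.register_task_with_maintenance_window."""
    return {
        "WindowId": window_id,
        "Targets": [{"Key": "WindowTargetIds", "Values": [window_target_id]}],
        "TaskArn": "CreateVolumeSnapshots",
        "ServiceRoleArn": window_role_arn,
        "TaskType": "AUTOMATION",
        "TaskInvocationParameters": {
            "Automation": {
                "Parameters": {
                    "instanceId": [instance_id],
                    "AutomationAssumeRole": [automation_role_arn],
                    "SnapshotTimeout": ["PT20M"],
                }
            }
        },
        "Priority": 1,
        "MaxConcurrency": "1",   # the API takes these two values as strings
        "MaxErrors": "1",
        "Name": "Automation_Task1",
        "Description": "Automation_Task1 description",
    }


# Trimmed copy of the fictional describe-maintenance-window-targets output.
SAMPLE_TARGETS = {
    "Targets": [
        {
            "WindowId": "mw-abc1234e3ddc9e286",
            "WindowTargetId": "2ecce06f-130c-41a3-870c-d36deff6cbba",
            "ResourceType": "INSTANCE",
            "Targets": [
                {"Key": "InstanceIds", "Values": ["i-000a0a0ca4caf9861"]}
            ],
        }
    ]
}

target_id = find_window_target_id(SAMPLE_TARGETS, "i-000a0a0ca4caf9861")
task_kwargs = automation_task_request(
    "mw-abc1234e3ddc9e286",
    target_id,
    "i-000a0a0ca4caf9861",
    "arn:aws:iam::111111111111:role/MaintenanceWindowRole",
    "arn:aws:iam::111111111111:role/AutomationRole",
)
print(target_id)
```

With real credentials, passing these arguments as boto3.client("ssm").register_task_with_maintenance_window(**task_kwargs) should return a WindowTaskId, as the CLI command does.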

View the maintenance window execution

A maintenance window is executed based on the schedule that was specified.  After the maintenance window execution, results can be viewed in the History tab on the selected maintenance window landing page.
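
The execution history can also be retrieved programmatically. The following is a minimal sketch that summarizes a response shaped like the output of the describe_maintenance_window_executions call in the SDK for Python (Boto). The helper name status_counts is hypothetical, and the execution IDs in the sample are made up for illustration.

```python
from collections import Counter


def status_counts(executions_response):
    """Count maintenance window executions by their Status value."""
    return Counter(
        execution["Status"]
        for execution in executions_response["WindowExecutions"]
    )


# Hypothetical response; real IDs and statuses come from
# boto3.client('ssm').describe_maintenance_window_executions(WindowId=...).
SAMPLE_EXECUTIONS = {
    "WindowExecutions": [
        {"WindowId": "mw-abc1234e3ddc9e286",
         "WindowExecutionId": "example-execution-1", "Status": "SUCCESS"},
        {"WindowId": "mw-abc1234e3ddc9e286",
         "WindowExecutionId": "example-execution-2", "Status": "FAILED"},
        {"WindowId": "mw-abc1234e3ddc9e286",
         "WindowExecutionId": "example-execution-3", "Status": "SUCCESS"},
    ]
}

counts = status_counts(SAMPLE_EXECUTIONS)
print(dict(counts))
```

A nonzero count for FAILED or TIMED_OUT is a cue to open the History tab and inspect the task details for that execution.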

Conclusion

In this post, I showed you how to use the newly launched task types in Maintenance Windows to schedule and automate the execution of many common systems administration tasks. Using the maintenance window, you can now create different types of tasks, run your tasks when needed on specified targets, and get notified about any problems running these tasks. This helps you get the status on scheduled tasks, with details of the errors, enabling you to take appropriate action.


About the Authors

Lavanya Krishnan is a Technical Product Manager in the EC2 Systems Manager team responsible for Patch Manager and Maintenance Window capabilities. Lavanya is passionate about building Enterprise and Cloud Products and Services. When not working, she loves to spend time with family, travel, read books and play board games.

 

 

Tracy Williams is a Software Development Manager in the EC2 Systems Manager team and is responsible for Maintenance Window Capabilities. In his free time Tracy enjoys hiking, movies and sports cars.