AWS DevOps & Developer Productivity Blog
Optimize AWS CloudFormation Templates
The following post is by guest blogger Julien Lépine, Solutions Architect at AWS. He explains how to optimize templates so that AWS CloudFormation quickly deploys your environments.
Customers sometimes ask me if there’s a way to optimize large AWS CloudFormation templates, which can take several minutes to deploy a stack. Often stack creation is slow because one resource depends on the availability of another resource before it can be provisioned. Examples include:
- A front-end web server that has a dependency on an application server
- A service that waits for another remote service to be available
In this post, I describe how to speed up stack creation when resources have dependencies on other resources.
Note: I show how to launch Windows instances with Windows PowerShell, but you can apply the same concepts to Linux instances launched with shell scripts.
How CloudFormation Creates Stacks
When CloudFormation provisions two instances that have no dependency between them, it provisions them in parallel and in no guaranteed order. Defining one resource before another in a template doesn’t guarantee that CloudFormation will provision that resource first. You need to explicitly tell CloudFormation the right order for instance provisioning.
To demonstrate how to do this, I’ll start with the following CloudFormation template:
{ "AWSTemplateFormatVersion" : "2010-09-09", "Description": "This is a demonstration AWS CloudFormation template containing two instances", "Parameters": { "ImageId" : { "Description": "Identifier of the base Amazon Machine Image (AMI) for the instances in this sample (please use Microsoft Windows Server 2012 R2 Base)", "Type" : "AWS::EC2::Image::Id" }, "InstanceType" : { "Description": "EC2 instance type to use for the instances in this sample", "Type" : "String" }, }, "Resources" : { "Instance1": { "Type": "AWS::EC2::Instance", "Properties": { "ImageId": { "Ref" : "ImageId" }, "InstanceType": { "Ref": "InstanceType" }, } }, "Instance2": { "Type": "AWS::EC2::Instance", "Properties": { "ImageId": { "Ref" : "ImageId" }, "InstanceType": { "Ref": "InstanceType" }, } } } }
CloudFormation would likely create the stack by provisioning both instances at the same time, so that they boot in parallel.
This is fast, but if Instance2 is dependent on Instance1, you would ordinarily need to hard code or script the provisioning sequence to ensure that Instance1 is provisioned first.
Specifying Dependencies
When you need CloudFormation to wait to provision one resource until another one has been provisioned, you can use the DependsOn attribute.
"Instance2": { "DependsOn": ["Instance1"] "Type": "AWS::EC2::Instance", "Properties": { "ImageId": { "Ref" : "ImageId" }, "InstanceType": { "Ref": "InstanceType" } } }
You can also introduce references between elements by using either the { "Ref": "MyResource" } or the { "Fn::GetAtt": [ "MyResource", "MyAttribute" ] } functions. When you use one of these functions, CloudFormation behaves as if you’ve added a DependsOn attribute to the resource. In the following example, the identifier of Instance1 is used in a tag for Instance2.
"Instance2": { "Type": "AWS::EC2::Instance", "Properties": { "ImageId": { "Ref" : "ImageId" }, "InstanceType": { "Ref": "InstanceType" }, "Tags": [ { "Key" : "Dependency", "Value" : { "Ref": "Instance1" } } ] } }
Both methods of specifying dependencies result in the same sequence:
Now, CloudFormation waits for Instance1 to be provisioned before provisioning Instance2. But I’m not guaranteed that services hosted on Instance1 will be available, so I will have to address that in the template.
Note that instances are provisioned quickly in CloudFormation. In fact, provisioning takes only as long as the call to the Amazon Elastic Compute Cloud (Amazon EC2) RunInstances API. But it takes longer for an instance to fully boot than it does to provision the instance.
Using Creation Policies to Wait for On-Instance Configurations
In addition to provisioning the instances in the right order, I want to ensure that a specific setup milestone has been achieved inside Instance1 before contacting it. To do this, I use a CreationPolicy attribute. A CreationPolicy is an attribute you can add to an instance to prevent it from being marked CREATE_COMPLETE until it has been fully initialized.
In addition to adding the CreationPolicy attribute, I want to ask Instance1 to notify CloudFormation after it’s done initializing. I can do this in the instance’s UserData section. On Windows instances, I can use this section to execute code in batch files or in Windows PowerShell in a process called bootstrapping.
I’ll execute a batch script, then tell CloudFormation that the creation process is done by sending a signal specifying that Instance1 is ready. Here’s the code with a CreationPolicy attribute and a UserData section that includes a script that invokes cfn-signal.exe:
"Instance1": { "Type": "AWS::EC2::Instance", "CreationPolicy" : { "ResourceSignal" : { "Timeout": "PT15M", "Count" : "1" } }, "Properties": { "ImageId": { "Ref" : "ImageId" }, "InstanceType": { "Ref": "InstanceType" }, "UserData": { "Fn::Base64": { "Fn::Join": [ "n", [ "<script>", "REM ...Do any instance configuration steps deemed necessary...", { "Fn::Join": ["", [ "cfn-signal.exe -e 0 --stack "", { "Ref": "AWS::StackName" }, "" --resource "Instance1" --region "", { "Ref" : "AWS::Region" }, """ ] ] }, "</script>" ] ] } } } }
I don’t need to change the definition of Instance2 because it’s already coded to wait for Instance1. I now know that Instance1 will be completely set up before Instance2 is provisioned. The sequence looks like this:
Optimizing the Process with Parallel Provisioning
It takes only a few seconds to provision an instance in CloudFormation, but it can take several minutes for an instance to boot and be ready because it must wait for the complete OS boot sequence, activation and the execution of the UserData scripts. As we saw in the figures, the time it takes to create the complete CloudFormation stack is about twice the boot and initialization time for a resource. Depending on the complexity of our processes, booting can take up to 10 minutes.
I can reduce waiting time by running instance creation in parallel and waiting only when necessary – before the application is configured. I can do this by splitting instance preparation into two steps: booting and initialization. Booting happens in parallel for both instances, but initialization for Instance2 starts only when Instance1 is completely ready.
This is the new sequence:
Because I’m doing some tasks in parallel, it takes much less time for Instance2 to become available.
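A rough estimate of the saving, under simplifying assumptions that are mine rather than measurements: both instances take the same time to boot, $T_b$, and the same time to initialize, $T_i$, and the few seconds of provisioning are ignored.

```latex
\begin{align*}
T_{\text{sequential}} &\approx 2\,(T_b + T_i) \\
T_{\text{parallel}}   &\approx T_b + 2\,T_i \\
T_{\text{sequential}} - T_{\text{parallel}} &\approx T_b
\end{align*}
```

For example, with hypothetical values of $T_b = 6$ minutes and $T_i = 4$ minutes, the sequential stack takes about 20 minutes and the parallel one about 14 minutes. The saving is roughly one boot time for each instance that would otherwise wait in a dependency chain.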
The only problem is that CloudFormation has no built-in construct to enter a dependency in the middle of the booting process of another resource. Let’s devise a solution for this.
Using Wait Conditions
Creation policies also provide a notification mechanism. I can decouple notification for the creation of an instance from the notification that the instance is fully ready by using a wait condition.
"Instance1WaitCondition" : { "Type" : "AWS::CloudFormation::WaitCondition", "DependsOn" : ["Instance1"], "CreationPolicy" : { "ResourceSignal" : { "Timeout": "PT15M", "Count" : "1" } } }
Then I need to ask Instance1 to notify the wait condition after it’s done processing, instead of notifying itself. I’ll use the UserData section of the instance to do this.
"Instance1": { "Type": "AWS::EC2::Instance", "Properties": { "ImageId": { "Ref" : "ImageId" }, "InstanceType": { "Ref": "InstanceType" }, "UserData": { "Fn::Base64": { "Fn::Join": [ "n", [ "<script>", "REM ...Do any instance configuration steps deemed necessary...", { "Fn::Join": ["", [ "cfn-signal.exe -e 0 --stack "", { "Ref": "AWS::StackName" }, "" --resource "Instance1WaitCondition" --region "", { "Ref" : "AWS::Region" }, """ ] ] }, "</script>" ] ] } } } }
Note that CreationPolicy is now defined inside Instance1WaitCondition, and the call to cfn-signal.exe notifies Instance1WaitCondition instead of Instance1.
We now have two resources that signal two different states of Instance1:
- Instance1 is marked as created as soon as it is provisioned.
- Instance1WaitCondition is marked as created only when Instance1 is fully initialized.
Let’s see how we can use this technique to optimize the booting process.
PowerShell to the Rescue
The DependsOn attribute is only available at the top level of resources, but I want to wait for Instance1 after the boot of Instance2. To allow that I need a way to check the status of resources from within the instance’s initialization script so that I can see when resource creation for Instance1WaitCondition is complete. Let’s use Windows PowerShell to provide some automation.
To check resource status from within an instance’s initialization script, I’ll use AWS Tools for Windows PowerShell, a package that is installed by default on every Microsoft Windows Server image provided by Amazon Web Services. The package includes more than 1,100 cmdlets, giving us access to all of the APIs available on the AWS cloud.
The Get-CFNStackResources cmdlet allows me to see whether resource creation for Instance1WaitCondition is complete. This PowerShell script loops until a resource is created:
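# Region and stack identifiers are left blank here; they are filled in from the template later.
$region = ""
$stack = ""
$resource = "Instance1WaitCondition"

# Query the current status of the wait condition.
$output = (Get-CFNStackResources -StackName $stack -LogicalResourceId $resource -Region $region)

# Keep polling until the resource exists and has reached a completed state.
while (($null -eq $output) -or (($output.ResourceStatus -ne "CREATE_COMPLETE") -and ($output.ResourceStatus -ne "UPDATE_COMPLETE"))) {
    Start-Sleep 10
    $output = (Get-CFNStackResources -StackName $stack -LogicalResourceId $resource -Region $region)
}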
Securing Access to the Resources
When calling an AWS API, I need to be authenticated and authorized. I can do this by providing an access key and a secret key to each API call, but there’s a much better way. I can simply create an AWS Identity and Access Management (IAM) role for the instance. When an instance has an IAM role, code that runs on the instance (including our PowerShell code in UserData) is authorized to make calls to the AWS APIs that are granted in the role.
When creating this role in IAM, I specify only the required actions, and limit these actions to only the current CloudFormation stack.
"DescribeRole": { "Type" : "AWS::IAM::Role", "Properties": { "AssumeRolePolicyDocument": { "Version" : "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": [ "ec2.amazonaws.com" ] }, "Action": [ "sts:AssumeRole" ] } ] }, "Path": "/", "Policies": [ { "PolicyName" : "DescribeStack", "PolicyDocument": { "Version" : "2012-10-17", "Statement": [ { "Effect" : "Allow", "Action" : ["cloudformation:DescribeStackResource", "cloudformation:DescribeStackResources"], "Resource" : [ { "Ref" : "AWS::StackId" } ] } ] } } ] } }, "DescribeInstanceProfile": { "Type" : "AWS::IAM::InstanceProfile", "Properties": { "Path" : "/", "Roles": [ { "Ref": "DescribeRole" } ] } }
Creating the Resources
The definitions of Instance1WaitCondition and Instance1 are fine as they are, but I need to update Instance2 to add the IAM role and include the PowerShell wait script. In the UserData section, I add a scripted reference to Instance1WaitCondition. This “soft” reference doesn’t introduce any dependency in CloudFormation because it is just a string. I also add an Fn::GetAtt reference to Instance1 in UserData. This creates a dependency on Instance1 itself, not on the wait condition, so the two instances are provisioned quickly, one after another, without waiting for Instance1 to boot fully. Finally, I secure my API calls by specifying the instance profile for the IAM role we created as the IamInstanceProfile.
"Instance2": { "Type": "AWS::EC2::Instance", "Properties": { "ImageId": { "Ref" : "ImageId" }, "InstanceType": { "Ref": "InstanceType" }, "IamInstanceProfile": { "Ref": "DescribeInstanceProfile" }, "UserData": { "Fn::Base64": { "Fn::Join": [ "n", [ "", "$resource = "Instance1WaitCondition"", { "Fn::Join": ["", [ "$region = '", { "Ref" : "AWS::Region" }, "'" ] ] }, { "Fn::Join": ["", [ "$stack = '", { "Ref" : "AWS::StackId" }, "'" ] ] }, "#...Wait for instance 1 to be fully available...", "$output = (Get-CFNStackResources -StackName $stack -LogicalResourceId $resource -Region $region)", "while (($output -eq $null) -or ($output.ResourceStatus -ne "CREATE_COMPLETE") -and ($output.ResourceStatus -ne "UPDATE_COMPLETE")) {", " Start-Sleep 10", " $output = (Get-CFNStackResources -StackName $stack -LogicalResourceId $resource -Region $region)", "}", "#...Do any instance configuration steps you deem necessary...", { "Fn::Join": ["", [ "$instance1Ip = '", { "Fn::GetAtt" : [ "Instance1" , "PrivateIp" ] }, "'" ] ] }, "#...You can use the private IP address from Instance1 in your configuration scripts...", "" ] ] } } } }
Now, CloudFormation provisions Instance2 just after Instance1, saving a lot of time because Instance2 boots while Instance1 is booting, but Instance2 then waits for Instance1 to be fully operational before finishing its configuration.
During new environment creation, when a stack contains numerous resources, some with cascading dependencies, this technique can save a lot of time. And when you really need to get an environment up and running quickly, for example, when you’re performing disaster recovery, that’s important.
More Optimization Options
If you want a more reliable way to execute multiple scripts on an instance in CloudFormation, check out the cfn-init helper script and the AWS::CloudFormation::Init metadata that it reads, which provide a flexible and powerful way to configure an instance when it’s started. To automate and simplify scripting your instances and reap the benefits of automatic domain joining for instances, see Amazon EC2 Simple Systems Manager (SSM). To operate your Windows instances in a full DevOps environment, consider using AWS OpsWorks.
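As a minimal sketch of the cfn-init approach, the fragment below declares the configuration in an AWS::CloudFormation::Init metadata block and calls cfn-init.exe from UserData to apply it. The file path, file content, and command name are hypothetical placeholders, and the fragment is wrapped in braces only so that it parses on its own.

```json
{
  "Instance1": {
    "Type": "AWS::EC2::Instance",
    "Metadata": {
      "AWS::CloudFormation::Init": {
        "config": {
          "files": {
            "C:\\cfn\\hello.txt": { "content": "Hypothetical file written by cfn-init" }
          },
          "commands": {
            "01-show-file": { "command": "type C:\\cfn\\hello.txt" }
          }
        }
      }
    },
    "Properties": {
      "ImageId": { "Ref": "ImageId" },
      "InstanceType": { "Ref": "InstanceType" },
      "UserData": {
        "Fn::Base64": {
          "Fn::Join": [ "\n", [
            "<script>",
            { "Fn::Join": ["", [
              "cfn-init.exe -v --stack \"", { "Ref": "AWS::StackName" },
              "\" --resource \"Instance1\" --region \"", { "Ref": "AWS::Region" }, "\""
            ] ] },
            "</script>"
          ] ]
        }
      }
    }
  }
}
```

Keeping the configuration in metadata rather than in a long UserData string makes each step declarative and easier to review.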