AWS Compute Blog

Using AWS CloudFormation to Create and Manage AWS Batch Resources

This post courtesy of Arya Hezarkhani.

AWS CloudFormation allows developers and systems administrators to easily create and manage a collection of related AWS resources (called a CloudFormation stack) by provisioning and updating them in an orderly and predictable way. CloudFormation users can now deploy and manage AWS Batch resources in exactly the same way that they are managing the rest of their AWS infrastructure.

This post highlights the native resources supported in CloudFormation and demonstrates how to create AWS Batch compute environments using CloudFormation. All per-region sample CloudFormation templates related to this post can be found on the CloudFormation sample template site. The Ohio (us-east-2) Region is used as the example Region for the remainder of this post.

AWS Batch Resources

AWS Batch is a managed service that helps you efficiently run batch computing workloads on the AWS Cloud. Users submit jobs to job queues, specifying the application to be run and their jobs’ CPU and memory requirements. AWS Batch is responsible for launching the appropriate quantity and types of instances needed to run your jobs.

AWS Batch removes the undifferentiated heavy lifting of configuring and managing compute infrastructure, allowing you to instead focus on your applications and users. This is demonstrated in the How AWS Batch Works video.

AWS Batch manages the following resources:

  • Job definitions
  • Job queues
  • Compute environments

A job definition specifies how jobs are to be run—for example, which Docker image to use for your job, how many vCPUs and how much memory is required, the IAM role to be used, and more.

Jobs are submitted to job queues where they reside until they can be scheduled to run on Amazon EC2 instances within a compute environment. An AWS account can have multiple job queues, each with varying priority. This gives you the ability to closely align the consumption of compute resources with your organizational requirements.

Compute environments provision and manage the EC2 instances and other compute resources that are used to run your AWS Batch jobs. A job queue is mapped to one or more compute environments, and a given compute environment can also be mapped to one or more job queues. This many-to-many relationship is defined by the compute environment order and job queue priority properties.
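As a rough illustration of that mapping outside of CloudFormation, the following sketch wraps the aws batch create-job-queue command in a small helper. The helper name create_queue and the queue and environment names in the usage note are hypothetical. The --priority and --compute-environment-order options correspond to the two properties described above.

```shell
# Hypothetical helper: create a job queue mapped to two compute environments.
# $1 = queue name
# $2 = priority (queues with a higher integer are evaluated first)
# $3 = compute environment tried first, $4 = compute environment tried second
create_queue() (
  aws --region us-east-2 batch create-job-queue \
    --job-queue-name "$1" \
    --state ENABLED \
    --priority "$2" \
    --compute-environment-order \
      "order=1,computeEnvironment=$3" \
      "order=2,computeEnvironment=$4"
)
```

For example, create_queue example-queue 10 example-on-demand-env example-spot-env would create a queue that prefers the first environment and falls back to the second.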

The following diagram shows a general overview of how the AWS Batch resources interact.

CloudFormation stack creation and updates

Upon the creation of your stack, an AWS Batch job definition is registered using your CloudFormation template. If a job definition with the same name has already been registered, a new revision is created. On stack updates, any changes to your job definition specifications in the CloudFormation template result in a new revision of that job definition and a deregistration of the previous job definition revision. Stack deletions only result in the deregistration of your job definition, as AWS Batch does not delete job definitions.
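One way to observe this revision behavior is to list the revisions of a job definition before and after a stack update. This is a minimal sketch. The helper name list_revisions is hypothetical, and the job definition name you pass in is whatever name your stack generated.

```shell
# Hypothetical helper: print every revision of a job definition with its
# status (ACTIVE, or INACTIVE once deregistered).
# $1 = job definition name
list_revisions() (
  aws --region us-east-2 batch describe-job-definitions \
    --job-definition-name "$1" \
    --query 'jobDefinitions[].[revision,status]' \
    --output text
)
```

After a stack update that changes the job definition, you would expect the newest revision to be ACTIVE and the previous one to be INACTIVE.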

On stack creation, a job queue is created using the template. Any changes to your job queue properties within the stack result in a call to the UpdateJobQueue API action. Similarly, stack deletions result in the deletion of the job queue from AWS Batch.

CloudFormation creates an AWS Batch compute environment using the properties specified in your template. Stack updates result in updates to your compute environment where possible. If you need to change a parameter that is not supported by the UpdateComputeEnvironment API action, stack updates result in the deletion and re-creation of your compute environment. Upon stack deletion, your compute environment is disabled, and then deleted.

Follow all naming conventions specified by CloudFormation, especially in the case of resource replacement, or you run the risk of failed stack changes. For example, all AWS Batch resource property names must be capitalized, and resource names must be changed in the case of resource replacement, as is the case in any CloudFormation stack.

If you do not provide values for ComputeEnvironmentName, JobQueueName, or JobDefinitionName in your template, a pseudo-random name is generated for you using the logical ID that you gave the resource in CloudFormation.

Launching a “Hello World” example stack

Here’s a familiar “Hello World” example of a CloudFormation stack with AWS Batch resources.

This example registers a simple job definition, a job queue that can accept job submissions, and a compute environment that contains the compute resources used to execute your job. The stack template also creates additional AWS resources that are required by AWS Batch:

  • An IAM service role that gives AWS Batch permissions to take the required actions on your behalf
  • An IAM ECS instance role
  • A VPC
  • A VPC subnet (though I’ve provided a general template, I suggest that this be a private subnet)
  • A security group

You can deploy this stack from the CloudFormation console, but I also provide CLI commands that complete the stack creation for you. Use the Launch stack button or run the following command:

aws --region us-east-2 cloudformation create-stack --stack-name hello-world-batch-stack --template-url https://s3-us-east-2.amazonaws.com/cloudformation-templates-us-east-2/Managed_EC2_Batch_Environment.template --capabilities CAPABILITY_IAM

You can monitor the creation of the resources in your CloudFormation stack in the CloudFormation console, on the Events tab.

Confirm the successful creation of your stack by observing a CREATE_COMPLETE status. At this point, you should also be able to view the new resource ARNs on the Outputs tab.
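If you prefer to stay in the terminal, the same check can be sketched with the CLI. The helper name show_stack_outputs is hypothetical. The wait and describe-stacks subcommands are standard CloudFormation CLI commands.

```shell
# Hypothetical helper: block until the stack reaches CREATE_COMPLETE,
# then print its outputs as a table. The waiter exits non-zero if stack
# creation fails or times out, in which case the outputs are not printed.
# $1 = stack name
show_stack_outputs() (
  aws --region us-east-2 cloudformation wait stack-create-complete \
    --stack-name "$1" &&
  aws --region us-east-2 cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' \
    --output table
)
```

For the stack in this example, you would call show_stack_outputs hello-world-batch-stack.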

After your stack is successfully created, everything that you need to submit a “hello-world” job is in place.

Make sure to use the correct job definition name and revision number. Because pseudo-random names are generated for your AWS Batch resources, look up the exact Amazon Resource Name (ARN) on the Outputs tab of the CloudFormation stack. If you already have a hello-world job definition, make sure that you run the command with the job definition revision created by your new CloudFormation stack, as listed in the stack outputs.

Run the following command to submit a job:

aws --region us-east-2 batch submit-job --job-name hello-world-batch-job --job-queue job-queue-from-cfn-outputs --job-definition job-definition-from-cfn-outputs:1
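Rather than copying the values by hand, you could read them from the stack outputs. The following is a sketch only: the helper names are hypothetical, and the output keys JobQueueArn and JobDefinitionArn are assumptions, so check the Outputs tab for the keys that your template actually defines.

```shell
# Hypothetical helper: read one output value from a stack.
# $1 = stack name, $2 = output key
stack_output() (
  aws --region us-east-2 cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
)

# Submit the hello-world job using ARNs taken from the stack outputs.
# JobQueueArn and JobDefinitionArn are ASSUMED output keys.
submit_hello_world() (
  queue=$(stack_output hello-world-batch-stack JobQueueArn) || exit 1
  definition=$(stack_output hello-world-batch-stack JobDefinitionArn) || exit 1
  aws --region us-east-2 batch submit-job \
    --job-name hello-world-batch-job \
    --job-queue "$queue" \
    --job-definition "$definition"
)
```

A job definition ARN already includes the revision number, so no :1 suffix is needed when you pass the ARN.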

You can monitor the successful execution of the job in the AWS Batch console under Jobs.
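The job status can also be polled from the CLI. This is a minimal sketch: the helper name wait_for_job and the POLL_SECONDS variable are hypothetical, and the job ID is the jobId value returned by the submit-job command.

```shell
# Hypothetical helper: poll a job until it reaches a terminal state.
# $1 = job ID returned by submit-job
# POLL_SECONDS (optional, assumed name) = delay between polls, default 15
wait_for_job() (
  while :; do
    status=$(aws --region us-east-2 batch describe-jobs \
      --jobs "$1" --query 'jobs[0].status' --output text) || exit 1
    echo "status: $status"
    case "$status" in
      SUCCEEDED) exit 0 ;;
      FAILED) exit 1 ;;
      # The CLI prints None in text mode when no matching job is returned.
      None) echo "job not found: $1" >&2; exit 1 ;;
    esac
    sleep "${POLL_SECONDS:-15}"
  done
)
```

The helper returns a zero exit status only when the job reaches SUCCEEDED, so it can be chained with other commands in a script.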

When you are done using this stack and want to delete the resources, run the following command. CloudFormation deregisters the job definition, and deletes the job queue, compute environment, and the rest of the resources in the stack template.

aws --region us-east-2 cloudformation delete-stack --stack-name hello-world-batch-stack
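The delete-stack command returns as soon as the deletion starts. If a script needs to know when the resources are actually gone, a sketch such as the following could be used. The helper name delete_stack_and_wait is hypothetical.

```shell
# Hypothetical helper: start stack deletion, then block until it completes.
# The waiter exits non-zero if the stack ends up in DELETE_FAILED.
# $1 = stack name
delete_stack_and_wait() (
  aws --region us-east-2 cloudformation delete-stack --stack-name "$1" &&
  aws --region us-east-2 cloudformation wait stack-delete-complete \
    --stack-name "$1"
)
```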

Now that you know the basics of AWS Batch resources, here’s a more complex example.

High- and low-priority job queues with On-Demand and Spot compute environments

This CloudFormation stack creates two job queues with different priorities and two compute environments. You have one On-Demand compute environment and one Spot compute environment with a maximum Spot price of 40% of the On-Demand price.

The first job queue has the higher priority and feeds jobs to both compute environments, while the lower-priority job queue only submits jobs for execution to the Spot compute environment.
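To make the queue layout concrete, the following sketch prints a simplified YAML template fragment for the two job queues. It is not the sample template itself: the logical IDs and the priority values are hypothetical, and only the Priority and ComputeEnvironmentOrder properties of AWS::Batch::JobQueue are shown.

```shell
# Hypothetical helper: print a simplified template fragment for the two
# job queues. Logical IDs and priority values are illustrative only.
queue_fragment() {
  cat <<'EOF'
# The Spot compute environment itself would set ComputeResources.Type to
# SPOT and ComputeResources.BidPercentage to 40.
HighPriorityJobQueue:
  Type: AWS::Batch::JobQueue
  Properties:
    Priority: 2
    ComputeEnvironmentOrder:
      - Order: 1
        ComputeEnvironment: !Ref OnDemandComputeEnvironment
      - Order: 2
        ComputeEnvironment: !Ref SpotComputeEnvironment
LowPriorityJobQueue:
  Type: AWS::Batch::JobQueue
  Properties:
    Priority: 1
    ComputeEnvironmentOrder:
      - Order: 1
        ComputeEnvironment: !Ref SpotComputeEnvironment
EOF
}
```

The higher Priority value means that the first queue is evaluated before the second when both compete for the Spot compute environment.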

The stack contains two job definitions, one high-priority job queue, and one low-priority job queue. Each job submitted using a given job definition is submitted to a job queue. For example, jobs submitted with an important-production-application job definition are submitted to the high-priority job queue, while jobs submitted with a test-application job definition are submitted to the low-priority job queue.

This example registers both job definitions and creates your compute environments and job queues. It also creates the VPC, subnet, security group, IAM service role for AWS Batch, ECS instance role, and an IAM Spot Fleet role. Use the Launch stack button or run the following command:

aws --region us-east-2 cloudformation create-stack --stack-name high-low-priority-batch-stack --template-url https://s3-us-east-2.amazonaws.com/cloudformation-templates-us-east-2/Managed_EC2_and_Spot_Batch_Environment.template --capabilities CAPABILITY_IAM

Monitor the creation of the resources in your CloudFormation stack in the CloudFormation console on the Events tab.

Again, find the job definition ARN in the outputs of the CloudFormation stack. I provide a generic name in the commands below.

After the stack creation is complete, run the following commands to submit jobs to each job queue:

aws --region us-east-2 batch submit-job --job-name high-priority-batch-job --job-queue HighPriorityJobQueue-from-cfn-outputs --job-definition ProdApplicationJob-from-cfn-outputs:1
aws --region us-east-2 batch submit-job --job-name low-priority-batch-job --job-queue LowPriorityJobQueue-from-cfn-outputs --job-definition TestApplicationJob-from-cfn-outputs:1

As with any CloudFormation stack, you can update resources for your application’s specific needs. AWS CloudFormation Designer is a graphic tool for creating, viewing, and modifying CloudFormation templates.

Any change to a resource property that requires replacement results in the creation of a new resource to reflect the change, and the deletion of the obsolete resource. Changes to immutable compute environment or job queue properties result in replacement. Changes to updateable properties update the existing resource. Any changes to job definitions (beyond the name) result in the registration of a new revision of the existing job definition, followed by the deregistration of the previous revision.
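Before applying an update, you can preview which resources would be replaced by creating a change set. The following is a sketch under stated assumptions: the helper name preview_replacements is hypothetical, and the template URL and change set name are values that you supply.

```shell
# Hypothetical helper: create a change set from an updated template and
# list, per resource, the action and whether it requires replacement.
# $1 = stack name, $2 = URL of the updated template, $3 = change set name
preview_replacements() (
  aws --region us-east-2 cloudformation create-change-set \
    --stack-name "$1" --change-set-name "$3" \
    --template-url "$2" --capabilities CAPABILITY_IAM >/dev/null &&
  aws --region us-east-2 cloudformation wait change-set-create-complete \
    --stack-name "$1" --change-set-name "$3" &&
  aws --region us-east-2 cloudformation describe-change-set \
    --stack-name "$1" --change-set-name "$3" \
    --query 'Changes[].ResourceChange.[LogicalResourceId,Action,Replacement]' \
    --output table
)
```

Creating a change set does not modify the stack. Note that the waiter fails if the updated template contains no changes, because the change set then ends in a FAILED state.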

Finally, run the following command to delete the CloudFormation stack containing your AWS Batch resources:

aws --region us-east-2 cloudformation delete-stack --stack-name high-low-priority-batch-stack

Conclusion

In this post, I detailed the steps to create, update (with and without replacement), and delete your AWS Batch resources using CloudFormation templates, as part of CloudFormation stacks that also contain other AWS service resources. For more information, see the following topics:

  • AWS::Batch::ComputeEnvironment
  • AWS::Batch::JobQueue
  • AWS::Batch::JobDefinition

If you have any questions or suggestions, please comment below.