AWS HPC Blog

How AWS Batch developed support for Amazon Elastic Kubernetes Service

Batch on EKS

Today’s post is a conversation with Jason Rupard, Principal Software Development Engineer on the AWS Batch team, about the recent launch of support for Amazon Elastic Kubernetes Service (EKS). You can read about the specific features of the service in the AWS News Blog post. Here we discuss the initial motivation and design choices when we developed the service, as well as some of the challenges that the team overcame.

Before we dive into Kubernetes, let’s review some of the history. AWS Batch was built on Amazon Elastic Container Service (ECS) and has been running massive (1.5+ million vCPU at the peak!) workloads for our customers for over 5 years. Can you tell me a bit about that design?

AWS Batch, as a managed service, has built an ‘overlay’ intelligence for cloud-native job scheduling and workload-aware compute scaling on top of ECS. At the beginning, the existing services and the workloads they addressed drove many decisions on what was built and why, and the result was that AWS Batch became a layer on top of ECS rather than a feature within ECS. The overlay approach meant that we could be a bit opinionated about what constitutes a Job: a process that has a definitive start and end, and that is scheduled for maximal price/performance. This is in contrast to services that are meant to run indefinitely and be immediately responsive to requests. It also allowed us to add support for new compute engines or platforms as they become available, such as when we added support for Fargate compute environments, and now EKS.

What lessons and design decisions did you bring over from the ECS implementation?

When we first started researching Kubernetes — its model, feature set, semantics, and general view of the world — we saw that Kubernetes really wants scheduling and orchestration controllers (compute scaling) to be built from within the system as native components. If we were going to extend AWS Batch to support EKS, we needed to find a reasonable integration point for both technologies.

Since Batch functionality is already an overlay model, it made the most sense to continue that pattern if Kubernetes could support it as well. We found that we didn’t have to change a lot in Batch, and nothing in Kubernetes, to deliver a feature set very similar to the one that delighted our existing customers on ECS. Internally, Batch’s managed service uses native Kubernetes APIs to accomplish this integration. We also have an internal tenet to “play nicely” with the customer’s EKS cluster. One manifestation of that is that Batch does not touch existing nodes or pods within the cluster, and targets its work only to Batch-launched nodes.

We had customers that developed and deployed batch computing solutions within Kubernetes, and the common challenges we heard about were the increased operational and optimization overheads of running their self-deployed solutions. Many of those challenges could be traced back to the interaction between the scheduling system and the scaling it needs, which, as I mentioned, are separated in those systems.

A lot of these challenges aligned almost one-to-one with the ones we address with Batch for ECS. Our experience with ECS leads us to believe that you cannot separate the intelligence of scaling from scheduling if you want cost-effective performance and scaling in the cloud.

When we designed support for running jobs on EKS, we leveraged the same approach and again designed Batch as an overlay on top of EKS. Batch takes in the job requests, then makes workload-aware scaling and scheduling decisions from an external layer above Kubernetes. Most of the Kubernetes scheduling logic is bypassed; we tell the scheduler where to place the pods that contain the jobs.

Finally, the overlay approach allowed us to bring existing AWS Batch features like job dependencies, array jobs, timeouts, retry strategies, and fair share scheduling to EKS at launch. Along with those features, Batch has iterated for many years on its workload-aware scaling system for jobs, improving utilization and eliminating some difficult bin-packing edge cases while balancing the throughput of job dispatch rates. Our new EKS feature set gets to take advantage of those years of improvement on day one.

How does this affect the user experience?

A user (the person submitting jobs) interacts directly with the AWS Batch API for job submission. Once the job is running, a pod for the Batch job is placed onto Batch-managed Kubernetes nodes in their cluster. From an operator’s standpoint, though, it is much simpler than running their own services for scaling and scheduling. There are no native Kubernetes components to install, manage, or lifecycle. Since Batch is a fully managed service, all they really need to configure are the security credentials that Batch needs to interact with a cluster, and to adjust logging for Batch nodes and pods if they want something different from the other services running on their clusters.

Speaking of other services on the cluster, Batch manages its nodes and pods separately from the other workloads that are part of the cluster. Again, calling back to the “don’t separate scaling from scheduling” tenet, resource usage would be suboptimal if either Batch or the Kubernetes scheduler were able to place arbitrary pods on arbitrary nodes. Batch uses labels and taints on its resources so that they are not shared with other, service-oriented workloads on the cluster. Batch uses this same approach to avoid placing job pods on resources that it doesn’t manage.
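
As a concrete way to see that isolation from the cluster side, here is a small, hedged sketch. It assumes that Batch-managed nodes carry the same batch.amazonaws.com labels that appear on job pods later in this post, and it reuses a node name from the describe-jobs output shown below.

# List nodes that carry a Batch compute environment label (assumes the
# batch.amazonaws.com/compute-environment-uuid label is also applied to nodes).
kubectl get nodes -l batch.amazonaws.com/compute-environment-uuid

# Inspect the taints that keep non-Batch pods off a Batch-managed node;
# substitute a node name returned by the previous command.
kubectl describe node ip-192-168-55-175.ec2.internal | grep -A 3 Taints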

Let’s dig into some detail. What’s a good example of the integration between Batch and Kubernetes?

The smallest unit of work in AWS Batch is a Job, which has a 1:1 mapping to a Kubernetes pod. A Batch Job Definition is a template for a Batch Job and doesn’t relate to any Kubernetes concept. Batch Jobs are submitted by referencing a Job Definition and providing a name for the Job. The eksProperties attribute of a Job Definition defines the set of opinionated attributes a Batch EKS Job supports. The eksPropertiesOverride attribute of a SubmitJob request allows for overrides of some common parameters so that Job Definitions can be reused.
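
To make that concrete, here is a minimal sketch of a job submission. The job queue and job definition names are taken from the describe-jobs output below, the job name is made up, and the shape of the eksPropertiesOverride payload is assumed to mirror the podProperties/containers structure that eksProperties uses.

# Submit a job against an existing EKS job queue and job definition,
# overriding the container command for this run only.
aws batch submit-job \
    --job-name my-eks-sleep-job \
    --job-queue My-Eks-JQ1 \
    --job-definition MyJobOnEks_SleepWithRequestsOnly \
    --eks-properties-override '{
        "podProperties": {
            "containers": [
                { "command": ["sleep", "60"] }
            ]
        }
    }'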

During Job dispatch to your EKS cluster, Batch transforms the Job into a pod spec. Batch internally decorates the pod spec with additional attributes for proper scaling and scheduling functionality. Batch combines labels and taints so that Batch jobs run only on Batch-managed nodes, and so that other pods do not run on those nodes.
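
In plain Kubernetes terms, that decoration relies on the standard nodeSelector and tolerations pattern. The fragment below is a generic sketch of that pattern, not the exact fields Batch injects, and the label and taint keys are hypothetical placeholders.

{
    "spec": {
        "nodeSelector": {
            "example.com/managed-by-batch": "true"
        },
        "tolerations": [
            {
                "key": "example.com/batch-jobs-only",
                "operator": "Exists",
                "effect": "NoSchedule"
            }
        ]
    }
}

The nodeSelector keeps the job pod off of nodes that lack the label, while a matching NoSchedule taint on Batch-managed nodes (tolerated only by Batch job pods) keeps everything else off of them.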

You can map a running job to a pod and node using the Batch APIs. The podProperties of a running job will have podName and nodeName attributes set for the current Job attempt.

aws batch describe-jobs --jobs 2d044787-c663-4ce6-a6fe-f2baf7e51b04
{
    "jobs": [
        {
            "status": "RUNNING",
            "jobArn": "arn:aws:batch:us-east-1:1111111111111:job/2d044787-c663-4ce6-a6fe-f2baf7e51b04",
             ...
            "jobDefinition": "arn:aws:batch:us-east-1:111111111111:job-definition/MyJobOnEks_SleepWithRequestsOnly:1",
            "jobQueue": "arn:aws:batch:us-east-1:111111111111:job-queue/My-Eks-JQ1",
            "jobId": "2d044787-c663-4ce6-a6fe-f2baf7e51b04",
            "eksProperties": {
                "podProperties": {
                    "nodeName": "ip-192-168-55-175.ec2.internal",
                    "containers": [
                        {
                            "image": "public.ecr.aws/amazonlinux/amazonlinux:2",
                            ...
                            "resources": {
                                "requests": {
                                    "cpu": "1",
                                    "memory": "1024Mi"
                                }
                            }
                        }
                    ],
                    "podName": "aws-batch.b0aca953-ba8f-3791-83e2-ed13af39428c"
                }
            },
            ...
        }
    ]
}

You can also map a pod back to its Batch job, since each pod will have labels indicating the job-id and compute-environment-uuid it belongs to. Here is an example of using kubectl to query a specific pod for information. Also note that Batch injects environment variables so the job’s runtime can reference job information if it needs to.

kubectl describe pod aws-batch.14638eb9-d218-372d-ba5c-1c9ab9c7f2a1 -n my-aws-batch-namespace

Name:         aws-batch.14638eb9-d218-372d-ba5c-1c9ab9c7f2a1
Namespace:    my-aws-batch-namespace
Priority:     0
Node:         ip-192-168-45-88.ec2.internal/192.168.45.88
Start Time:   Wed, 19 Oct 2022 00:30:48 +0000
Labels:       batch.amazonaws.com/compute-environment-uuid=5c19160b-d450-31c9-8454-86cf5b30548f
              batch.amazonaws.com/job-id=f980f2cf-6309-4c77-a2b2-d83fbba0e9f0
              batch.amazonaws.com/node-uid=a4be5c1d-9881-4524-b967-587789094647
...
Status:       Running
IP:           192.168.45.88
IPs:
  IP:  192.168.45.88
Containers:
  default:
    Image:         public.ecr.aws/amazonlinux/amazonlinux:2
    ...
    Environment:
      AWS_BATCH_JOB_KUBERNETES_NODE_UID:  a4be5c1d-9881-4524-b967-587789094647
      AWS_BATCH_JOB_ID:                   f980f2cf-6309-4c77-a2b2-d83fbba0e9f0
      AWS_BATCH_JQ_NAME:                  My-Eks-JQ1
      AWS_BATCH_JOB_ATTEMPT:              1
      AWS_BATCH_CE_NAME:                  My-Eks-CE1

...
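
Going the other way, the job-id label makes it easy to find the pod for a given Batch job with a label selector. Here is a short sketch using the namespace and job ID from the output above.

# Find the pod for a specific Batch job using its job-id label.
kubectl get pods -n my-aws-batch-namespace \
    -l batch.amazonaws.com/job-id=f980f2cf-6309-4c77-a2b2-d83fbba0e9f0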

Nice. Where can folks get started?

AWS Batch support for Amazon EKS is generally available. Customers can visit the AWS Management Console to get started, or read the AWS Batch documentation to learn how to run workloads on their EKS clusters with Batch. Finally, you can also work through some example exercises in the self-guided workshop.

Angel Pizarro

Angel is a Principal Developer Advocate for HPC and scientific computing. His background is in bioinformatics application development and building system architectures for scalable computing in genomics and other high-throughput life science domains.

Jason Rupard

Jason is a Principal Engineer on the AWS Batch team. He has been a part of the team for the past 6 years and previously spent 5 years in the EC2 server provisioning and capacity organization. Jason helps AWS teams build and operate scalable and reliable services. He holds a master’s degree in Computer Science.