Amazon EKS now supports Amazon Application Recovery Controller

Introduction

Amazon Elastic Kubernetes Service (Amazon EKS) now supports Amazon Application Recovery Controller (ARC). ARC is an AWS service that allows you to prepare for and recover from AWS Region or Availability Zone (AZ) impairments.

ARC provides two sets of capabilities: multi-AZ recovery, which includes zonal shift and zonal autoshift, and multi-Region recovery, which includes routing control and readiness check. With this launch, we’ve extended zonal shift and zonal autoshift, which were previously available only for Application Load Balancers (ALBs) and Network Load Balancers (NLBs), to Amazon EKS.

The ARC zonal shift and zonal autoshift capabilities achieve multi-AZ recovery for a supported AWS resource by shifting the ingress traffic away from an impaired AZ to other healthy AZs. When the shift ends, ARC adds back the previously impacted AZ so that it can again receive ingress traffic.

You can enable zonal shift for your EKS clusters using the Amazon EKS console, AWS Command Line Interface (AWS CLI), AWS CloudFormation, or eksctl. Once zonal shift is enabled, you can use the ARC console, AWS CLI, or the Zonal Shift and Zonal Autoshift APIs to start a zonal shift or enable zonal autoshift for your EKS cluster.

To trigger a zonal shift on an EKS cluster, you must first select an AZ, then select the EKS cluster (version 1.28 or above) and specify an expiration time for the zonal shift to be in effect. Then, ARC initiates a zonal shift, which shifts the traffic away from the selected AZ. ARC ends the zonal shift when it expires or if you cancel it. When the zonal shift ends, traffic returns to all the healthy AZs attached to the EKS cluster.

When you enable zonal autoshift for an EKS cluster, you allow AWS to shift the traffic away on your behalf whenever ARC detects that an AZ is unhealthy. ARC uses internal telemetry to monitor critical health metrics from various sources, such as the AWS network, Amazon Elastic Compute Cloud (Amazon EC2), and Elastic Load Balancing (ELB) services. ARC ends the zonal autoshift when the telemetry indicates that the impacted AZ is healthy again. This returns traffic to all the healthy AZs attached to the EKS cluster.

Why should you use ARC zonal shift and zonal autoshift?

The AWS Global Cloud Infrastructure provides fault tolerance and resilience, with each AWS Region made up of multiple, fully isolated AZs. Leveraging this multi-AZ architecture is essential for implementing highly available applications in a Region. Amazon EKS allows you to quickly develop highly available applications by deploying them across multiple AZs, but addressing AZ impairments in a scalable, performant, and reliable manner requires custom solutions that take significant effort to build and maintain.

Another challenge is testing for AZ impairment scenarios, which are often difficult to simulate. Insufficient testing can lead to unpredictable workload behavior when an AZ in your environment is unhealthy.

Using ARC zonal shift or zonal autoshift, you can temporarily isolate your cluster worker nodes and pods that are running in an impaired AZ and automatically shift in-cluster network traffic away from them to improve your workload’s fault tolerance and availability.

Furthermore, by using zonal shift and zonal autoshift capabilities, you can reduce your team’s operational overhead involved in planning and responding to AZ impairments.

How it works

When you’ve registered your EKS cluster as an ARC resource, you can use ARC to trigger a zonal shift for the cluster or alternatively enable zonal autoshift for the cluster. When ARC performs a zonal shift, the cluster undergoes the following changes:

The nodes in the impacted AZ are cordoned to prevent the Kubernetes Scheduler from scheduling new pods onto the nodes in the unhealthy AZ. If you’re using managed node groups (MNG), AZ rebalancing is suspended, and your Auto Scaling Group (ASG) is updated to make sure that new Amazon EKS data plane nodes are only launched in the healthy AZs. Karpenter and Kubernetes Cluster Autoscaler don’t natively support ARC zonal shift and zonal autoshift. You must reconfigure your auto scaling tool to only provision new nodes in the healthy AZs. Refer to the Amazon EKS Best Practices Guide on how to configure Karpenter and Cluster Autoscaler to only use certain AZs for launching new nodes.

Nodes in the unhealthy AZ aren’t terminated. Therefore, pods aren’t evicted in the impacted AZ. This ensures that traffic can safely return to the AZ at full capacity when a zonal shift expires or is canceled.
The EndpointSlice controller finds the pod endpoints in the impaired AZ and removes them from the relevant EndpointSlice resources. This guarantees that network traffic only targets pod endpoints in healthy AZs. When a zonal shift is canceled or expires, the EndpointSlice controller updates the EndpointSlice resources to include the endpoints in the restored AZ.
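One way to observe the EndpointSlice behavior described above is to list a Service's endpoints together with the zone recorded for each one. The following is a minimal sketch: the helper name `endpoints_by_zone` is hypothetical, and it assumes the Service lives in the current namespace and that the `zone` field is populated on each endpoint.

```shell
# Print "<zone><TAB><first address>" for every endpoint of a Service,
# sorted by zone. $1 is the Service name (assumed to be in the current namespace).
endpoints_by_zone() {
  local svc="$1"
  kubectl get endpointslices \
    -l "kubernetes.io/service-name=${svc}" \
    -o jsonpath='{range .items[*].endpoints[*]}{.zone}{"\t"}{.addresses[0]}{"\n"}{end}' \
    | sort
}

# Example (not executed here):
#   endpoints_by_zone catalog
```

While a zonal shift is in effect, the shifted AZ should not appear in this listing. After the shift ends, its endpoints should reappear.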

The following diagram depicts an Amazon EKS environment with east-to-west traffic flow when an AZ is unhealthy. In such a scenario, there may be network packet drops or network latency.

The following diagram depicts an Amazon EKS environment when you shift traffic away from the impaired AZ.

Preparing your EKS cluster and workloads for zonal shift and zonal autoshift

To make sure that zonal shift and autoshift work successfully in Amazon EKS, you must prepare your cluster environment to be resilient to AZ impairment beforehand. The following is a list of important steps you have to implement for your EKS cluster. These steps are further detailed in the Amazon EKS documentation.

Distribute the worker nodes in your cluster across multiple AZs.
Provision sufficient compute capacity to withstand the removal of a single AZ. Refer to the AWS documentation on static stability for more information on how to build applications that can withstand AZ impairments.
In every AZ, pre-scale your pods. These pods include your application pods and controller pods such as CoreDNS, Cluster Autoscaler, and AWS Load Balancer Controller. Refer to the Amazon EKS documentation for more information on how this could be achieved.
Spread pod replicas across the AZs to make sure that shifting away from a single AZ leaves you with sufficient capacity. Topology spread constraints can help you achieve this.
If a leader election is necessary to support high availability (HA) for the controllers or any other application running in your clusters, then make sure that the criteria, such as having an odd number of pods or at least two pods, are consistently met during and after the AZ impairment event.
When using load balancers to route external traffic to your Kubernetes services, we recommend using only ALB and NLB. We also recommend using the AWS Load Balancer Controller to manage the load balancers. The AWS Load Balancer Controller supports instance and IP traffic modes, of which IP mode is recommended. Refer to the AWS Load Balancer Controller documentation for more information on instance and IP modes.

The preceding steps are essential for your applications and cluster environment to recover successfully with a zonal shift in Amazon EKS. In addition, we recommend the following best practices to effectively manage AZ impairments.

Limit pod-to-pod communications to be within the same AZ by using Kubernetes features such as Topology Aware Routing or integrating with a service mesh.
In the same AZ, co-locate interdependent applications and services. You can achieve this with pod affinity rules.
Implement Multi-AZ observability.
For your applications, make sure that you properly configure timeout values for external dependencies, such as databases and services, and implement retries. To gracefully handle failures, implement a circuit breaker with an exponential backoff pattern.
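To illustrate the retry recommendation in the last item, here is a minimal sketch of retrying a command with exponential backoff. The function name `retry_with_backoff` and its argument order are assumptions made for this example. It covers only the retry and backoff part. A circuit breaker would need additional state and is not shown.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay between consecutive attempts.
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example (not executed here): up to 5 attempts, starting with a 1-second delay,
# with a 2-second timeout on each request.
#   retry_with_backoff 5 1 curl -fsS --max-time 2 http://catalog.default/catalogue
```

Pairing each attempt with a short timeout keeps a slow dependency in an impaired AZ from stalling the caller for long.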

Refer to the Amazon EKS documentation for more information on how to prepare your cluster and workloads to support zonal shift and zonal autoshift, and other best practices for zonal shift and zonal autoshift.

Furthermore, we strongly recommend regularly testing and validating that your workloads can handle AZ impairments. You can test for AZ impairment by manually triggering a zonal shift or by enabling zonal autoshift and verifying that your workloads are functioning as expected with one less AZ in your cluster environment.

Getting started

We built an example application to walk through the ARC zonal shift capabilities. You can either use an existing EKS cluster or create a new cluster for this walkthrough. The cluster and the node groups configured in it should span three AZs and have a minimum of one node in each AZ.
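Before you start, you may want to confirm that the cluster meets this prerequisite. The following sketch counts nodes per AZ using the standard `topology.kubernetes.io/zone` node label. The helper name `nodes_per_zone` is hypothetical.

```shell
# Print "<zone> <node count>" for each AZ that has at least one node.
nodes_per_zone() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.labels.topology\.kubernetes\.io/zone}{"\n"}{end}' \
    | sort | uniq -c | awk '{print $2, $1}'
}

# Example (not executed here):
#   nodes_per_zone
```

For this walkthrough, the output should list three AZs, each with a count of at least one.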

1. Deploy the example application

a. Start by deploying the example application on the EKS cluster. Make sure that you provide a valid username and password when creating the Kubernetes secret. This secret is used by both the MySQL database and the application to connect to it.

kubectl create secret generic catalog-db --from-literal=username=<<REPLACE_WITH_VALID_USERNAME>> --from-literal=password=<<REPLACE_WITH_VALID_PASSWORD>>
cat << EOF > catalog_deploy.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: catalog
data:
  DB_ENDPOINT: catalog-mysql-0.catalog-mysql:3306
  DB_READ_ENDPOINT: catalog-mysql-0.catalog-mysql:3306
  DB_NAME: catalog
---
apiVersion: v1
kind: Service
metadata:
  name: catalog-mysql
  labels:
    helm.sh/chart: catalog-0.0.1
    app.kubernetes.io/name: catalog
    app.kubernetes.io/instance: catalog
    app.kubernetes.io/component: mysql
spec:
  clusterIP: None
  ports:
    - port: 3306
      targetPort: mysql
      protocol: TCP
      name: mysql
  selector:
    app.kubernetes.io/name: catalog
    app.kubernetes.io/instance: catalog
    app.kubernetes.io/component: mysql
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: catalog
  labels:
    helm.sh/chart: catalog-0.0.1
    app.kubernetes.io/name: catalog
    app.kubernetes.io/instance: catalog
    app.kubernetes.io/component: service
    app.kuberneres.io/owner: retail-store-sample
    app.kubernetes.io/managed-by: Helm
---
apiVersion: v1
kind: Service
metadata:
  name: catalog
  labels:
    helm.sh/chart: catalog-0.0.1
    app.kubernetes.io/name: catalog
    app.kubernetes.io/instance: catalog
    app.kubernetes.io/component: service
    app.kuberneres.io/owner: retail-store-sample
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app.kubernetes.io/name: catalog
    app.kubernetes.io/instance: catalog
    app.kubernetes.io/component: service
    app.kuberneres.io/owner: retail-store-sample
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalog
  labels:
    helm.sh/chart: catalog-0.0.1
    app.kubernetes.io/name: catalog
    app.kubernetes.io/instance: catalog
    app.kubernetes.io/component: service
    app.kuberneres.io/owner: retail-store-sample
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
  selector:
    matchLabels:
      app.kubernetes.io/name: catalog
      app.kubernetes.io/instance: catalog
      app.kubernetes.io/component: service
      app.kuberneres.io/owner: retail-store-sample
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8080"
        prometheus.io/scrape: "true"
      labels:
        app.kubernetes.io/name: catalog
        app.kubernetes.io/instance: catalog
        app.kubernetes.io/component: service
        app.kuberneres.io/owner: retail-store-sample
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: catalog
      serviceAccountName: catalog
      securityContext:
        fsGroup: 1000
      containers:
        - name: catalog
          env:
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: catalog-db
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: catalog-db
                  key: password
          envFrom:
            - configMapRef:
                name: catalog
          securityContext:
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          image: public.ecr.aws/aws-containers/retail-store-sample-catalog:0.8.1
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 3
          resources:
            limits:
              memory: 256Mi
            requests:
              cpu: 128m
              memory: 256Mi
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
      volumes:
        - name: tmp-volume
          emptyDir:
            medium: Memory
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: catalog-mysql
  labels:
    helm.sh/chart: catalog-0.0.1
    app.kubernetes.io/name: catalog
    app.kubernetes.io/instance: catalog
    app.kubernetes.io/component: mysql
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 3
  serviceName: catalog-mysql
  selector:
    matchLabels:
      app.kubernetes.io/name: catalog
      app.kubernetes.io/instance: catalog
      app.kubernetes.io/component: mysql
  template:
    metadata:
      labels:
        app.kubernetes.io/name: catalog
        app.kubernetes.io/instance: catalog
        app.kubernetes.io/component: mysql
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: catalog
              app.kubernetes.io/component: mysql
      containers:
        - name: mysql
          image: public.ecr.aws/docker/library/mysql:8.0
          imagePullPolicy: IfNotPresent
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: catalog-db
                  key: password
            - name: MYSQL_DATABASE
              value: catalog
            - name: MYSQL_USER
              valueFrom:
                secretKeyRef:
                  name: catalog-db
                  key: username
            - name: MYSQL_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: catalog-db
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
          ports:
            - name: mysql
              containerPort: 3306
              protocol: TCP
      volumes:
        - name: data
          emptyDir: {}
---
EOF
kubectl apply -f catalog_deploy.yaml

b. Applying the Kubernetes manifest file to the cluster creates two applications named “catalog” and “catalog-mysql,” respectively, with “catalog-mysql” being a MySQL database. Verify that the pods are in a running state before proceeding to the next step (this may take a few minutes).
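Rather than polling manually, you could block until both workloads finish rolling out. This is a small sketch using `kubectl rollout status`. The helper name `wait_for_catalog` and the default timeout of 300 seconds are assumptions.

```shell
# Wait for the catalog Deployment and the catalog-mysql StatefulSet to roll out.
# $1 is an optional timeout (default: 300s).
wait_for_catalog() {
  local timeout="${1:-300s}"
  kubectl rollout status deployment/catalog --timeout="$timeout" &&
    kubectl rollout status statefulset/catalog-mysql --timeout="$timeout"
}

# Example (not executed here):
#   wait_for_catalog 300s
```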

2. Enable zonal shift for your cluster

a. Open the Amazon EKS console, select the cluster and go to the Zonal shift section under Overview, as shown in the following figure.

b. Choose ‘Manage’, then choose ‘Enabled’ and save changes.
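If you prefer the AWS CLI over the console, the same setting can be changed with `aws eks update-cluster-config`. The sketch below reflects our understanding of the `--zonal-shift-config` option and the `zonalShiftConfig` field in the `describe-cluster` response, so verify both against the CLI reference for your installed version. The helper names are hypothetical.

```shell
# Enable zonal shift on a cluster. $1 is the cluster name.
enable_zonal_shift() {
  local cluster="$1"
  aws eks update-cluster-config --name "$cluster" --zonal-shift-config enabled=true
}

# Print whether zonal shift is currently enabled on the cluster.
zonal_shift_enabled() {
  local cluster="$1"
  aws eks describe-cluster --name "$cluster" \
    --query 'cluster.zonalShiftConfig.enabled' --output text
}

# Example (not executed here; "my-cluster" is a placeholder):
#   enable_zonal_shift my-cluster
#   zonal_shift_enabled my-cluster
```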

3. Validate the application

a. List the available services in the default namespace.

kubectl get svc 
NAME            TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
catalog         ClusterIP      XX.XXX.XXX.XXX   <none>        80/TCP         41m
catalog-mysql   ClusterIP      None             <none>        3306/TCP       41m

b. Run kubectl port-forward to the catalog service in background mode. Note the process ID of the process.

kubectl port-forward svc/catalog 8090:80 > /dev/null &
[1] 42306

c. Using curl, invoke the catalog service. The service should return a few item IDs, as shown in the following output.

curl -s localhost:8090/catalogue | jq -r '.[0,1].id'
510a0d7e-8e83-4193-b483-e27e09ddc34d
6d62d909-f957-430e-8689-b5129c0bb75e
# Stop the port-forward process (42306 in this example)
kill -9 <process ID of the kubectl port-forward process>

4. Understand the cluster topology

Now that you have validated that the application is working, you are ready to perform a zonal shift. However, before you trigger a zonal shift, you must understand the cluster’s topology, which involves identifying the AZs where the pods are running.

a. List the Region’s AZ IDs. In this case, the Region is us-west-2, so the output lists the AZ IDs for us-west-2.

aws ec2 describe-availability-zones --query 'AvailabilityZones[*].[ZoneName, ZoneId]' --output text
us-west-2a      usw2-az2
us-west-2b      usw2-az1
us-west-2c      usw2-az3
us-west-2d      usw2-az4

b. Identify each node in the cluster and the AZ in which it is running by using the following commands. The output lists the node names and the AZs where the nodes are running. In this case, you have three nodes distributed across three AZs: us-west-2a, us-west-2b, and us-west-2c.

kubectl get nodes -o=jsonpath='{range .items[*]}"{.metadata.name}"{"\t"}"{.metadata.labels.topology\.kubernetes\.io/zone}"{"\n"}{end}' | sort -k 1 > nodes-info.txt
cat nodes-info.txt
"ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal"      "us-west-2a"
"ip-YYY-YYY-YYY-YYY.us-west-2.compute.internal"      "us-west-2b"
"ip-ZZZ-ZZZ-ZZZ-ZZZ.us-west-2.compute.internal"     "us-west-2c"

c. You must use the following commands to identify the nodes and AZs where each pod is currently running. Entering the commands should produce an output listing the pod name, the AZ, and the node where the pods are running. In this case, you have the catalog application pods, which are distributed across three nodes across three AZs.

kubectl get pods -l "app.kubernetes.io/component"=service -o=jsonpath='{range .items[*]}"{.metadata.name}"{"\t"}"{.spec.nodeName}"{"\n"}{end}' | sort -k 2 > pods-info.txt
join -1 1 -2 2  nodes-info.txt pods-info.txt
"ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal" "us-west-2a" "catalog-74957c74ff-xxxxx"
"ip-YYY-YYY-YYY-YYY.us-west-2.compute.internal" "us-west-2b" "catalog-74957c74ff-yyyyy"
"ip-ZZZ-ZZZ-ZZZ-ZZZ.us-west-2.compute.internal" "us-west-2c" "catalog-74957c74ff-zzzzz"

5. Trigger a zonal shift

You should now have a good understanding of the cluster topology. Next, you can trigger a zonal shift to move traffic away from an AZ to test the zonal shift capability.

a. Open the ARC console and choose Zonal shift, as shown in the following figure.

b. Initiate a zonal shift by selecting the AZ (us-west-2b) to shift traffic away from, the EKS cluster to perform the zonal shift for, an expiry time period (10 minutes), and then choose Start, as shown in the following figures.

A zonal shift takes a few minutes to complete after you trigger it, so wait a few minutes before testing.
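The same zonal shift can also be started from the AWS CLI. The following sketch looks up the cluster ARN and the AZ ID (for example, us-west-2b maps to usw2-az1 in the listing from step 4) and passes them to `aws arc-zonal-shift start-zonal-shift`. The helper names, the default expiry of 10m, and the comment text are assumptions, so check the option names against the CLI reference for your version.

```shell
# Resolve an AZ name (for example, us-west-2b) to its AZ ID (for example, usw2-az1).
az_id_for() {
  aws ec2 describe-availability-zones --zone-names "$1" \
    --query 'AvailabilityZones[0].ZoneId' --output text
}

# Start a zonal shift for an EKS cluster.
# $1: cluster name, $2: AZ name to shift away from, $3: expiry (default: 10m).
start_cluster_zonal_shift() {
  local cluster="$1" zone_name="$2" expires_in="${3:-10m}"
  local arn az_id
  arn=$(aws eks describe-cluster --name "$cluster" \
    --query 'cluster.arn' --output text) || return 1
  az_id=$(az_id_for "$zone_name") || return 1
  aws arc-zonal-shift start-zonal-shift \
    --resource-identifier "$arn" \
    --away-from "$az_id" \
    --expires-in "$expires_in" \
    --comment "zonal shift test for ${cluster}"
}

# Example (not executed here; "my-cluster" is a placeholder):
#   start_cluster_zonal_shift my-cluster us-west-2b 10m
```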

c. Validate the application by generating traffic to the application endpoint and verifying that no calls reach the pods running in the AZ from which traffic was shifted away. To do this, run a Kubernetes job that generates traffic to the application, and then use the logs to identify which pods handled the traffic and the AZs to which they belong. After entering the following commands, you should observe that traffic to the catalog service is distributed across only two pods.

kubectl create job curl-job --image=curlimages/curl -- /bin/sh -c "while true; do curl -s catalog.default/catalogue; sleep 1; done"
start_time=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
#wait for 20-30 seconds
kubectl logs -l "app.kubernetes.io/component"=service --prefix --since-time=$start_time --tail=50 | grep -i "/catalogue" | cut -d '/' -f 2 | sort | uniq -c > pod-logs.txt
cat pod-logs.txt
5 catalog-74957c74ff-xxxxx
6 catalog-74957c74ff-zzzzz

d. If you check the locations of the pods, then you can observe that none of them are operating in AZ us-west-2b, the AZ from which the zonal shift diverted traffic.

join -1 1 -2 2  nodes-info.txt pods-info.txt | tr -d \" | sort -k 3  > pods-nodes-az.txt
join -1 3 -2 2  pods-nodes-az.txt pod-logs.txt
catalog-74957c74ff-xxxxx ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal us-west-2a 5
catalog-74957c74ff-zzzzz ip-ZZZ-ZZZ-ZZZ-ZZZ.us-west-2.compute.internal us-west-2c 6

e. Before proceeding, delete the Kubernetes job that you created for generating traffic.

kubectl delete job curl-job

6. Cancel a zonal shift

a. Cancel the zonal shift by choosing the zonal shift that you previously created and then choosing Cancel zonal shift, as shown in the following figures.

Canceling a zonal shift takes a few minutes to complete after you trigger it, so wait a few minutes before testing.
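A zonal shift can also be canceled from the AWS CLI once you know its ID. The sketch below lists the IDs of active zonal shifts and cancels one of them. The helper names are hypothetical, and the `--status` filter and the `items[].zonalShiftId` query path reflect our understanding of the `list-zonal-shifts` output, so verify them against the CLI reference.

```shell
# Print the IDs of all zonal shifts that are currently active.
active_shift_ids() {
  aws arc-zonal-shift list-zonal-shifts --status ACTIVE \
    --query 'items[].zonalShiftId' --output text
}

# Cancel a zonal shift. $1 is the zonal shift ID.
cancel_zonal_shift() {
  aws arc-zonal-shift cancel-zonal-shift --zonal-shift-id "$1"
}

# Example (not executed here):
#   active_shift_ids
#   cancel_zonal_shift <zonal shift ID from the previous command>
```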

b. Generate traffic to the application again and confirm that the pods in all three AZs, including us-west-2b, receive traffic.

kubectl create job curl-job --image=curlimages/curl -- /bin/sh -c "while true; do curl -s catalog.default/catalogue; sleep 1; done"
start_time=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
#wait for 20-30 seconds
kubectl logs -l "app.kubernetes.io/component"=service --prefix --since-time=$start_time --tail=50 | grep -i "/catalogue" | cut -d '/' -f 2 | sort | uniq -c > pod-logs.txt
cat pod-logs.txt
9 catalog-74957c74ff-xxxxx
7 catalog-74957c74ff-yyyyy
5 catalog-74957c74ff-zzzzz
join -1 1 -2 2  nodes-info.txt pods-info.txt | tr -d \" | sort -k 3  > pods-nodes-az.txt
join -1 3 -2 2  pods-nodes-az.txt pod-logs.txt
catalog-74957c74ff-xxxxx ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal us-west-2a 9
catalog-74957c74ff-yyyyy ip-YYY-YYY-YYY-YYY.us-west-2.compute.internal us-west-2b 7
catalog-74957c74ff-zzzzz ip-ZZZ-ZZZ-ZZZ-ZZZ.us-west-2.compute.internal us-west-2c 5
#delete the kubernetes job
kubectl delete job curl-job

7. Configure zonal autoshift for the cluster

a. Before you can configure zonal autoshift, you need to set up an Amazon CloudWatch alarm that ARC uses to verify whether the practice runs have completed successfully. For more information on ARC zonal autoshift practice runs, refer to the ARC documentation.

b. Open the ARC console and choose Configure zonal autoshift, as shown in the following figure.

c. Choose the EKS cluster as the resource to configure zonal autoshift, choose Enable for zonal autoshift status, provide a CloudWatch alarm ARN, and choose Create. Leave the optional sections in the console as is, as shown in the following figures.
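For reference, a CLI equivalent of this console step might look like the following sketch. It creates a practice run configuration with the CloudWatch alarm as the outcome alarm and then enables zonal autoshift. The helper name is hypothetical, and the option names and the `type=CLOUDWATCH,alarmIdentifier=...` shorthand reflect our understanding of the `arc-zonal-shift` commands, so confirm them against the CLI reference before use.

```shell
# Configure practice runs and enable zonal autoshift for an EKS cluster.
# $1: cluster ARN, $2: ARN of the CloudWatch alarm used as the outcome alarm.
enable_cluster_autoshift() {
  local cluster_arn="$1" alarm_arn="$2"
  aws arc-zonal-shift create-practice-run-configuration \
    --resource-identifier "$cluster_arn" \
    --outcome-alarms "type=CLOUDWATCH,alarmIdentifier=${alarm_arn}" &&
    aws arc-zonal-shift update-zonal-autoshift-configuration \
      --resource-identifier "$cluster_arn" \
      --zonal-autoshift-status ENABLED
}

# Example (not executed here; both ARNs are placeholders):
#   enable_cluster_autoshift <cluster ARN> <CloudWatch alarm ARN>
```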

d. As part of practice runs, ARC performs a zonal autoshift once per week. You can integrate with Amazon EventBridge to receive notifications for zonal autoshifts and practice runs. During a practice run, you can apply the same validation steps that you used for the zonal shift.

Zonal shift and zonal autoshift enable you to quickly recover from AZ impairment and increase the reliability of your Amazon EKS workloads. To be truly resilient against AZ impairment, your workloads must not only use the zonal shift and zonal autoshift capabilities but also follow the practices outlined in the Preparing your EKS cluster and workloads for zonal shift and zonal autoshift section.

Cleaning Up

To avoid future costs, delete all resources, such as the EKS cluster, that were created for this exercise. The following commands delete the application and the secret that you created earlier to test zonal shift, and remove the temporary files.

kubectl delete -f catalog_deploy.yaml
kubectl delete secret catalog-db
rm nodes-info.txt pods-info.txt pod-logs.txt pods-nodes-az.txt

Pricing and availability

Amazon EKS support for ARC zonal shift and zonal autoshift is available in all AWS Regions, excluding the China and GovCloud Regions. Enabling zonal shift in your EKS clusters and triggering zonal shifts don’t incur additional costs. However, you may incur additional costs to make sure that your workloads can handle AZ failures, for example by pre-scaling your pods and cluster nodes.

Conclusion

In this post, we demonstrated how you can use the ARC zonal shift and zonal autoshift capabilities to recover from a single AZ impairment. With careful planning and implementation, you can harness the full potential of zonal shift and zonal autoshift to protect your applications and data sources running in Amazon EKS clusters with an impacted AZ.

Visit the Amazon EKS documentation to learn more about zonal shift and zonal autoshift in Amazon EKS.

By leaving a comment or opening an issue on the GitHub-hosted AWS Containers Roadmap, you can provide feedback on the ARC zonal shift capability for EKS clusters. Stay tuned as we continue to evolve our features and explore more methods to help our users improve their cluster resilience and availability.
