AWS Architecture Blog
Creating a Multi-Region Application with AWS Services – Part 1, Compute, Networking, and Security
Many AWS services have features to help you build and manage a multi-Region architecture, but identifying those capabilities across 200+ services can be overwhelming.
In this 3-part blog series, we filter through those 200+ services and focus on those that have specific features to assist you in building multi-Region applications. In Part 1, we’ll build a foundation with AWS security, networking, and compute services. In Part 2, we’ll add in data and replication strategies. Finally, in Part 3, we’ll look at the application and management layers. As we go through each part, we’ll build up an example application to show one way of combining these services to create a multi-Region application.
Considerations before getting started
AWS Regions are built with multiple isolated and physically separate Availability Zones (AZs). This approach allows you to create highly available Well-Architected workloads that span AZs to achieve greater fault tolerance. This satisfies the availability goals for most applications, but there are some general reasons that you may be thinking about expanding beyond a single Region:
- Expansion to a global audience: As an application grows and its user base becomes more geographically dispersed, there can be a need to reduce latencies for different parts of the world.
- Reducing Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) as part of a multi-Region disaster recovery (DR) plan.
- Local laws and regulations may have strict data residency and privacy requirements that must be followed.
If you’re building a new multi-Region application, you may want to consider focusing on AWS services that have built-in functionality to assist. Existing applications will need to be further examined to determine the most expandable architecture to support their growth. The following sections review these services, and highlight use cases and best practices.
Identity and access across Regions
Creating a security foundation starts with setting proper authentication and authorization rules. The system handling these requests must be highly resilient to verify and authorize requests quickly and reliably. AWS Identity and Access Management (IAM) accomplishes this by creating a reliable mechanism for you to manage access to AWS services and resources. IAM has multi-Region availability automatically, with no configuration required on your part.
For help managing Windows users, devices, and applications on a multi-Region network, you can set up AWS Directory Service for Microsoft Active Directory Enterprise Edition to automatically replicate directory data across Regions. This reduces directory lookup latencies by using the closest directory and creates durability by spanning multiple Regions. Note that this will also introduce a shared fate across domain controllers for multi-Region topologies, because group policy changes will be propagated to all member servers.
Applications that need to securely store, rotate, and audit secrets, such as database passwords, should use AWS Secrets Manager. This service encrypts secrets with AWS Key Management Service (AWS KMS) keys and can replicate secrets to secondary Regions to ensure applications are able to quickly retrieve a secret in the closest Region.
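As a minimal sketch of the replication setup described above, the following Python builds the parameters for the Secrets Manager ReplicateSecretToRegions operation. The secret name, Regions, and key alias are hypothetical placeholders, and the helper name is ours. The code only constructs the request, so it runs without AWS credentials.

```python
def build_replication_request(secret_id, replica_regions, kms_key_ids=None):
    """Build keyword arguments for Secrets Manager's ReplicateSecretToRegions."""
    kms_key_ids = kms_key_ids or {}
    replicas = []
    for region in replica_regions:
        replica = {"Region": region}
        # If no KmsKeyId is given, Secrets Manager falls back to the
        # aws/secretsmanager managed key in the replica Region.
        if region in kms_key_ids:
            replica["KmsKeyId"] = kms_key_ids[region]
        replicas.append(replica)
    return {
        "SecretId": secret_id,
        "AddReplicaRegions": replicas,
        # Do not overwrite a same-named secret that already exists in a replica Region.
        "ForceOverwriteReplicaSecret": False,
    }


# Hypothetical secret name, Regions, and key alias, for illustration only.
request = build_replication_request(
    "prod/app/db-password",
    ["us-west-2", "eu-west-1"],
    kms_key_ids={"eu-west-1": "alias/app-secrets"},
)

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("secretsmanager", region_name="us-east-1") \
#        .replicate_secret_to_regions(**request)
```

An application in a secondary Region would then read the replica by creating its Secrets Manager client in that Region and requesting the same secret name.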
Encryption across Regions
AWS KMS can be used to encrypt data at rest, and is used extensively for encryption across AWS services. By default, keys are confined to a single Region. AWS services such as Amazon Simple Storage Service (Amazon S3) cross-Region replication and Amazon Aurora Global Database (both covered in Part 2) simplify the process of encryption and decryption with different keys in each Region. For other parts of your multi-Region application that rely on KMS keys, you can set up AWS KMS multi-Region keys to replicate the key material and key ID to a second Region. This eliminates the need to decrypt and re-encrypt data with a different key in each Region. For example, multi-Region keys can be used to reduce the complexity of a multi-Region application’s encryption operations for data that is stored across Regions.
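Because a multi-Region replica key shares the key ID of its primary key, the ARNs of related keys differ only in their Region field. The short sketch below illustrates that relationship with a hypothetical helper and an example account ID and key ID. It does not call AWS. Creating the keys themselves would use the KMS CreateKey (with MultiRegion enabled) and ReplicateKey operations.

```python
def replica_key_arn(primary_key_arn, replica_region):
    """Derive the ARN a multi-Region replica key would have in another Region."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = primary_key_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "kms":
        raise ValueError("expected a KMS key ARN")
    # Multi-Region key IDs carry the 'mrk-' prefix; single-Region keys do not.
    if not parts[5].startswith("key/mrk-"):
        raise ValueError("not a multi-Region key")
    parts[3] = replica_region  # only the Region field changes
    return ":".join(parts)


# Example account ID and key ID, for illustration only.
primary = "arn:aws:kms:us-east-1:111122223333:key/mrk-1234abcd12ab34cd56ef1234567890ab"
replica = replica_key_arn(primary, "eu-west-1")
print(replica)
```

Data encrypted under the primary key in one Region can be decrypted under the related replica key in the other, which is what removes the decrypt and re-encrypt step.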
Auditing and observability across Regions
It is a best practice to configure AWS CloudTrail to keep a record of all relevant AWS API activity in your account for auditing purposes. When you utilize multiple Regions or accounts, these CloudTrail logs should be aggregated into a single Amazon S3 bucket for easier analysis. To prevent misuse, the centralized logs should be treated as highly sensitive, with access limited to key systems and personnel.
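One way to picture the central bucket is through its bucket policy. The sketch below generates a policy that lets the CloudTrail service write logs for several accounts under the standard AWSLogs/ prefix. The bucket name and account IDs are hypothetical, and a production policy would typically add further conditions (for example, restricting which trails may write).

```python
import json


def cloudtrail_bucket_policy(bucket_name, account_ids):
    """Bucket policy allowing CloudTrail to deliver logs for the given accounts."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # CloudTrail checks the bucket ACL before delivering logs.
                "Sid": "AWSCloudTrailAclCheck",
                "Effect": "Allow",
                "Principal": {"Service": "cloudtrail.amazonaws.com"},
                "Action": "s3:GetBucketAcl",
                "Resource": bucket_arn,
            },
            {
                # One log prefix per account; the bucket owner keeps full control.
                "Sid": "AWSCloudTrailWrite",
                "Effect": "Allow",
                "Principal": {"Service": "cloudtrail.amazonaws.com"},
                "Action": "s3:PutObject",
                "Resource": [
                    f"{bucket_arn}/AWSLogs/{account_id}/*" for account_id in account_ids
                ],
                "Condition": {
                    "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
                },
            },
        ],
    }


# Hypothetical bucket name and account IDs.
policy = cloudtrail_bucket_policy("example-central-trail-logs", ["111122223333", "444455556666"])
print(json.dumps(policy, indent=2))
```

Each trail would then be created as a multi-Region trail that points at this bucket, so events from every Region land in one place.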
To stay on top of AWS Security Hub findings, you can aggregate and link findings from multiple locations to a single Region. This is an easy way to create a centralized view of Security Hub findings across accounts and Regions. Once set up, the findings are continuously synced between Regions to keep you updated on global results in a single dashboard.
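As a small illustration, the request for Security Hub cross-Region aggregation can be sketched as follows. The helper name and Regions are assumptions for this example. The request would be issued from the Region chosen as the aggregation Region.

```python
def build_finding_aggregator_request(linked_regions=None):
    """Build parameters for the Security Hub CreateFindingAggregator operation."""
    if linked_regions:
        # Aggregate findings only from the listed Regions.
        return {
            "RegionLinkingMode": "SPECIFIED_REGIONS",
            "Regions": list(linked_regions),
        }
    # Aggregate from every Region where Security Hub is enabled.
    return {"RegionLinkingMode": "ALL_REGIONS"}


all_regions = build_finding_aggregator_request()
selected = build_finding_aggregator_request(["us-west-2", "eu-west-1"])

# With boto3 and credentials, issued in the aggregation Region:
#   boto3.client("securityhub", region_name="us-east-1") \
#        .create_finding_aggregator(**selected)
```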
We put these features together in Figure 1. We used IAM to grant fine-grained access to AWS services and resources, Directory Service for Microsoft AD for authentication to Microsoft applications, and Secrets Manager to store sensitive database credentials. Our data, which moves freely between Regions, is encrypted with KMS multi-Region keys, and all AWS API access is logged with CloudTrail and aggregated to a central S3 bucket that only our security team has access to.
Building a global network
For resources launched into virtual networks in different Regions, Amazon Virtual Private Cloud (Amazon VPC) allows private routing between Regions and accounts with VPC peering. These resources can communicate using private IP addresses and do not require an internet gateway, VPN, or separate network appliances. This feature works well for smaller networks that only require a few peering connections. However, transitive routing is not allowed, and as the number of peered virtual private clouds (VPCs) increases, the mesh of peered connections can become difficult to manage and troubleshoot.
AWS Transit Gateway reduces these difficulties by creating a network transit hub that connects your VPCs and on-premises networks. A Transit Gateway’s routing capabilities can expand to additional Regions with Transit Gateway inter-Region peering to create a globally distributed, private network for your resources.
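To make the scaling difference concrete, the following sketch counts the connections each approach needs. It assumes every VPC must reach every other VPC, one Transit Gateway per Region, and a full mesh of inter-Region peerings between the Transit Gateways. A real design might use fewer links.

```python
def full_mesh_peerings(vpc_count):
    """VPC peering connections needed so every VPC can reach every other VPC."""
    # Peering is not transitive, so each pair of VPCs needs its own connection.
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_links(vpcs_per_region):
    """Attachments plus inter-Region peerings, with one Transit Gateway per Region."""
    region_count = len(vpcs_per_region)
    vpc_attachments = sum(vpcs_per_region.values())      # one attachment per VPC
    tgw_peerings = region_count * (region_count - 1) // 2  # full mesh of gateways
    return vpc_attachments + tgw_peerings


# Hypothetical footprint: 4 VPCs in each of 3 Regions.
footprint = {"us-east-1": 4, "eu-west-1": 4, "ap-southeast-2": 4}
print(full_mesh_peerings(sum(footprint.values())))  # 66 peering connections
print(transit_gateway_links(footprint))             # 12 attachments + 3 peerings = 15
```

The peering count grows quadratically with the number of VPCs, while the Transit Gateway count grows roughly linearly.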
Building a reliable, cost-effective way to route users to distributed Internet applications requires highly available and scalable Domain Name System (DNS) records. Amazon Route 53 does exactly that.
Route 53 includes many routing policies. For example, you can route a request to a record with the lowest network latency, or send users in a specific geolocation to a localized application endpoint. For DR, Route 53 Application Recovery Controller (Route 53 ARC) offers a comprehensive failover solution with minimal dependencies. Route 53 ARC routing policies, safety checks, and readiness checks help you to fail over across Regions, AZs, and on-premises environments reliably.
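A latency routing policy is expressed as several records with the same name, each tagged with a Region and a unique set identifier. The sketch below builds such a change batch for the Route 53 ChangeResourceRecordSets operation. The domain name and IP addresses are placeholders from documentation ranges, and the helper names are ours.

```python
def latency_record_change(record_name, region, ip_address, ttl=60):
    """One UPSERT change for a latency-routed A record in the given Region."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": record_name,
            "Type": "A",
            # SetIdentifier distinguishes records that share a name and type.
            "SetIdentifier": f"{record_name}-{region}",
            # Region tells Route 53 which latency measurements apply to this record.
            "Region": region,
            "TTL": ttl,
            "ResourceRecords": [{"Value": ip_address}],
        },
    }


def latency_change_batch(record_name, endpoints):
    """Change batch with one latency record per Region in `endpoints`."""
    return {
        "Comment": "Latency-based routing across Regions",
        "Changes": [
            latency_record_change(record_name, region, ip)
            for region, ip in sorted(endpoints.items())
        ],
    }


# Placeholder name and documentation-range IP addresses.
batch = latency_change_batch(
    "app.example.com.",
    {"us-east-1": "192.0.2.10", "eu-west-1": "198.51.100.10"},
)

# With boto3 and credentials:
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<your hosted zone ID>", ChangeBatch=batch)
```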
The Amazon CloudFront content delivery network is global, built across 300+ points of presence (PoPs) spread throughout the world. Applications that have multiple possible origins, such as across Regions, can use CloudFront origin failover to automatically fail over to a recovery origin when the primary is not available. CloudFront’s capabilities expand beyond serving content, with the ability to run compute at the edge. CloudFront Functions make it easy to run lightweight JavaScript code, and AWS Lambda@Edge enables you to run Node.js and Python functions closer to users of your application, which improves performance and reduces latency. By placing compute at the edge, you can take load off your origin and provide quicker responses for your global end users.
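Origin failover is configured through an origin group that names a primary and a secondary origin and lists the response status codes that trigger failover. The sketch below builds the OriginGroups fragment of a CloudFront distribution configuration. The origin IDs and the choice of status codes are assumptions for illustration.

```python
def origin_group(group_id, primary_origin_id, secondary_origin_id,
                 failover_status_codes=(500, 502, 503, 504)):
    """An origin group that fails over from the primary to the secondary origin."""
    codes = list(failover_status_codes)
    return {
        "Id": group_id,
        # CloudFront retries against the secondary origin when the primary
        # returns one of these status codes.
        "FailoverCriteria": {
            "StatusCodes": {"Quantity": len(codes), "Items": codes}
        },
        # Member order matters: the first origin is the primary.
        "Members": {
            "Quantity": 2,
            "Items": [
                {"OriginId": primary_origin_id},
                {"OriginId": secondary_origin_id},
            ],
        },
    }


# Hypothetical origin IDs for a static-content bucket in each Region.
origin_groups = {
    "Quantity": 1,
    "Items": [origin_group("static-content", "s3-region-1", "s3-region-2")],
}
```

This fragment would sit under the OriginGroups key of the distribution configuration, with the cache behavior's target origin set to the group ID rather than to an individual origin.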
Built on the AWS global network, AWS Global Accelerator provides two static anycast IPs to give a single entry point for internet-facing applications. You can seamlessly add or remove origins while continuing to automatically route traffic to the closest healthy Regional endpoint. If a failure is detected, Global Accelerator will automatically redirect traffic to a healthy endpoint within seconds, with no changes to the static IP.
In Figure 2, a Route 53 latency-based routing policy routes users to the quickest endpoint, CloudFront serves static content such as videos and images, and Transit Gateway creates a global private network for our devices to talk securely across Regions.
Building and managing the compute layer
Although Amazon Elastic Compute Cloud (Amazon EC2) instances and their associated Amazon Elastic Block Store (Amazon EBS) volumes reside in a single AZ, Amazon Data Lifecycle Manager can automate the process of taking and copying EBS snapshots across Regions. This can enhance DR strategies by providing an easy cold backup-and-restore option for EBS volumes. If you need to back up more than just EBS volumes, AWS Backup provides a central place to do this across multiple services and is covered in Part 2.
An Amazon EC2 instance is based on an Amazon Machine Image (AMI). An AMI specifies instance configurations such as the instance’s storage, launch permissions, and device mappings. When a new standard image needs to be created and released, EC2 Image Builder simplifies the building, testing, and deployment of new AMIs. It can also help with copying of AMIs to additional Regions to eliminate needing to manually copy source AMIs to target Regions.
Microservice-based applications that use containers benefit from quicker start-up times. Amazon Elastic Container Registry (Amazon ECR) can help ensure this happens consistently across Regions with private image replication at the registry level. An ECR private registry can be configured for either cross-Region or cross-account replication to ensure your images are ready in secondary Regions when needed.
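Registry-level replication is defined as a set of rules, each listing destination Regions and, optionally, repository filters. The following sketch builds such a configuration for the ECR PutReplicationConfiguration operation. The account ID, Regions, and repository prefix are hypothetical.

```python
def ecr_replication_configuration(destination_regions, registry_id, repository_prefixes=()):
    """Replication configuration for an ECR private registry."""
    rule = {
        # registryId is the destination account ID; using your own account
        # gives cross-Region replication, another account gives cross-account.
        "destinations": [
            {"region": region, "registryId": registry_id}
            for region in destination_regions
        ]
    }
    if repository_prefixes:
        # Without filters, every repository in the registry is replicated.
        rule["repositoryFilters"] = [
            {"filter": prefix, "filterType": "PREFIX_MATCH"}
            for prefix in repository_prefixes
        ]
    return {"rules": [rule]}


# Hypothetical account ID, Regions, and repository prefix.
config = ecr_replication_configuration(
    ["us-west-2", "eu-west-1"], "111122223333", repository_prefixes=["prod/"]
)

# With boto3 and credentials, in the source Region:
#   boto3.client("ecr", region_name="us-east-1") \
#        .put_replication_configuration(replicationConfiguration=config)
```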
As an architecture expands into multiple Regions, it can become difficult to track where resources are provisioned. Amazon EC2 Global View helps alleviate this by providing a centralized dashboard to see Amazon EC2 resources such as instances, VPCs, subnets, security groups, and volumes in all active Regions.
We bring these compute layer features together in Figure 3 by using EC2 Image Builder to copy our latest golden AMI across Regions for deployment. We also use Data Lifecycle Manager to take snapshots of each EBS volume, retain them for 3 days, and copy them across Regions.
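The snapshot schedule with 3-day retention and cross-Region copy can be sketched as the policy details of a Data Lifecycle Manager policy. The tag, schedule name, snapshot time, daily interval, and target Region below are assumptions for illustration. The source states only the 3-day retention and the cross-Region copy.

```python
def snapshot_policy_details(target_tag, copy_to_regions, retain_days=3):
    """Policy details for an EBS snapshot policy with cross-Region copies."""
    retain_rule = {"Interval": retain_days, "IntervalUnit": "DAYS"}
    return {
        "ResourceTypes": ["VOLUME"],
        # Volumes carrying this tag are included in the policy.
        "TargetTags": [target_tag],
        "Schedules": [
            {
                "Name": "daily-snapshots",  # hypothetical schedule name
                # Assumed cadence: one snapshot every 24 hours at 03:00 UTC.
                "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
                "RetainRule": dict(retain_rule),
                # One copy rule per destination Region, with the same retention.
                "CrossRegionCopyRules": [
                    {"Target": region, "Encrypted": True, "RetainRule": dict(retain_rule)}
                    for region in copy_to_regions
                ],
            }
        ],
    }


# Hypothetical tag and target Region.
details = snapshot_policy_details({"Key": "Backup", "Value": "true"}, ["us-west-2"])

# With boto3 and credentials, these details would be passed as PolicyDetails to
# the DLM CreateLifecyclePolicy operation, together with an execution role ARN,
# a description, and a state.
```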
Bringing it together
At the end of each part of this blog series, we build on a sample application based on the services covered. This shows you how to bring these services together to build a multi-Region application with AWS services. We don’t use every service mentioned, just those that fit the use case.
We built this example to expand to a global audience. It requires high availability across Regions, and favors performance over strict consistency. We have chosen the following services covered in this post to accomplish our goals:
- A Route 53 latency routing policy that routes users to the deployment with the least latency.
- CloudFront is set up to serve our static content. Region 1 is our primary origin, but we’ve configured origin failover to Region 2 in case of a disaster.
- The application relies on several third-party APIs, so Secrets Manager with cross-Region replication has been set up to store sensitive API key information.
- We centralize our CloudTrail logs in Region 1 for easier analysis and auditing.
- Security Hub in Region 1 is where we have chosen to aggregate findings from all Regions.
- This is a containers-based application, and we rely on Amazon ECR replication for each location to quickly pull the latest images locally.
- To communicate using private IPs across Regions, a Transit Gateway is set up in each Region with inter-Region peering between them. VPC peering could have also worked, but we expect to expand to several more Regions in the future and decided this would be the better long-term choice.
- IAM is used to grant access to manage our AWS resources.
While our primary objective is expanding to a global audience, we note that some of the replication that has been set up is one-way, such as for Secrets Manager and Amazon ECR. Each Regional deployment is set up for static stability, but if there were an outage in Region 1 for an extended period of time, our DR playbook would outline how to make each service writable in Region 2.
Summary
It’s important to create a solid foundation when architecting a multi-Region application. These foundations lay the groundwork for you to move fast in a secure, reliable, and elastic way as you build out your application. Many AWS services include native features to help you build a multi-Region architecture. Your architecture will be different depending on the reason for expanding beyond a single Region. In this post, we covered specific features across AWS security, networking, and compute services that have built-in functionality to take away some of the undifferentiated heavy lifting. We’ll cover data, application, and management services in future posts.
Ready to get started? We’ve chosen some AWS Solutions and AWS Blogs to help you!