Build and run resilient, highly available applications in the AWS cloud
Whitepapers
Resilience Lifecycle Framework
This whitepaper shares services, strategies, best practices, and mechanisms you can incorporate into your organizational and developmental processes to drive continuous resilience.
This whitepaper is intended for cloud architects and senior leaders building workloads on AWS who are interested in using a multi-Region architecture to improve resilience for their workloads.
This whitepaper provides guidance on how to instrument workloads to detect impact from gray failures that are isolated to a single Availability Zone, and then take action to mitigate that impact in the Availability Zone.
This whitepaper details how AWS uses its fault isolation boundaries, inclusive of Availability Zones (AZ), Regions, control planes, and data planes, to create zonal, Regional, and global services.
This whitepaper outlines best practices for planning and testing disaster recovery for any workload deployed to AWS, and offers different approaches to mitigate risks and meet the recovery objectives for that workload.
This whitepaper introduces a resilience analysis framework that provides a consistent way to analyze failure modes and how they could impact your workloads.
New to resilience? Read this blog to learn the four most important concepts to get you started on your journey to building resilient applications in the cloud.
Strengthen application resilience with myApplications and AWS Resilience Hub
With AWS Resilience Hub now integrated into myApplications in AWS Console Home, you can manage and enhance your application’s resilience alongside other essential metrics.
Enhance the resilience of critical workloads by architecting with multiple AWS Regions
A multi-Region approach is a reliable way to achieve a bounded recovery time for critical applications in the rare event of a service failure in a Region that is impacting your application.
Learn how performing an Auto Scaling group (ASG) zonal shift fits into a multi-AZ resilience strategy, and what to consider when using the feature with different architectures.
Rapidly recover from application failures in a single AZ
Performing a zonal shift with Amazon Route 53 Application Recovery Controller enables you to achieve rapid recovery from application failures in a single Availability Zone (AZ).
Enhance business continuity within an Availability Zone using AWS Elastic Disaster Recovery
In certain situations, you might need to run your workloads in a single AZ. With AWS Elastic Disaster Recovery, you can continuously replicate data from your primary AZ to a secondary AZ and recover your applications during both planned and unplanned outages.
Series: Disaster recovery (DR) architecture on AWS
This four-part series shares best practices for disaster recovery across four strategies: backup and restore, pilot light, warm standby, and multi-site active/active.
DORA scenario testing with AWS Fault Injection Service
Learn how you can use AWS Fault Injection Service (FIS) to support the DORA requirements for scenario-based testing. The approach is a structured, iterative process: identify failure scenarios, plan and run chaos engineering experiments, report on the results, and use what you learn to improve operational resilience.
Introducing AWS Fault Injection Service Actions to Inject Chaos in Lambda functions
By purposefully injecting failures and stresses into serverless components, you can uncover hidden weaknesses and validate the fault tolerance of your systems.