Jahith's Tech Sharing: AWS Disaster Recovery

Disaster recovery planning and business continuity planning are essential for any organization that needs to recover quickly from a disaster. Disaster recovery for on-premises environments requires a lot of effort and planning because it depends on third-party services such as transportation and network connectivity, and on various staff to set up networking, systems, and so on. Disaster recovery in the cloud requires far less effort, but we still need careful planning to build in the redundancy that lets us recover quickly.

The resilience of the AWS cloud environment is a shared responsibility between AWS and the customer. AWS infrastructure is spread across different AWS regions. Each region is a fully isolated geographical area, and within each region there are multiple isolated availability zones to handle failure. All AWS regions and availability zones are interconnected with high-bandwidth networking. When we use AWS as our cloud provider, we have various options to manage the high availability of the system.

Within region high availability

Each region is a separate geographic area, and each availability zone is a group of one or more data centres within an AWS region. Each availability zone has its own isolated power, cooling, networking, etc. AWS provides built-in options for dealing with an availability zone outage, but we have to configure our environment with multi-AZ redundancy so that if an entire availability zone goes down, workloads can fail over to another availability zone. Keeping the high availability architecture within a single region also helps with compliance, because the data stays in the permitted region.
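To make the multi-AZ idea concrete, here is a minimal sketch in plain Python (no AWS calls). It spreads instances across zones round-robin and checks whether losing any single zone still leaves enough capacity. The zone names, instance counts, and function names are assumptions made up for illustration.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instance_count, zones):
    """Assign instances to availability zones round-robin."""
    zone_cycle = cycle(zones)
    return Counter(next(zone_cycle) for _ in range(instance_count))


def survives_zone_loss(placement, required):
    """True if losing any single zone still leaves `required` instances."""
    total = sum(placement.values())
    return all(total - in_zone >= required for in_zone in placement.values())


# Hypothetical zone names and capacity figures, for illustration only.
zones = ["zone-a", "zone-b", "zone-c"]
placement = spread_across_zones(6, zones)
print(dict(placement))
print(survives_zone_loss(placement, required=4))
```

The point of the check is that redundancy has to be sized for the failure case: if the workload needs four instances, running exactly four across the zones is not enough.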

Cross region high availability

A multi-region disaster recovery strategy addresses the rare scenario of an entire AWS region being down due to a natural disaster or technical issue. Highly critical applications should plan for cross region replication. When we plan this approach, we need to check the availability of each AWS service we depend on, although most AWS services are committed to high availability. Cross region high availability can be achieved in different ways, and we need to choose the strategy that fits our budget and compliance needs. The four common strategies are:

Backup and restore
Pilot light
Warm Standby
Multi-site active/active

Backup and restore

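This approach protects us against data loss, but it has a high RPO (recovery point objective, the maximum acceptable window of lost data) and a high RTO (recovery time objective, the maximum acceptable time to restore service). The RPO is determined by how frequently we schedule data backups. Because no environment is standing by, we first have to build one and then restore the backed-up data into it, which takes time, so the RTO is also very high.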
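The relationship between the backup schedule and these two numbers can be sketched with some simple arithmetic. The durations below are illustrative figures, not measurements, and the function names are made up for this example.

```python
from datetime import timedelta


def worst_case_rpo(backup_interval):
    """A failure just before the next backup loses one full interval of data."""
    return backup_interval


def estimated_rto(provision_time, restore_time, validation_time):
    """Recovery steps run one after another, so their durations add up."""
    return provision_time + restore_time + validation_time


# Illustrative figures only.
rpo = worst_case_rpo(timedelta(hours=24))
rto = estimated_rto(
    provision_time=timedelta(hours=2),
    restore_time=timedelta(hours=3),
    validation_time=timedelta(hours=1),
)
print(f"Worst-case RPO: {rpo}")
print(f"Estimated RTO: {rto}")
```

Halving the backup interval halves the worst-case data loss, while the RTO only improves if provisioning or restoring gets faster.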

Pilot light

This approach replicates the data to another region and also sets up the core infrastructure there. The servers are kept switched off and are started only when needed for testing or recovery. This reduces the RTO because the environment already exists, and the RPO depends on the replication schedule. The approach is cost effective in terms of recovery, but database corruption or a malware attack will be replicated too, so recovering from those still requires a backup.

Warm Standby

This method is similar to pilot light, but a scaled-down version of the environment is always running. Disaster recovery testing can be carried out at any time, which improves confidence that recovery will be quick when it is really needed. The RTO improves slightly compared to pilot light, and the RPO is based on the replication schedule.

Multi-site active/active

In this approach, the sites in both regions are active and running, and requests are distributed across the regions by default. If one region goes down, the other region automatically picks up its requests. This approach is the most costly. RPO and RTO are reduced to near zero, but a backup is still required to recover from data corruption or a malware attack.
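The routing behaviour can be sketched as a small simulation. This is not how AWS implements it (in practice a DNS or global load balancing service, for example Amazon Route 53 with health checks, plays this role); the class and region names here are hypothetical.

```python
from itertools import cycle


class MultiRegionRouter:
    """Round-robin over regions, skipping those marked unhealthy."""

    def __init__(self, regions):
        self.health = {region: True for region in regions}
        self._order = cycle(regions)

    def mark(self, region, healthy):
        """Record the result of a health check for one region."""
        self.health[region] = healthy

    def route(self):
        """Return the next healthy region for an incoming request."""
        # Try each region at most once per request.
        for _ in range(len(self.health)):
            region = next(self._order)
            if self.health[region]:
                return region
        raise RuntimeError("no healthy region available")


router = MultiRegionRouter(["region-1", "region-2"])
print([router.route() for _ in range(4)])  # traffic alternates between regions

router.mark("region-1", healthy=False)     # simulate a regional outage
print([router.route() for _ in range(3)])  # all traffic goes to region-2
```

Note that the surviving region must be sized to carry the full load on its own, which is a large part of why this strategy is the most expensive.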

These strategies improve our chances of staying available in a disaster scenario. Each strategy addresses a subset of disasters but not all of them, and the achievable RPO and RTO depend on the kind of disaster.
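One way to think about choosing among the four strategies is to pick the cheapest one that still meets the RTO and RPO targets. The sketch below does exactly that. The hour figures and cost ranks are placeholders chosen only to preserve the ordering described above; they are assumptions, not AWS numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    rto_hours: float    # assumed typical recovery time
    rpo_hours: float    # assumed typical data-loss window
    relative_cost: int  # 1 = cheapest, 4 = most expensive


# Placeholder values for illustration only.
STRATEGIES = [
    Strategy("Backup and restore", 24.0, 24.0, 1),
    Strategy("Pilot light", 4.0, 1.0, 2),
    Strategy("Warm standby", 1.0, 1.0, 3),
    Strategy("Multi-site active/active", 0.05, 0.05, 4),
]


def cheapest_strategy(max_rto_hours, max_rpo_hours):
    """Return the lowest-cost strategy meeting both targets, or None."""
    candidates = [
        s for s in STRATEGIES
        if s.rto_hours <= max_rto_hours and s.rpo_hours <= max_rpo_hours
    ]
    return min(candidates, key=lambda s: s.relative_cost, default=None)


choice = cheapest_strategy(max_rto_hours=5, max_rpo_hours=2)
print(choice.name if choice else "no strategy meets the targets")
```

Real decisions also weigh compliance constraints and the kinds of disaster to be covered, which a simple table like this does not capture.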

Cross region and cross account high availability

For security or compliance reasons, many organizations require complete separation of environments and access between their primary and secondary regions. This helps mitigate insider threats and malware attacks on the primary account. Routinely copying our backups or primary database to a secondary account makes it possible to recover even if the primary account is compromised.

AWS Backup can be used to back up data across accounts. It is a fully managed service for centrally and automatically managing backups. Using this service, we can configure backup policies and monitor the backup activity of our AWS resources in one place.
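As a rough sketch of what such a policy could look like, the snippet below builds a backup plan with one daily rule that also copies each recovery point to a vault in another account. The plan name, vault names, account ID, region, schedule, and retention period are all placeholders. The `create_plan` function assumes boto3 is installed and credentials are configured, so it is defined but not called here. A real setup would additionally need a backup selection to assign resources to the plan, and cross-account copy generally requires both accounts to be in the same AWS Organization with a vault access policy on the destination.

```python
def build_backup_plan(plan_name, source_vault, destination_vault_arn):
    """Return a backup plan document with one daily rule and a cross-account copy."""
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": "daily-with-cross-account-copy",
                "TargetBackupVaultName": source_vault,
                "ScheduleExpression": "cron(0 5 ? * * *)",  # daily at 05:00 UTC
                "Lifecycle": {"DeleteAfterDays": 35},
                "CopyActions": [
                    {
                        # Vault in the secondary account (placeholder ARN).
                        "DestinationBackupVaultArn": destination_vault_arn,
                        "Lifecycle": {"DeleteAfterDays": 35},
                    }
                ],
            }
        ],
    }


def create_plan(plan):
    """Submit the plan to AWS Backup; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    return boto3.client("backup").create_backup_plan(BackupPlan=plan)


plan = build_backup_plan(
    plan_name="dr-daily",          # hypothetical name
    source_vault="primary-vault",  # hypothetical vault in the primary account
    destination_vault_arn=(
        "arn:aws:backup:us-west-2:111122223333:backup-vault:dr-vault"  # placeholder
    ),
)
print(plan["Rules"][0]["CopyActions"][0]["DestinationBackupVaultArn"])
```

Keeping the plan as a plain dictionary makes it easy to review and version before anything is sent to AWS.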

Wednesday, December 28, 2022