Profile Applicability:
Level 1
Description:
AWS Elastic Disaster Recovery (DRS) is a managed service that continuously replicates source servers into AWS and launches recovery instances from those replicas, supporting business continuity through quick recovery of applications, databases, and workloads. Once DRS is enabled and source servers are replicating, you can initiate recovery jobs in the event of a failure, or drill jobs to rehearse recovery without affecting production. Recovery jobs automate the launch of replacement instances, minimizing downtime and restoring applications quickly. This SOP ensures that DRS is not only enabled but also that recovery jobs are set up to automate disaster recovery.
Rationale:
Enabling DRS with recovery jobs provides:
Business Continuity: Recovers workloads through automated jobs in case of failure, minimizing downtime and avoiding manual rebuild steps.
Automated Recovery: Recovery jobs can be started from the console, the CLI, or your own automation during a disaster, speeding up recovery.
Compliance: DRS is often a requirement for disaster recovery and business continuity plans (BCPs), particularly in regulated industries.
Efficiency: Automating the recovery process reduces human errors and ensures that recovery processes are executed consistently.
Impact:
Pros:
Faster Recovery: Automated recovery reduces downtime and speeds up the recovery process during a disaster.
Reduced Manual Intervention: Recovery jobs automate the launch of recovery instances, so recovery proceeds with minimal human intervention.
Improved Compliance: Helps ensure that disaster recovery processes are in place and meet regulatory compliance requirements.
Business Continuity: Ensures critical applications are available with minimal disruption during failures or disasters.
Cons:
Initial Configuration Effort: Setting up disaster recovery plans and defining recovery jobs requires some initial configuration effort.
Costs: There may be additional costs associated with running DRS jobs, including data replication and storage costs.
Default Value:
By default, DRS is not enabled. It must be explicitly initialized in each AWS Region where it is used, and recovery jobs must be created to automate disaster recovery tasks.
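The default state can be checked programmatically. The sketch below is a minimal example, not an official procedure. It assumes a boto3 `drs` client and assumes that an account where DRS has not been initialized answers `DescribeSourceServers` with the error code `UninitializedAccountException`. The helper names `make_client` and `is_drs_initialized` are hypothetical.

```python
def make_client(region_name=None):
    """Create a boto3 DRS client (assumes boto3 is installed and credentials are configured)."""
    import boto3  # imported lazily so the helper below can be exercised without boto3
    return boto3.client("drs", region_name=region_name)


def is_drs_initialized(client):
    """Return False if the service reports the account as uninitialized, True otherwise."""
    try:
        # Any cheap read-only call works; one result is enough for this check.
        client.describe_source_servers(maxResults=1)
    except Exception as exc:
        # botocore's ClientError carries the service error code in exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "UninitializedAccountException":  # assumed error code
            return False
        raise  # anything else (throttling, access denied) is not a verdict
    return True
```

With credentials in place, `is_drs_initialized(make_client("us-east-1"))` would report the state for that Region; the Region name here is only an example.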
Pre-requisites:
IAM Permissions:
drs:DescribeSourceServers
drs:DescribeJobs
drs:DescribeJobLogItems
drs:StartRecovery
drs:DeleteJob
Disaster Recovery Setup: Ensure that the disaster recovery service has been configured for your account and is operational.
Backup and Replication: Set up replication (for example, by installing the AWS Replication Agent) for the resources to be included in the disaster recovery plan, and confirm that they are replicating.
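The IAM permissions above can be expressed as a policy document. The following is a sketch only: it assumes the action names exposed by the Elastic Disaster Recovery API (`StartRecovery`, `DescribeJobs`, `DescribeJobLogItems`, `DescribeSourceServers`, `DeleteJob`). The statement ID and the wildcard resource are illustrative choices, and launching recovery instances typically needs further EC2-side permissions that are not shown here.

```python
import json

# Assumed set of DRS actions for listing, starting, and deleting jobs.
DRS_JOB_ACTIONS = [
    "drs:DescribeSourceServers",
    "drs:DescribeJobs",
    "drs:DescribeJobLogItems",
    "drs:StartRecovery",
    "drs:DeleteJob",
]


def build_drs_job_policy(actions=DRS_JOB_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given DRS actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DrsRecoveryJobOperator",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": "*",  # assumption: narrow this to specific resources in real use
            }
        ],
    }


def policy_json(actions=DRS_JOB_ACTIONS):
    """Serialize the policy so it can be pasted into the IAM console or a template."""
    return json.dumps(build_drs_job_policy(actions), indent=2)
```

Calling `print(policy_json())` produces JSON that can be reviewed before it is attached to a role.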
Remediation:
Test plan:
Using AWS Console:
Sign in to the AWS Management Console.
Navigate to AWS Elastic Disaster Recovery (DRS) under Services.
In the DRS console, select Recovery job history.
Review the list to check whether any recovery or drill jobs have already been run.
If no jobs are listed, go to Source servers, select the servers to test, and choose Initiate recovery job, then Initiate drill.
Choose the point in time to recover from (the most recent data or an earlier snapshot) for the selected source servers (e.g., EC2 instances, database servers).
Confirm the launch. DRS jobs run when initiated rather than on a schedule, so plan drills at the frequency your recovery policy requires.
Return to Recovery job history and confirm that the job completes.
Using AWS CLI:
To list current disaster recovery jobs, run:
aws drs describe-jobs
To start a drill job for a source server (a drill launches recovery instances without affecting production), run:
aws drs start-recovery --source-servers sourceServerID=<source-server-id> --is-drill
To verify the job's status, run:
aws drs describe-jobs --filters jobIDs=<job-id>
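The CLI check above can also be scripted. The sketch below assumes a boto3 `drs` client whose `describe_jobs` call returns `items` and an optional `nextToken`, and that each job records how it was started in `initiatedBy` (`START_DRILL` or `START_RECOVERY`). The pass criterion used here (at least one drill or recovery job exists) is an assumption for illustration, not a rule stated by this SOP. The helper names are hypothetical.

```python
def make_client(region_name=None):
    """Create a boto3 DRS client (assumes boto3 is installed and credentials are configured)."""
    import boto3  # lazy import keeps the pure helpers usable without boto3
    return boto3.client("drs", region_name=region_name)


def list_jobs(client):
    """Collect every job from DescribeJobs, following nextToken pagination."""
    jobs, token = [], None
    while True:
        kwargs = {"nextToken": token} if token else {}
        page = client.describe_jobs(**kwargs)
        jobs.extend(page.get("items", []))
        token = page.get("nextToken")
        if not token:
            return jobs


def summarize_jobs(jobs):
    """Count drill and recovery jobs and flag whether at least one of either exists."""
    drills = sum(1 for j in jobs if j.get("initiatedBy") == "START_DRILL")
    recoveries = sum(1 for j in jobs if j.get("initiatedBy") == "START_RECOVERY")
    return {
        "total": len(jobs),
        "drills": drills,
        "recoveries": recoveries,
        # Assumed pass criterion for this check.
        "has_recovery_or_drill": (drills + recoveries) > 0,
    }
```

With credentials configured, `summarize_jobs(list_jobs(make_client()))` gives a quick view of whether any drill or recovery has been run in the Region.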
Implementation Steps:
Using AWS Console:
Sign in to the AWS Management Console and navigate to Elastic Disaster Recovery (DRS).
- Go to the Recovery Dashboard.
  Here you’ll see the list of your source servers that are protected by DRS.
- Select the source server(s) you want to recover.
  You can choose one or more servers depending on your recovery plan.
- Choose the job type:
  Initiate Recovery Job (for a real disaster event)
  Initiate Drill Job (for testing / practice recovery without impacting production)
- Select recovery options:
  Use most recent recovery point or choose a Point-in-Time snapshot.
  Verify or adjust launch settings (instance type, VPC, subnet, security groups, etc.).
- Confirm and start the job.
  DRS will launch the recovery instances in your target AWS environment.
- Monitor the job.
  Go to the Job History tab to track status and progress.
  Verify that recovered instances are running as expected.
Using AWS CLI:
To start a recovery job for a source server, run (add --is-drill to run a drill instead):
aws drs start-recovery --source-servers sourceServerID=<source-server-id>
To verify the status of the job, use:
aws drs describe-jobs --filters jobIDs=<job-id>
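Starting a job and then checking its status can be combined into one script. This is a minimal sketch under stated assumptions: a boto3 `drs` client, the `start_recovery` call with `isDrill` and `sourceServers`, and a job `status` that ends at `COMPLETED`. The function names, polling interval, and poll limit are hypothetical choices. A completed job can still contain servers that failed to launch, so the per-server `launchStatus` is checked separately.

```python
import time


def start_drill(client, source_server_ids):
    """Start a drill job for the given source servers and return the new job ID."""
    response = client.start_recovery(
        isDrill=True,  # set to False for a real recovery
        sourceServers=[{"sourceServerID": sid} for sid in source_server_ids],
    )
    return response["job"]["jobID"]


def wait_for_job(client, job_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll DescribeJobs until the job reports COMPLETED, or raise TimeoutError."""
    for _ in range(max_polls):
        items = client.describe_jobs(filters={"jobIDs": [job_id]}).get("items", [])
        if items and items[0].get("status") == "COMPLETED":
            return items[0]
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not complete after {max_polls} polls")


def failed_servers(job):
    """Return the source server IDs whose launch is marked FAILED in the job record."""
    return [
        server["sourceServerID"]
        for server in job.get("participatingServers", [])
        if server.get("launchStatus") == "FAILED"
    ]
```

A typical run would call `start_drill`, pass the returned ID to `wait_for_job`, and treat a non-empty `failed_servers` result as a failed drill.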
Backout Plan:
Using AWS Console:
If a recovery or drill job causes issues (e.g., resource contention, incorrect launch settings), navigate to Recovery job history and select the job.
Terminate the recovery instances the job launched if they are no longer needed, then delete the job record.
Using AWS CLI:
To delete a recovery job, run:
aws drs delete-job --job-id <job-id>
Verify that the job is deleted by running:
aws drs describe-jobs
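The delete-and-verify steps above can be sketched as a single function. It assumes a boto3 `drs` client with `delete_job(jobID=...)` and a paginated `describe_jobs`. The function names are hypothetical. Verification lists all remaining jobs rather than filtering by ID, so the result does not depend on how the service answers a filter for a job that no longer exists.

```python
def list_job_ids(client):
    """Return the IDs of all jobs visible through DescribeJobs, following pagination."""
    job_ids, token = [], None
    while True:
        kwargs = {"nextToken": token} if token else {}
        page = client.describe_jobs(**kwargs)
        job_ids.extend(job["jobID"] for job in page.get("items", []))
        token = page.get("nextToken")
        if not token:
            return job_ids


def delete_job_and_verify(client, job_id):
    """Delete one job record, then confirm that it no longer appears in the job list."""
    client.delete_job(jobID=job_id)
    return job_id not in list_job_ids(client)
```

Deleting a job record is separate from terminating the recovery instances it launched, so instances that are still running need their own cleanup.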