Sunday, October 9, 2016

Determine the backup strategy - SQL Server Brain Basher of the Week #056

Determining the backup strategy for a SQL Server database is not an easy task as there are many things to be considered. As per my experience, many implement a strategy without considering anything, example, just a daily full database backup even without asking the business requirement.


Generally, we consider two things: RTO and RPO. RTO (Recovery Time Objective) refers time it takes for recovering. In other words, how long business can survive without the system. RPO (Recovery Point Objective) refers the acceptable data lost at a recovery. In other words, business accepts to repeat the work for a certain time period.

This week question is based on backup strategies. Here is the question:

You have a database which requires a recovery (or backup) strategy based on the given requirements;
  1. Size of the database is 25 GB.
  2. The database must never be unavailable for longer than 8 hours.
  3. At the recovery from a failure, no more than 1 hour of transactions may be lost.
  4. Total backup throughput is 100 MB/minute.
  5. Total restore throughput is 80 MB/minute.
  6. Average rate of changes during office hours is 500 MB/hour.
  7. Average amount of new data is 200 MB /day.
  8. Office hours - 09:00 - 17:30.
What would be your plan the requirements given?

How do you start this. You should start with #2 and #3 because they talk about RTO and RPO. Whatever backup strategy we planned, we must make sure the restoration can be done within 6 hours. Not only that, we have to make sure that we must not lose more than 1 hour data. Okay, let's work on this.

We cannot survive without a full database backup. Let's calculate and see how much time it needs for backing up and restoring the database.

Database size = 25 GB = 25,600 MB

Backup throughput = 100 MB/minute
Time takes for backing up the database = 25600/100 = 256 minutes = 4 hours 26 minutes

Restore throughput = 80 MB/minute
Time takes for restoring the database = 25600/80 = 320 minutes = 5 hours 33 minutes

As you see, it takes about 4-5 hours for backing up and we can easily set it up during off peak hours. However, it takes 5-6 hours for restoring that is something we need to consider.

Since the RPO is 1 hour, we need to make sure that Database Recovery Model is Full and it needs Transaction Log backup every hour. Let's calculate the time it needs.

Changes per hour - 500 MB
Time takes for backing up the log = 500/100 = 5 minutes
Time takes for restoring the log = 500/80 = 7 minutes

Let's see whether we can have strategy like - Daily full backup and hourly Transnational Log Backup during office hours. If we plan the full backup at 00:00, it can complete it before 05:00. And we can have hourly Transnational Log Backup starting at 10:00 and ending at 18:00. In worse case, if something happens at 17:30, we have to restore the full backup and 8 log backups. Let's calculate the time and see.

Restoration time for the full database backup = 5 hours 33 minutes
Restoration time for log backups = 7 x 8 = 56 minutes
Total duration = 5 hours 33 minutes + 56 minutes =  6 hours and 29 minutes.

Since we have been given the RTO as 8 hours, the strategy we checked works fine with it. However, we need to think about the growth as well. We add approximately 200MB daily, approximately 1GB weekly. This increases the time it takes for full database backup. Assuming that database becomes 30 GB after a month;

Time takes for restoring the database = 30720/80 = 384 minutes = 6 hours 24 minutes.

If we recalculate the time;
Total duration = 6 hours 24 minutes + 56 minutes =  7 hours and 20 minutes.

Now this has to come to closer to the given RTO. As you see, within few months, this strategy will not be able to handle the requirements. If you see something like this, then you cannot handle it with traditional backup types, you might have to go for either disk level backups or partial backups. Let's talk about more on that with another post.

No comments: