Is your data backed up? In this video, you’ll compare differential, incremental, and synthetic backups and learn about grandfather-father-son strategies and 3-2-1 strategies.
Throughout this course, we’ve constantly mentioned how valuable it is to have a backup of your data. This can not only allow you to go back in time and retrieve a previous version of a file, but this is something that can also help you recover if you happen to have some type of disaster. There are obviously a number of considerations when planning a backup system. You have to think about the total amount of data you’re going to be backing up and the type of backup you’re going to perform.
You also have to think about where you’re going to store all of this backup data and on what type of media. You have to consider the software you’re using to manage the backups, because that’s also the software you’ll use to restore those backups. And you have to decide on what day of the week each of these backups will take place.
A full backup is when you make a copy of everything that’s on a particular file system. This would include operating system files, user documents, and anything else that’s installed on that system. Because you’re making a copy of everything that’s on that device, this is usually the backup type that takes the longest. Transferring terabytes of data from every system that’s on your network may not be practical to do every day of the week.
And of course, you have to think about where you’re going to store all of this information once you’ve created the backup. To make this process a bit easier, you may want to use a different type of backup such as a differential backup. With a differential backup, you first perform a full backup. Every subsequent backup will only contain data that’s been changed since that last full backup. That’s the different part of the differential.
So if you perform this differential backup every day, you’ll notice that each day, the differential backup gets larger, and larger, and larger as more information has changed since the last full backup. To restore this data, you need two pieces of information. You need the last full backup and you need the last differential backup.
Let’s implement a differential backup on my network. We’re going to first start on early Monday morning and create a full backup of all of the data that’s on a system. We’ll then perform a differential backup on Tuesday morning, and this would obviously only be the data that has changed since the last full backup. We’ll then perform another differential backup on Wednesday. This Wednesday backup is a differential backup that only contains data that has changed since the last full backup on Monday.
And then Thursday, we perform the same differential. And again, this is only the data that has changed since the last full backup. Now let’s say that the system has suddenly crashed and we need to restore all of this data. We’ll grab the last full backup, that’s the one that was taken Monday, and the last differential backup, which in this example is the one from Thursday. Put those together and we’ve now recovered all of the data on our system.
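To make that Monday through Thursday example a bit more concrete, here is a minimal Python sketch. It is only an illustration and not how any particular backup product works. The file names, the version strings, and the `changed_since` helper are all made up for this example, and the sketch ignores deleted files.

```python
def changed_since(current, baseline):
    """Return files in `current` that are new or differ from `baseline`."""
    return {name: data for name, data in current.items()
            if baseline.get(name) != data}

# Hypothetical file system contents on each day (file name -> version).
monday = {"report.txt": "v1", "budget.xls": "v1", "notes.txt": "v1"}
tuesday = {**monday, "report.txt": "v2"}
wednesday = {**tuesday, "budget.xls": "v2"}
thursday = {**wednesday, "photo.jpg": "v1"}

# Monday: the full backup copies everything.
full_backup = dict(monday)

# Every differential is compared against the last FULL backup.
diff_tue = changed_since(tuesday, full_backup)
diff_wed = changed_since(wednesday, full_backup)
diff_thu = changed_since(thursday, full_backup)

# The differential grows as more data changes: 1, 2, then 3 files.
sizes = [len(diff_tue), len(diff_wed), len(diff_thu)]

# Restore needs only two sets: the full backup and the LAST differential.
restored = {**full_backup, **diff_thu}
print(sizes, restored == thursday)
```

Notice that Tuesday’s and Wednesday’s differentials are never used in the restore. Thursday’s differential already contains everything that they held.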
An incremental backup starts exactly the same as a differential backup, where we take a full backup before we do anything. The next backup will be an incremental backup that contains changes that have occurred since the last full backup. On this first day, this is very similar to the differential backup that we just looked at. However, things change a bit on the next backup day, where we’re only going to back up things that have changed since the last incremental backup.
This means this backup could be very small if nothing has changed since the last incremental backup. Each day of the incremental backup, we are copying only what has changed since the most recent backup, whether that was the full backup or the previous incremental backup. The restoration process for this is very different than the differential because you not only need the last full backup, you need each incremental backup that has occurred since that last full backup was made.
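Here is a small Python sketch of that incremental logic. Treat it as an illustration under some assumptions: the file names, the `changed_since` helper, and the `restore` function are hypothetical, and deleted files are not handled.

```python
def changed_since(current, previous):
    """Return files in `current` that are new or differ from `previous`."""
    return {name: data for name, data in current.items()
            if previous.get(name) != data}

def restore(full, incrementals):
    """Apply the full backup, then every incremental from oldest to newest."""
    restored = dict(full)
    for incremental in incrementals:
        restored.update(incremental)
    return restored

# Hypothetical file system contents on each day (file name -> version).
monday = {"report.txt": "v1", "budget.xls": "v1", "notes.txt": "v1"}
tuesday = {**monday, "report.txt": "v2"}
wednesday = {**tuesday, "budget.xls": "v2"}
thursday = {**wednesday, "photo.jpg": "v1"}

full_backup = dict(monday)

# Each incremental is compared against the PREVIOUS backup, not the full.
inc_tue = changed_since(tuesday, monday)
inc_wed = changed_since(wednesday, tuesday)
inc_thu = changed_since(thursday, wednesday)

# A complete restore needs the full backup plus every incremental.
complete = restore(full_backup, [inc_tue, inc_wed, inc_thu])

# Skipping Wednesday's incremental loses the budget.xls change.
incomplete = restore(full_backup, [inc_tue, inc_thu])
print(complete == thursday, incomplete == thursday)
```

Each incremental in this sketch holds a single file, so the daily backups stay small. The tradeoff is that a missing set anywhere in the chain leaves the restore incomplete.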
So if we implemented an incremental backup, we would of course take our full backup on Monday morning. On Tuesday morning, we would only copy the information that has changed since that last full backup. On Wednesday, we would only copy information that has changed since Tuesday. On Thursday, we would only copy information that has changed since Wednesday.
And if we do need to recover the entire system at that point, we would need to get all of these data sets together. So we would need the full backup, Tuesday’s incremental backup, Wednesday’s incremental backup, and Thursday’s incremental backup to put them all together to create the full restore. Earlier in this video, we talked about a full backup and how it’s very time consuming to copy all of the data to create that full backup.
But what if there was a way to create a full backup without actually copying all of the data from that system? We can do that through something called a synthetic backup. With a synthetic backup, we are creating that first full backup and then we have incremental backups that occur after that. Then when we would like to create a new synthetic full backup, we’ll simply take all of those data sets we had previously and combine them all together into what ultimately becomes the synthetic full backup.
This is obviously much faster than performing a full backup because we already have all of these data sets backed up. We just need to combine them all together to create that synthetic full backup. So let’s say on our network we would like to have a full backup on Monday and a full backup on Friday. So Monday morning, we will perform a full backup and store that information in one data set.
We will then have incremental backups that occur on Tuesday, Wednesday, and Thursday. On Friday, instead of performing another full backup by copying all of the data off of that system, we’ll simply combine all of these data sets together offline to create the synthetic full backup. From this point forward, we can continue with our incremental backups because we now have effectively created a full backup.
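The combining step can be sketched in a few lines of Python. This is an assumption-heavy illustration: real backup software works with its own formats, and the `build_synthetic_full` function and the file names here are invented for the example.

```python
def build_synthetic_full(full, incrementals):
    """Merge a full backup with later incrementals, oldest to newest.

    Only existing backup sets are read. The production system is never touched.
    """
    synthetic = dict(full)
    for incremental in incrementals:
        synthetic.update(incremental)
    return synthetic

# Hypothetical backup sets already sitting on the backup server.
monday_full = {"report.txt": "v1", "budget.xls": "v1"}
incrementals = [
    {"report.txt": "v2"},  # Tuesday
    {"budget.xls": "v2"},  # Wednesday
    {"photo.jpg": "v1"},   # Thursday
]

# Friday: build the new full backup offline from the sets we already have.
friday_synthetic_full = build_synthetic_full(monday_full, incrementals)
print(friday_synthetic_full)
```

The result is a single backup set with the latest version of every file, which is why the Friday backup can serve as the starting point for the next round of incrementals.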
To summarize this, we have full backups, which will copy all data selected on that system. This obviously takes quite a bit of time because we’re copying all of that data but it’s very quick to restore because you only need the single backup set created from the full backup. With a differential backup, we’re backing up all data modified since the last full backup each day.
This means we’ll have a moderate backup time as those backups become larger and larger each day, and a moderate restore time because we only need the last full backup and the last differential backup. The incremental backup will back up all new files and anything that’s modified since the last incremental backup. This backup time is very low because we’re only copying things that have incrementally changed.
But it does have a high restore time because you not only need the full backup, but every incremental backup that has occurred since the last full backup. And with a synthetic full backup, you have a copy of all of the data that’s on that system. It is a low backup time because we’re using an incremental backup every day. And it’s also a low restore time because there’s a single backup set which is our synthetic full backup.
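One way to compare the restore side of this summary is to count how many backup sets each type needs. The function below is a hypothetical sketch of that arithmetic, where `backups_since_full` is the number of differential or incremental backups taken after the last full backup.

```python
def sets_needed_to_restore(backup_type, backups_since_full):
    """Count the backup sets required for a complete restore."""
    if backup_type in ("full", "synthetic full"):
        return 1  # a single set holds everything
    if backup_type == "differential":
        # the full backup plus only the most recent differential
        return 1 if backups_since_full == 0 else 2
    if backup_type == "incremental":
        # the full backup plus every incremental taken since
        return 1 + backups_since_full
    raise ValueError(f"unknown backup type: {backup_type}")

# Full backup on Monday, then three daily backups through Thursday.
for backup_type in ("full", "differential", "incremental", "synthetic full"):
    print(backup_type, sets_needed_to_restore(backup_type, 3))
```

With a Monday full backup and daily backups through Thursday, the differential restore needs two sets and the incremental restore needs four.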
Of course, backing up the data is only part of the job. We also have to make sure that all of that data can be restored. So you may want to try simulating a disaster. Take a completely clean system and try restoring your data from your backups. Then once the restoration is complete, you can check the data to confirm the operating system and all of your user data is intact.
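Checking the restored data by hand does not scale very well, so here is one possible way to automate that comparison in Python. This is a sketch that assumes you still have a known-good copy of the source files to compare against. The `tree_checksums` and `verify_restore` names are made up for this example, and the sketch reads whole files into memory, which would not suit very large files.

```python
import hashlib
from pathlib import Path

def tree_checksums(root):
    """Map each file's relative path under `root` to its SHA-256 digest."""
    root = Path(root)
    checksums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            checksums[path.relative_to(root).as_posix()] = digest
    return checksums

def verify_restore(source_root, restored_root):
    """Return the source files that are missing or different after a restore."""
    source = tree_checksums(source_root)
    restored = tree_checksums(restored_root)
    return sorted(name for name in source
                  if restored.get(name) != source[name])
```

An empty list from `verify_restore` means every source file was found in the restored copy with identical contents. Any names in the list are files to investigate.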
This restoration should be scheduled on an ongoing basis so that you can audit the backup process and confirm at any time that you’ll be able to restore all of your users’ data. One way to store data is to have all of your backups occur on site. This means you won’t need any type of internet or wider network link. The data that you’re backing up is immediately available to you and you can use it to restore right after you’ve made the backup.
And generally speaking, it’s less expensive than having an offsite backup system. For an offsite backup, we obviously need some type of network connection. So this might be a wide area network or an internet link. We also have to make sure that we have a way to retrieve that data if we need access to it. So if a disaster occurs, do we still have access to our offsite backup?
But since this data is stored offsite, you could effectively restore it to any system regardless of where you happen to be. In practical terms, many companies combine these so that you have some data that’s stored on site and some data that’s stored offsite. This gives you multiple copies of the data and gives you some options when it comes time to restore that data.
One strategy for backups is to use the Grandfather, Father, Son rule, or GFS. This is three separate backup rotations that occur at different times of the month. You have a monthly backup, a weekly backup, and a daily backup. So you might have a full monthly backup that occurs on the last day of the month and you would consider that to be your grandfather backup. At the end of the year, you would have 12 separate monthly backups.
You would also have backups that would occur once a week, and each of those backups would be referred to as the father backup. And of course, you would take backups every day, whether that’s an incremental backup or a differential backup. And you can refer to those backups as the son backup. So here’s a common grandfather, father, son schedule that you might set up.
You might have the grandfather or monthly backups occur on the last day of the month. You can see there’s one on this 31st and another one on the next 31st of the month. You would then have weekly or father backups. In this calendar you can see they all take place on a Monday. And then we would have daily or son backups occur every workday.
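That schedule can be sketched as a small Python function. The rules here are assumptions taken from this example calendar: monthly backups on the last day of the month, weekly backups on Monday, and daily backups on the remaining workdays. The `gfs_role` name is made up, and when a day qualifies for more than one role the sketch simply picks the longest rotation.

```python
import calendar
from datetime import date

def gfs_role(day):
    """Classify a date under the example grandfather-father-son schedule."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    if day.day == last_day:
        return "grandfather"  # monthly backup
    if day.weekday() == 0:    # Monday
        return "father"       # weekly backup
    if day.weekday() < 5:     # Tuesday through Friday
        return "son"          # daily backup
    return None               # weekend, no backup in this schedule

print(gfs_role(date(2024, 1, 31)))  # last day of January
print(gfs_role(date(2024, 1, 29)))  # a Monday
print(gfs_role(date(2024, 1, 30)))  # a Tuesday
```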
The details of when you would perform these backups will depend on how you want to implement them in your organization. As long as you have monthly grandfather backups, weekly father backups, and daily son backups, then you have everything you need for a GFS backup strategy. Another popular strategy for backups is the 3, 2, 1 backup. This is something you can use for your business. It’s also something you can implement at home.
The 3 in the 3, 2, 1 strategy means that you will have three copies of data that are always available to you. There would be obviously the primary copy that you would use on your system and then two other backups that would be available. The 2 is referring to the type of media that you would use and you would need at least two different types of media. So you might have a local drive, a tape backup, or a network attached storage device. Any two of those would satisfy the strategy for a 3, 2, 1 backup rule.
And lastly, the 1 in the 3, 2, 1 is that one copy of this data should be kept off site. If something did happen to your business or to your home, you would still have a copy of that data stored somewhere safe.
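If you would like to check your own setup against the 3, 2, 1 rule, the three conditions are simple enough to express in code. The Python sketch below is only an illustration. The `DataCopy` class, the `satisfies_3_2_1` function, and the example copies are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataCopy:
    name: str
    media: str     # for example "local drive", "tape", or "NAS"
    offsite: bool  # True if this copy is kept at another location

def satisfies_3_2_1(copies):
    """Check for 3 copies, 2 types of media, and 1 copy kept off site."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.media for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite

copies = [
    DataCopy("primary", "local drive", offsite=False),
    DataCopy("nightly backup", "NAS", offsite=False),
    DataCopy("weekly backup", "tape", offsite=True),
]
print(satisfies_3_2_1(copies))
```

In this example the primary copy counts as one of the three copies, and the tape stored at another location covers the offsite requirement.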