The throughput and redundancy advantages of RAID are used in numerous computer systems. In this video, you’ll learn about RAID 0, RAID 1, RAID 5, and RAID 1+0.
The term RAID stands for redundant array of independent disks. You may also see it referred to as a redundant array of inexpensive disks. But these days, the standard is to use independent disks.
It’s also important to keep in mind, even though the term RAID implies a redundancy, not all RAID standards have redundancy built into them. We’re going to learn about the different RAID levels today. And you’ll know exactly which RAID levels are redundant and which RAID levels are not.
The first one we’re going to look at is called RAID 0, which stripes data across disks. There’s also RAID 1, which mirrors information across disks. RAID 5 uses striping, but adds parity to that configuration.
And the last RAID level you’ll need to know for your certification is a nested RAID configuration called RAID 1+0. You may see this also referred to as RAID 10. And that effectively is a stripe of mirrors. So we’re combining the 1 and the 0 together to create a different type of RAID format.
As we go through this description of these different RAID types, keep in mind that the RAID configuration inside of your computer could all be done in software. Your operating system may have the ability to see multiple drives and be able to provide the RAID 0, the RAID 1, the RAID 5 type capabilities in the software of the operating system itself. You don’t have to have any special hardware. The operating system does all of the hard work for you.
One of the challenges you have with this, especially for the more advanced RAID types, is that you’re now using the CPU of your computer to manage this RAID functionality. So you may find that software-based RAID performs worse than hardware-based RAID.
With hardware RAID, you have a controller inside of your computer that is specifically designed to do RAID. This controller usually has its own BIOS associated with it. And when you boot your computer, you have the option of going into the BIOS of your RAID controller and configuring it there.
That RAID controller now handles everything associated with striping or mirroring or however you are setting up your RAID configuration. Your operating system now doesn’t see any of this. It’s simply using whatever disks this RAID controller allows it to see. And you’ll see in a moment, all of these different methods you can use to configure this RAID controller and show one particular view to your operating system.
One advantage here is that the RAID controller is hardware. And for the more advanced RAID functionality, your hardware is generally going to be of a much higher performance than if you were using something like a software-based RAID.
The first RAID type we’ll look at is RAID 0. You may also hear this referred to as striping. And that’s because we take a single file or a single block of data and we stripe that data across multiple drives.
I’m using two drives here, but you can use many physical drives. I take a single file and split it up into small blocks. I put one of those blocks on one drive and one block on another, then another block on the first drive, and then the second drive. So the same file is now split between all of these disks.
One advantage of this is that now you’re able to write information very quickly because you’re writing half of the information to one disk. And at the same time, writing half of the information to the other disk, effectively doubling the speed that you would normally have if you were writing it all to the same drive. So the performance on RAID 0 is very, very high. You’re able to get a lot of throughput on a RAID 0 configuration.
But unfortunately, as you can tell, if you lose one of these disks, there’s no redundancy there. You’ve lost half of every file. And there’s no way to recreate that. There’s no way to recover from that. You just have to hope that you have a very good backup, because once you lose a drive, you’ve effectively lost all of the data in the array.
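If it helps to see the striping idea in code, here is a minimal Python sketch. It is only an illustration, not how any real controller is implemented: the names stripe and read_back, the in-memory lists standing in for drives, and the 4-byte block size are all assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int = 2, block_size: int = 4) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin across disks."""
    disks = [[] for _ in range(num_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        # Block 0 goes to disk 0, block 1 to disk 1, block 2 back to disk 0, ...
        disks[index % num_disks].append(block)
    return disks


def read_back(disks: list[list[bytes]]) -> bytes:
    """Reassemble the file by taking one block from each disk in turn."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJKLMNOP")
restored = read_back(disks)
```

With two disks, each one ends up holding every other block of the file. Neither disk has a complete copy, which is why the file can’t be read back once one of them is gone.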
RAID 1 is a RAID type called mirroring. With mirroring, we’re effectively duplicating the traffic across multiple physical drives. If we have a file and we’ve broken it up into different blocks, you can see that we’ve written one block to Disk 0 and we’ve written an identical copy of that block to Disk 1.
Now we have a duplicate of information. Everything on Disk 0 is exactly the same as everything on Disk 1. And that means there’s a lot of writing that’s taking place. When we write to disk, we’re really writing to multiple disks all at the same time.
This also means that we’re going to need twice as much disk space as we normally would. So if you need 2 terabytes, you’re effectively going to be installing 4 terabytes and it’s going to be configured as RAID 1. The advantage of this of course is that if we lose one of these disks, if it fails, well that’s OK, we have an exact duplicate of the data on the other physical drive.
We are not out any time. Our system continues to run. Most people don’t even realize that we’ve lost one of those disks.
And ideally, we would then remove the bad disk and replace it. And the surviving disk would then copy everything over to the brand-new disk. And then our disks are synchronized. And we’re ready for another failure, because we’ve restored the redundancy of all of our data.
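The write, fail, and replace sequence for a mirror can be sketched in a few lines of Python. This is a toy model under stated assumptions: the Mirror class and its method names are hypothetical, and each drive is just a dictionary of block number to data.

```python
class Mirror:
    """Toy RAID 1 set: every write lands on every healthy member disk."""

    def __init__(self, num_disks: int = 2):
        # Each disk is modelled as {block number: data}; None marks a failed disk.
        self.disks = [{} for _ in range(num_disks)]

    def write(self, block_no: int, data: bytes) -> None:
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Any surviving disk can answer the read.
        for disk in self.disks:
            if disk is not None:
                return disk[block_no]
        raise IOError("all mirror members have failed")

    def fail(self, index: int) -> None:
        self.disks[index] = None

    def replace(self, index: int) -> None:
        # Resynchronize: copy everything from a surviving disk to the new one.
        survivor = next(disk for disk in self.disks if disk is not None)
        self.disks[index] = dict(survivor)


mirror = Mirror()
mirror.write(0, b"block A")
mirror.write(1, b"block B")
mirror.fail(0)                         # Disk 0 dies
still_readable = mirror.read(1)        # served from Disk 1
mirror.replace(0)                      # new disk is resynchronized
```

After the replace step, both dictionaries hold the same blocks again, so the set could survive another single-disk failure.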
With RAID 0 we were striping, but there was no redundancy there. If we lost a RAID 0 disk, we lost our data. With RAID 5, we’re still going to do striping. We will have multiple, physical drives. And we will write blocks of data across all of those. But with RAID 5, we are also including an additional block of data called parity.
We’ll have our computer system or our hard drive controller look at all three of the blocks we’re writing. In this case, it’s three blocks. You can have more than these four drives that we have here. But we’ll write these three blocks. And then that system will calculate a parity, which then we can use to recreate any of the data if we lose a disk.
And it puts the parity in different places on different physical drives. So it distributes that parity out across the drive. So if you lose a drive, you’ve only lost a little bit of data and a little bit of parity.
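Here is a small Python sketch of how a stripe with rotating parity might be put together. Two things in it are assumptions rather than anything stated above: parity is calculated with XOR, which is the usual choice for RAID 5, and the rotation order (parity starts on the last drive and moves one drive to the left with each stripe) is just one possible layout. The function names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_stripe(data_blocks: list[bytes], stripe_no: int) -> list[bytes]:
    """Return one stripe (one block per drive) with parity in a rotating slot."""
    num_disks = len(data_blocks) + 1
    # Assumed rotation: last drive for stripe 0, then one drive to the left each time.
    parity_disk = (num_disks - 1 - stripe_no) % num_disks
    stripe = list(data_blocks)
    stripe.insert(parity_disk, xor_blocks(data_blocks))
    return stripe


stripe_a = build_stripe([b"\x01", b"\x02", b"\x04"], stripe_no=0)
stripe_b = build_stripe([b"\x01", b"\x02", b"\x04"], stripe_no=1)
```

With three data blocks and four drives, the first stripe puts its parity on the fourth drive and the second stripe puts it on the third, so no single drive ends up holding all of the parity.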
One of the advantages now is that we’re using the disk space efficiently. We’re not having to recreate exactly the same information on every single disk, like we did when we were mirrored. We’re not duplicating data. We’re just taking the data that we would normally write and adding a little bit extra to it. So there’s some efficiency in storage that we’re building right into this.
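To put rough numbers on that storage efficiency, here is a short Python sketch that compares usable capacity across the RAID levels in this video. It assumes every drive in the array is the same size and that RAID 1+0 uses two-drive mirrors; the function name usable_capacity_tb is hypothetical.

```python
def usable_capacity_tb(level: str, num_disks: int, disk_tb: float) -> float:
    """Usable space for an array of equal-size drives (a simplified model)."""
    if level == "0":
        return num_disks * disk_tb            # striping only, nothing held back
    if level == "1":
        return disk_tb                        # every drive holds the same copy
    if level == "5":
        return (num_disks - 1) * disk_tb      # one drive's worth goes to parity
    if level == "10":
        return (num_disks // 2) * disk_tb     # assumes two-drive mirrors
    raise ValueError(f"unsupported RAID level: {level}")


raid1 = usable_capacity_tb("1", num_disks=2, disk_tb=2)    # 4 TB installed
raid5 = usable_capacity_tb("5", num_disks=4, disk_tb=2)    # 8 TB installed
```

In this model, two 2 TB drives in RAID 1 give you 2 TB of usable space, while four 2 TB drives in RAID 5 give you 6 TB, because only one drive’s worth of space is spent on parity.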
This gives us redundancy, because the array can keep running after the loss of any one drive. We are striping data, so we have good performance. And then we also have redundancy. If we lose a disk, let’s say we lost Disk 3, then Block 3B would be gone. But we’d be able to use Block 1B, Block 2B, and the Parity B information to effectively rebuild that on the fly.
So if you were running in a software-based configuration, you may see a performance difference because your machine is having to recalculate all of this data as it goes. If this is hardware-based RAID, your hardware controller generally is built for this particular problem and it’s able to keep up with those speeds. And most people don’t even know that a problem occurred.
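The rebuild step can be sketched in Python as well. As before, this assumes XOR parity, which the discussion above doesn’t spell out, and the names xor_blocks and rebuild are made up for the illustration. A missing block is represented by None.

```python
from functools import reduce
from typing import Optional


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe: list[Optional[bytes]]) -> list[bytes]:
    """Fill in a single missing block (None) from the surviving blocks."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if not missing:
        return list(stripe)
    if len(missing) > 1:
        raise ValueError("single parity cannot rebuild more than one missing block")
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    # XOR of the surviving data and parity blocks equals the lost block.
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired


block_1b, block_2b, block_3b = b"AAAA", b"BBBB", b"CCCC"
parity_b = xor_blocks([block_1b, block_2b, block_3b])
degraded = [block_1b, block_2b, None, parity_b]    # the drive holding 3B failed
repaired = rebuild(degraded)
```

Every read that touches the failed drive has to run this calculation, which is the extra work that a software-based RAID hands to your CPU.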
Now, imagine taking this idea of striping and the speed of striping that data and combining it with the redundancy that we had in a mirrored environment. And if you do that, you have RAID 1+0, which we could also call a stripe of mirrors. You will effectively take a group of drives. You need at least four drives. I’m using six drives in this scenario, so that you can see it work. And I would build mirrors between pairs of drives here.
So I’ve got one set of mirrors here. I’ve got another set of mirrors in the middle. These are Block 2, Block 5, Block 8, Block 11. Those are duplicated. And here is another pair of drives that are duplicated as well. And I’m going to stripe information. Block 1, Block 2, and Block 3 are striped across all of those mirrors.
That way if I lose one of these drives, it’s OK. I’ve got a mirror of that drive available to me right here. So this now gives me the flexibility of striping and the speed that I would like to have, with the redundancy of mirroring that I didn’t have before.
And I didn’t have to worry about any parity calculations. I know exactly where my data happens to reside. And now, I’m able to get the best of RAID 0 and the best of RAID 1 combined in a single RAID configuration.
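Here is a minimal Python sketch of that six-drive layout. It assumes drives are paired as 0 and 1, 2 and 3, 4 and 5, and that blocks are numbered starting at 1. The function names raid10_layout and survives are hypothetical.

```python
def raid10_layout(num_blocks: int, num_disks: int = 6) -> list[list[int]]:
    """Return, for each drive, the block numbers stored on it."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch expects an even number of drives, four or more")
    num_pairs = num_disks // 2
    disks = [[] for _ in range(num_disks)]
    for block in range(1, num_blocks + 1):
        pair = (block - 1) % num_pairs       # stripe across the mirror pairs
        disks[2 * pair].append(block)        # first drive in the pair
        disks[2 * pair + 1].append(block)    # its mirror
    return disks


def survives(failed_disks: set[int], num_disks: int = 6) -> bool:
    """The array keeps working while every mirror pair has one healthy drive."""
    return all(
        not {2 * pair, 2 * pair + 1} <= failed_disks
        for pair in range(num_disks // 2)
    )


layout = raid10_layout(12)
middle_pair = layout[2]                       # blocks 2, 5, 8, 11
one_per_pair_lost = survives({0, 2, 4})       # one drive lost from each pair
whole_pair_lost = survives({0, 1})            # both drives of one pair lost
```

In this model, the array can lose one drive from every pair and keep running, but losing both drives in the same pair takes that part of the stripe with it.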