Linux RAID vs ZFS RAID

Before we dive in, I thought I'd first give a little explanation of what RAID is and why it's useful. RAID stands for "Redundant Array of Independent Disks", and it was developed to combine multiple smaller disks into a single array in order to achieve redundancy. The main ways RAID achieves this are disk mirroring and disk striping with parity; plain disk striping on its own adds no redundancy, but it's the building block the parity levels are based on. These techniques can come with other benefits besides redundancy, such as faster speeds and better throughput.
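To make "striping with parity" a little more concrete, here is a tiny sketch of the idea using shell arithmetic. The two numbers are made-up stand-ins for data blocks on two disks; real arrays do this per stripe across whole disks, so treat it as an illustration of the XOR trick rather than how any particular RAID implementation is written.

```shell
# Two data "blocks" on two disks, represented as small integers (made-up values).
disk_a=173   # binary 10101101
disk_b=92    # binary 01011100

# The parity block is the XOR of the data blocks.
parity=$(( disk_a ^ disk_b ))

# If the disk holding disk_a fails, XOR the survivors to rebuild its block.
rebuilt_a=$(( parity ^ disk_b ))

echo "parity=$parity rebuilt_a=$rebuilt_a"
```

Because XOR is its own inverse, any single missing block can be recomputed from the remaining blocks plus parity, which is why a RAID5 can lose exactly one disk.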

First let's talk about ZFS. ZFS is fundamentally different from other types of RAID in that it actually contains a file system as well. You can think of ZFS as a volume manager and a RAID array in one, which means that when you add extra disks to your ZFS pool, the extra space becomes available to your file system all at once. ZFS includes all the typical RAID levels you'd normally expect; they just go by different names. For example, RAIDZ1 can be compared to RAID5, giving you the ability to lose a single drive in a RAIDZ group, and RAIDZ2 can be compared to RAID6, allowing the loss of two disks per group.
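As a rough rule of thumb for comparing those levels, usable space is the number of disks minus the parity disks, times the disk size. The helper below is a hypothetical function written only for this illustration, and it ignores metadata and padding overhead, so real RAIDZ numbers will come out somewhat lower.

```shell
# usable_tb DISKS DISK_SIZE_TB PARITY_DISKS
# Hypothetical helper: approximate usable capacity in TB, ignoring overhead.
usable_tb() {
  echo $(( ($1 - $3) * $2 ))
}

usable_tb 3 4 1   # RAIDZ1 / RAID5 with three 4 TB disks
usable_tb 6 4 2   # RAIDZ2 / RAID6 with six 4 TB disks
```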

One highlight that ZFS has over traditional RAID is that it's not susceptible to the RAID write hole, and it gets around this by having variable-width striping on the zpool. ZFS also comes with some other features that traditional RAID doesn't have, namely the L2ARC and the ZIL, or ZFS Intent Log, which allow RAM and SSDs to work as a cache for higher speed. Brett has done some great content on these features in a previous tech tip, so if you're interested, definitely check out the video called “ZFS Read and Write Caching”.

Next we're going to talk about MD RAID, or Linux RAID. This is a utility that allows you to create and manage software RAID at the Linux level. The main difference between this and ZFS is that it doesn't have a file system on top of it; it's strictly a block-level device, so you have to put your own file system on top to use it in the same way as a ZFS pool. An MD RAID array also can't be carved up into multiple logical volumes of different sizes once it's built, in the way that ZFS can create multiple zvols, or ZFS volumes.

It is possible, however, to use LVM, or Logical Volume Manager, to get the same effect, but you would have to use it on top of your MD RAID. You can also add drives to an MD RAID after it's been built, either as hot spares or to actually expand the array, through the robust MD RAID command-line tool "mdadm".

For us here at 45 Drives, we'll use ZFS over MD RAID in most instances because, all in all, it's much more robust and works flawlessly in most cases. There is one typical situation where we would recommend MD RAID over ZFS, and that's where you're looking to share out block devices through iSCSI. The way you would do this is you would still create your MD RAID, and then on top you would use LVM, or Logical Volume Manager, to chop up your MD RAID into logical volumes. You can then share those out via iSCSI.
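Here is a minimal sketch of that LVM layer, assuming an array already exists at /dev/md0. The volume group name (vg_md0), the volume names (lun0, lun1), and the sizes are all hypothetical, and the `run` helper is just a safety wrapper added for this example so the commands are printed rather than executed unless you set DRY_RUN=0 as root. The iSCSI export step itself is left out.

```shell
# Dry-run by default: print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

carve_md_with_lvm() {
  run pvcreate /dev/md0                  # mark the array as an LVM physical volume
  run vgcreate vg_md0 /dev/md0           # hypothetical volume group name
  run lvcreate -L 100G -n lun0 vg_md0    # first logical volume (size is an example)
  run lvcreate -L 200G -n lun1 vg_md0    # second logical volume of a different size
  run lvs vg_md0                         # list the resulting logical volumes
}

carve_md_with_lvm
```

Each logical volume then shows up as its own block device under /dev/vg_md0/, which is what you would hand to your iSCSI target.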

Those among you who understand the underlying architecture of how everything works will already understand why this is, but for everyone else, I'll try to explain. Since ZFS is inherently its own file system, you can think of it as an extra layer that sits on top of the block devices. Now, while you can certainly share out ZFS storage as iSCSI block devices, the main reason for iSCSI is for the client operating system to see the storage as a native drive, which it will then format with a file system of its own. In the case of Windows, for example, it would see a disk drive that you would then bring online and put something like NTFS on top of. While this will certainly work, you can see how having more layers is going to increase latency and reduce throughput in most cases.

So now I thought I'd bring you over to my desk where I'd run through a few tutorials on how to set up the most common RAID arrays you'd see out in the wild, and I'll do this using both MD RAID and ZFS.

Okay, so welcome to the little mini tutorial that we're going to set up here. Essentially what I thought we would do today is just build a RAID5 in both mdadm and ZFS and compare and contrast how different the two of them are, and also how similar they are. First and foremost, let's get some housekeeping out of the way. We're using CentOS 7 here, and this tutorial assumes that you have ZFS and mdadm already installed on your system.

If you want to find ZFS on Linux for CentOS, you would go to this link, find the RPM for your version of CentOS, copy its link, and then run a yum install for that RPM. Once you do that, you can install ZFS with "yum install zfs", and then run a modprobe to make sure the module is loaded. For mdadm it's quite a bit simpler; you most likely already have the package, so just run "yum install mdadm", and then you should have everything you need to get started on this tutorial or follow along if you want to.
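Put together, the install steps just described look roughly like this. The repository RPM from the link in the video has to be installed first and is not reproduced here. The `run` helper is an addition for this example: it only prints the commands unless you set DRY_RUN=0 and run as root.

```shell
# Dry-run by default: print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_raid_tools() {
  # Assumes the ZFS on Linux repository RPM for your CentOS version is already installed.
  run yum install zfs      # install ZFS from that repository
  run modprobe zfs         # load the kernel module
  run yum install mdadm    # usually already present on CentOS
}

install_raid_tools
```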

First and foremost, I thought what we'd do is create a zpool with three disks that we had set up previously for this. If we run "lsblk" to list our block devices, we can see we've got six block devices that I created beforehand for this tutorial, and I'm just going to do a RAID5 with three disks for both setups, for simplicity. Okay, so let's start. To create a zpool on Linux, the first thing you're gonna want to do is type “zpool create”, and then we're gonna name it. For this purpose we'll call it “tank”. Then we want the type of RAID, so RAIDZ1, which is very similar to a RAID5 in that it has one disk's worth of parity for the pool. And then we're going to say which disks we want to give to this, so "sd", let's go b to d, so that's three disks.

Okay, so next we can run a "zpool list" to make sure it was created, and a "zpool status". As you can see, our pool is called “tank”, it's online, and we have a RAIDZ1 with three disks. The thing that's different about ZFS compared to traditional RAID is that there's also a file system built in, so once you create the pool it automatically mounts it as a file system. If we run a "df" to display our file systems, we can see “tank” is mounted at “/tank/”, so we can already go in and start creating files and use it just like you would any other file system, whereas Linux RAID, as you'll see, is quite a bit different. So let's leave that there for now and go over to our Linux RAID.
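For reference, the ZFS side of the walkthrough so far comes down to the commands below. This assumes the three spare disks are /dev/sdb, /dev/sdc, and /dev/sdd as in the video; check your own device names with lsblk first, because creating a pool overwrites them. The `run` wrapper is an addition for this example and only prints the commands unless you set DRY_RUN=0 as root.

```shell
# Dry-run by default: print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_raidz1_pool() {
  run zpool create tank raidz1 /dev/sdb /dev/sdc /dev/sdd   # three-disk RAIDZ1 pool named "tank"
  run zpool list                                            # confirm the pool exists
  run zpool status tank                                     # show layout and health
  run df -h /tank                                           # the pool is already mounted at /tank
}

create_raidz1_pool
```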

Now, in order to set up a very similar RAID5 in this situation, it's going to be a bit different syntax, but nothing too difficult: “mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sd[e-g]”. Okay, it seems to have completed, so now let's run “lsblk” and we can see that all of our drives have their partitions on them. Let's run a “cat /proc/mdstat” and we can see that our RAID5 is active with our three disks, and it's a level 5, and it's there. The difference here is that if you check your file systems with something like “df”, you can see there's no file system. Essentially, what you did in this instance is create one logical pool of disks in a RAID5 without a file system on top of it. If you wanted to carry it to its logical conclusion and make it similar to the ZFS side, you would then put a file system on top of that, so we can do that right now. We'll run “mkfs.ext4” and point it at our newly created block device.

Okay, once that's done, let's make a directory where we're going to mount this. I already created that, actually, so that's no problem. We have the directory where we're gonna put this, and then essentially all you have to do now is mount it, so “mount /dev/md0 /mnt/md0”. Okay, now let's run another “df” to display our file systems, and we can see our RAID5 is now mounted at “/mnt/md0”. Just like on the ZFS side, we can now come in here and use it as a traditional file system. Okay, great. Now you've got your RAIDs created.
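The MD RAID side, end to end, can be sketched like this. It assumes the other three disks are /dev/sde, /dev/sdf, and /dev/sdg and that /mnt/md0 is the mount point, as in the video. The `run` wrapper is an addition for this example; it prints the commands unless DRY_RUN=0 is set and you are root, since creating the array and running mkfs both destroy existing data.

```shell
# Dry-run by default: print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_md_raid5() {
  run mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sde /dev/sdf /dev/sdg
  run cat /proc/mdstat          # check that the array is active (it may still be syncing)
  run mkfs.ext4 /dev/md0        # unlike ZFS, the file system is a separate step
  run mkdir -p /mnt/md0         # create the mount point if it does not exist
  run mount /dev/md0 /mnt/md0   # mount the new file system
  run df -h /mnt/md0            # confirm it is mounted
}

create_md_raid5
```

Compared with the single "zpool create", the extra mkfs, mkdir, and mount steps are exactly the file system layer that ZFS gives you built in.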

Now let's say a time comes when you want to get rid of these RAID arrays, on either ZFS or mdadm. It's pretty simple in both cases, but let's start with ZFS. As you can see here, if you run a “zpool destroy” it will destroy the pool, but right now I'm inside the directory, so obviously that won't work; let's just get out of it. Now it's as simple as running “zpool destroy tank”. If we take a look with “zpool list”, there are no pools available. That being said, even though we destroyed the pool, if you run “lsblk” and take a look, you can see that B, C, and D still have a bunch of partitions on them. We want to get rid of those if we want to use the disks for something else, and that's also very simple: we run “wipefs -a /dev/sd[b-d]”. There we go. Let's run lsblk again, and now we see B, C, and D are completely ready for something new, whether that's a new array or whatever you want.
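The ZFS teardown just described, as a short sketch. It assumes the same pool name and disks as before, and it irreversibly destroys the pool and its data, which is why the `run` wrapper (an addition for this example) only prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run by default: print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

destroy_zfs_pool() {
  run zpool destroy tank                          # fails if your shell is still inside /tank
  run zpool list                                  # should now report no pools
  run wipefs -a /dev/sdb /dev/sdc /dev/sdd        # remove leftover signatures from the disks
  run lsblk                                       # disks should show up clean
}

destroy_zfs_pool
```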

Okay, fine, that's great. So what do you do if you have a Linux RAID? How do you get rid of it in that case? No problem. First things first, you're gonna want to get out of the directory, so let's go back out. Next, we're gonna unmount our file system, so “umount md0”. Great, then we're gonna stop our array. Okay, as you can see, it stopped. Now we're going to remove our array. Okay, it seems to have already gone away once we stopped it, so no problem.

Okay, so now that it's stopped and we've removed it, let's take a look. Let's run “lsblk” and we can see our drives are there, but if you run a “cat”, you'll still see RAID5. So just to be safe, to make sure you've got all the partitions and all the data wiped off, you could run an mdadm “--zero-superblock”, but in this instance we can do the same thing that we used on the ZFS side. We can just run “wipefs -a”, and this time it's E, F, and G. There we go, so “lsblk” again. Now we've got all six of our disks back and we can do whatever we want with them; it's as if the RAID arrays never existed.
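Here is the MD RAID teardown as a sketch, using full paths so it doesn't matter which directory you're in. The exact stop command isn't spelled out in the video, so "mdadm --stop /dev/md0" is my assumption of the standard way to do it. The separate remove step is left out because, as seen above, the device was already gone after the stop. The `run` wrapper is an addition for this example and only prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run by default: print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

destroy_md_raid() {
  run umount /mnt/md0                                          # unmount the file system first
  run mdadm --stop /dev/md0                                    # stop (deactivate) the array
  run mdadm --zero-superblock /dev/sde /dev/sdf /dev/sdg       # erase the RAID metadata on each member
  run wipefs -a /dev/sde /dev/sdf /dev/sdg                     # clear any remaining signatures
  run lsblk                                                    # disks should show up clean
}

destroy_md_raid
```

Running both the zero-superblock and the wipefs is belt and braces; either one on its own should clear the RAID signature.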

One final thing that I definitely should touch on before we get out of here is the fact that although I mounted the file system from the mdadm array at that mount point, the mount would not stay there if you were to reboot your Linux machine, so I really should have touched on the “fstab”. Essentially it's this (vim /etc/fstab), and this is where you have to keep an entry for the file system that you just created. That allows the mount to persist across reboots, so it stays mounted rather than being a temporary mount.

Something very similar to this is what you would have in this file for your mounted RAID. I just wanted to touch on that, because I didn't want to do this whole thing and forget about this specific part. It's a pretty important part, because if you restarted your machine having done just the first part, you'd come back and say, “well, where's my file system?” So this is definitely important, and you'd have to do this as well.
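The exact line shown on screen isn't in the transcript, so this is a sketch of what such an entry typically looks like for the ext4 file system created earlier. The helper function name is hypothetical, and the "defaults 0 0" options are an assumption rather than the values used in the video.

```shell
# fstab_line DEVICE MOUNTPOINT
# Hypothetical helper: print an /etc/fstab entry for an ext4 file system.
# Fields: device, mount point, file system type, options, dump, fsck order.
fstab_line() {
  printf '%s %s ext4 defaults 0 0\n' "$1" "$2"
}

fstab_line /dev/md0 /mnt/md0

# To apply it for real (as root), append the line and test it without rebooting:
#   fstab_line /dev/md0 /mnt/md0 | tee -a /etc/fstab
#   mount -a
```

Since md device names can change between boots, referencing the file system by the UUID that "blkid /dev/md0" reports is generally the safer choice in the first field.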

Alright, great. Well, hopefully you found some of this interesting. I didn't want to go too in depth because the video's already getting pretty long, so what I'll say is, if there are any specific things you'd like us to go over, any tutorials on ZFS, or Linux RAID, or what have you, just leave it in the comments below and we can set something up in the near future.

Alright, let's head back to my desk. We typically like to end our tech videos with a nice fun fact, and today I thought we'd talk about Direct I/O. In years past, this would have been another situation where we would typically recommend something other than ZFS if you required Direct I/O, but thankfully ZFS has supported it for quite a while now. What Direct I/O essentially is, is the ability to interface directly with the storage devices themselves, bypassing the read and write caches.

Well, hopefully you found this informative and you learned something new today. If you have any questions or comments, definitely leave them down below, and if you have recommendations for future videos, leave those down below as well. Thanks for watching, guys, and we'll see you on our next tech tip.