Hi, can someone tell me the difference between RAIDZ1 and RAIDZ2 in a practical sense, as if you were in the field, rather than the theory?
I know RAIDZ1 is the ZFS variant of RAID5 and RAIDZ2 is the ZFS variant of RAID6; I will use all these terms interchangeably because I'm not too interested in the ZFS special sauce here.
In the early 2000s, a lot of people were pushing RAID5. Having worked in a hosting / colocation data centre for many years, I witnessed many RAID5 failures. What would happen is an array would degrade, and more often than not a second drive would fail under the extra load placed on the array while it was degraded. Failures also often happened during the rebuild process, because a lot of the HW implementations were flaky -- but again, also because of the undue stress on all drives as you rebuild. This is why I would suggest a RAID10 setup at the time: with some luck it can survive a double failure, and more importantly you can trivially use a software implementation, which is much safer. Also, a lot of the motherboards at the time were offering RAID, but this was really just a binary blob in the kernel doing software RAID with a facade that made it appear like hardware, which fooled a lot of people.
Well, we've finally done away with hardware / proprietary RAID and we have ZFS, mdadm, etc. I've normally dismissed RAID6/RAIDZ2 because of the parity/rebuild process and concerns about putting undue stress on the drives. But I think maybe this was premature and that I didn't really understand the consequences of a single drive failure versus a double drive failure. So this is kind of what I want to know:
1. When a single drive fails, is there any undue stress on the array? Or, because the array can pretty much operate unaffected, is there actually no performance degradation until you rebuild the missing drive, and in the case of software is it really just a negligible hit on the CPU if it has to do hashing/erasure coding/etc.? I guess the rebuild process is really just the cost of a zfs scrub at this point, but at least it is on a healthy array.
2. The good news of RAID6 over RAID10 is that you can always survive a two-drive failure; but I think this is where things get concerning, because a rebuild across two drives places a lot of undue stress on the remaining disks, and if any of those disks die then you're shit outta luck. That scenario is much more like a single drive failure in a RAID5 array. But again, I think the rebuild cost is that of a zfs scrub, just with the minimal set of disks. So RAIDZ2 would be a much more solid choice over RAID10, right; at least you will always know you can survive a two-drive failure?
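To put a rough number on the "lucky double failure" point, here's a back-of-the-envelope sketch. It's purely combinatorial, assumes two simultaneous independent failures, and ignores rebuild dynamics entirely; the helper name is just made up for illustration:

```python
from math import comb

# Rough combinatorial sketch (not a reliability model): given two disks that
# fail at the same time, RAIDZ2 always survives, while 2-way-mirror RAID10
# only survives if the two failures land in different mirror pairs.
def raid10_survives_two(n_disks: int) -> float:
    pairs = n_disks // 2
    total = comb(n_disks, 2)   # ways to pick which two disks failed
    fatal = pairs              # both failures hit the same mirror pair
    return 1 - fatal / total

for n in (4, 6, 8, 12):
    print(f"{n} disks: RAID10 survives a double failure ~{raid10_survives_two(n):.0%}, RAIDZ2: 100%")
```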
One thing that raidz has going for it over block-level RAID is that zfs knows which blocks are in use, so recovery does not need to read every block on every disk.
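As a toy illustration of why that matters in practice (the disk size and throughput below are made-up assumptions, nothing ZFS-specific):

```python
# Toy illustration with assumed numbers: a block-level rebuild has to read
# whole disks regardless of how full the pool is, while an allocation-aware
# resilver only has to read the space that's actually in use.
DISK_TB = 10          # assumed size of each disk
READ_MB_PER_S = 150   # assumed sequential read speed

def hours_to_read(tb: float) -> float:
    return tb * 1_000_000 / READ_MB_PER_S / 3600

for fullness in (0.10, 0.50, 0.90):
    print(f"pool {fullness:.0%} full: block-level ~{hours_to_read(DISK_TB):.1f} h/disk, "
          f"allocation-aware ~{hours_to_read(DISK_TB * fullness):.1f} h/disk")
```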
Zfs also prioritizes user reads and writes over recovery (resilvering), which may not be universal practice with RAID.
It's certainly still the case that if your redundancy drives have failed, you're at greater risk of data loss, and IMHO the disk failure rate during a repair is higher than the base rate: maybe from the extra use, but also because of the risk of correlated failures --- if your environment contributed to the failure of the first N disks, the other disks were also in that environment and may be about to fail.
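To put hand-wavy numbers on that last point (the failure rates and rebuild window below are assumptions, not measurements, and a real reliability model would be much more involved):

```python
# Hand-wavy sketch with assumed numbers: chance that at least one of the
# remaining disks fails during the rebuild window, comparing a baseline
# annual failure rate with an elevated one standing in for rebuild stress
# and correlated failures.
def p_any_failure(remaining_disks: int, annual_fail_rate: float, rebuild_hours: float) -> float:
    p_one = 1 - (1 - annual_fail_rate) ** (rebuild_hours / (365 * 24))
    return 1 - (1 - p_one) ** remaining_disks

for afr in (0.02, 0.10):   # 2% "catalog" AFR vs a 10% stressed/correlated guess
    print(f"AFR {afr:.0%}: P(another failure during a 24h rebuild, 7 disks left) "
          f"= {p_any_failure(7, afr, rebuild_hours=24):.2%}")
```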
I think ZFS also supports online rebuilds. I remember in my data centre days that a few of the RAID5 products only did rebuilds offline. The HP/Compaq Smart Array stuff was the bee's knees though; reliable stuff.
Almost all “normal” damage is, and should be, repaired online with zfs. Those offline repairs meant the hardware controller had no idea how to interact with the filesystem directly, probably for the best. Level-of-abstraction purists don’t like this aspect of zfs.
If something particularly bad happened, or you tried being really “clever”, you can get into a rare situation of not being able to import the pool, or only being able to import it in read-only mode. There are tools to help repair that kind of metadata damage. Then proceed with the normal online repair if needed.
raidz doesn't work exactly like RAID, but conceptually it's helpful to carry over that knowledge. The biggest difference is that all drives can potentially have parity blocks on them.
1. If a drive is missing and it contained a data block you want to read, then parity calculations are needed to reconstruct that block; potentially every remaining drive must spend read capacity on that reconstruction. I would consider that stress, and max read throughput is significantly reduced. If your block size is very large, or your files are much smaller, you might get away with a minimal performance hit, but you're also wasting a lot of the capacity/benefit of z2. (In certain pathological cases z2 can have the storage profile of a double mirror, but with all the complications of z2.) The rebuild process requires a reconstruction for every missing block, so basically every drive will need to perform a read for each restored block. Writing new data to a degraded z2 pool can also force zfs to be quite wasteful: for example, a 5-disk z2 pool with 1 drive missing can write at most 2 data blocks with 2 parity blocks per stripe, instead of the expected 3 and 2 (see the stripe-width sketch after this list). Restoring that drive will not automatically restore that capacity unless the files are written again to the restored pool. The drives will be filled unevenly, which has performance and storage-efficiency penalties.
2. If you replace both degraded drives at the same time, and you aren't using resilver_defer, it should only need to read all the other drives once and write to both new disks. But you might not want this, depending on many complicated factors.
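For the degraded-write example in point 1, the arithmetic looks roughly like this. It's a deliberately simplified block-level view of raidz2 that ignores padding sectors and variable stripe widths; the function name is just for illustration:

```python
# Simplified stripe-width arithmetic (ignores padding sectors and variable
# stripe widths): how much of what gets written is actually data, for a
# healthy 5-disk raidz2 stripe versus the same pool with one disk missing.
def raidz2_data_fraction(disks_available: int, parity: int = 2) -> float:
    data = disks_available - parity   # data blocks per full stripe
    return data / (data + parity)

print(f"healthy 5-disk z2 : {raidz2_data_fraction(5):.0%} of written blocks are data")  # 3 data + 2 parity
print(f"degraded, 4 disks : {raidz2_data_fraction(4):.0%} of written blocks are data")  # 2 data + 2 parity
```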