> The reason is because they use "SMR", which severely hurts random-write performance.
It hurts random-write performance only after a threshold: once the drive's on-disk staging area (typically a conventionally recorded cache region) is exhausted, sustained random writes slow to a crawl. For short bursts, the drive behaves well -- it would probably work fine in a RAID array if the array were initialized from a clean slate rather than rebuilt.
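A toy model of that cliff (all numbers invented, just to show the shape of the behavior): random writes are absorbed quickly until the staging area fills, after which every write degrades at once. A rebuild is exactly the kind of long sustained write stream that pushes the drive over that threshold.

    # Toy model of a drive-managed SMR disk. Illustration only; the cache
    # size, drain rate, and latency classes are made up, not from any spec.

    class ToySMRDrive:
        def __init__(self, cache_blocks=100_000, drain_per_tick=200):
            self.cache_used = 0
            self.cache_blocks = cache_blocks      # staging-area capacity
            self.drain_per_tick = drain_per_tick  # blocks destaged per idle tick

        def random_write(self, blocks):
            """Return an approximate latency class for a burst of random writes."""
            if self.cache_used + blocks <= self.cache_blocks:
                self.cache_used += blocks
                return "fast"        # absorbed by the staging area
            # Staging area exhausted: writes now trigger slow
            # read-modify-write of whole shingled zones.
            self.cache_used = self.cache_blocks
            return "very slow"

        def idle(self, ticks):
            """Background destaging while the host is quiet."""
            self.cache_used = max(0, self.cache_used - self.drain_per_tick * ticks)

    drive = ToySMRDrive()
    print(drive.random_write(10_000))   # fast      -- a short burst fits in the cache
    print(drive.random_write(95_000))   # very slow -- a rebuild-sized stream overflows it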
> But this is yet another interesting edge case that RAID software doesn't handle and reacts by just blowing away your data.
It's not obvious how a RAID controller should "handle" this. The drives have no outward indication that they suffer from random write saturation. From the controller's perspective, the degraded performance looks very much like a drive failure.
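For concreteness, the controller's view reduces to something like the sketch below (the timeout value and function name are hypothetical; real controllers use firmware-specific logic). All it sees is command latency, and a saturated SMR drive and a dying drive produce the same signal.

    # Sketch of why "slow" and "dead" look identical to a simple controller.

    IO_TIMEOUT_S = 7.0   # hypothetical command timeout; varies by controller

    def classify_drive(io_latency_s):
        """A naive health check: anything slower than the timeout is 'failed'."""
        if io_latency_s > IO_TIMEOUT_S:
            # Could be a dead drive *or* an SMR drive destaging its cache.
            return "FAILED"
        return "OK"

    print(classify_drive(0.01))   # OK     -- healthy drive
    print(classify_drive(30.0))   # FAILED -- dead drive
    print(classify_drive(30.0))   # FAILED -- saturated SMR drive, indistinguishable from the above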
> the degraded performance looks very much like a drive failure.
Sure, but given the choice between "During a rebuild, it looks like another drive isn't doing so well, so I should give up and trash the array" and "During a rebuild, it looks like another drive isn't doing so well, so I should notify the administrator and meanwhile try to maintain as much redundancy as I can", the second option seems like the clearly better default.
Disks "failing" is the problem. If you treat drive state as binary (flawless / eject), you can easily eject too many drives for errors on different 0.000001% of data, crashing the array along with the data.