[G4] EIDE and Ultra ATA RAID
Philip J Robar
philip.robar at myrealbox.com
Wed Apr 27 18:48:20 PDT 2005
On Apr 25, 2005, at 4:40 PM, Ralph Garrett wrote:
> On Apr 25, 2005, at 5:51 PM, Philip J Robar wrote:
>
>> This is simply not true. Every drive you add increases the chance
>> of the array failing. See
>> http://www.pcguide.com/ref/hdd/perf/raid/concepts/rel_Rel.htm.
>>
>> Phil
>
> Sorry but that concept of reliability is based on flawed logic. (By
> the same logic, buying multiple Lotto tickets would greatly
> increase my chances of hitting it rich) For the RAID to fail, only
> one drive has to fail. So the MTBF for the Array is the same as any
> single unit. Adding more components doesn't increase the likelihood
> of any single unit failing (as long as heat is being dealt with
> properly).
No, it's not. And it's rather unkind of you to distort my claim by
adding the word "greatly" to the analogy so as to make your position
seem correct. Buying more than one lottery ticket does increase your
chances of winning. I don't have the figures to say by how much, and
since I'm not you, I can't say whether the increase is worth the cost
to you.
I made no claims about the statistical strength of the failure rate
as drives are added to an array; however, there is no doubt that
adding drives to any array increases the chance that you will
experience the failure of an individual drive. This is why it is
important to realize that a so-called "RAID" 0 set is not actually
RAID at all, as it has no "Redundant" drives. A stripe fails when any
one of its drives fails, so the array's MTBF is roughly the
single-drive MTBF divided by the number of drives. As you increase
the number of drives in a "RAID" 0 setup, the chance of an individual
drive failing increases, and with it the chance of the entire array
failing.
Assume an individual drive MTBF (Mean Time Between Failures) of
1,000,000 hours. (This is typical of server-oriented drives and is
becoming common for consumer drives.) You have a 4.29% chance of the
drive failing within 5 years. Adding an identical drive increases the
chance of a failure to 8.39%. Even more important to note is that the
risk compounds with each drive added: the probability that all of the
drives survive is the single-drive survival probability raised to the
nth power, so it falls geometrically as drives are added.
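A quick sanity check of those numbers, assuming independent drives
with exponentially distributed lifetimes (the standard, if
simplified, reliability model; the 8760 hours/year figure and the
helper function are mine, not from the thread):

    import math

    MTBF = 1_000_000        # hours, per drive (assumed, as above)
    HOURS_5Y = 5 * 8760     # 43,800 hours in five years

    def p_any_failure(n_drives, hours=HOURS_5Y):
        # Chance that at least one of n independent drives fails
        # within `hours`, given exponential lifetimes.
        return 1 - math.exp(-n_drives * hours / MTBF)

    print(p_any_failure(1))  # ~0.0429 -> 4.29%
    print(p_any_failure(2))  # ~0.0839 -> 8.39%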
True RAID works around this problem by keeping redundant copies of
the data (e.g. RAID 1) or by storing parity so that the data can be
recreated on the fly (RAID 5). For instance, a two-drive mirror with
the above per-drive MTBF has an array MTBF of 1,500,000 hours, which
works out to a 5-year failure rate of 2.88%. In practical terms,
though, most of us can count on a mirrored array never failing, as
long as we replace failed drives as soon as they fail.
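For the curious, the 1,500,000-hour figure is the classic result for
a two-unit redundant pair without repair, MTBF_pair = MTBF * (1 +
1/2). Treating the pair as a single exponential unit with that
combined MTBF is an approximation, but it reproduces the 2.88% figure
(a sketch, same assumptions as above):

    import math

    MTBF = 1_000_000                 # hours, per drive
    HOURS_5Y = 5 * 8760
    mtbf_mirror = MTBF * (1 + 1/2)   # 1,500,000 hours for the pair

    # Approximate 5-year failure probability of the mirror.
    print(1 - math.exp(-HOURS_5Y / mtbf_mirror))  # ~0.0288 -> 2.88%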
To put things in a slightly different perspective, a site with a 112
drive array will experience a drive failure about once a year. A site
with 448 drives will lose about four drives a year.
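Those rates fall straight out of the same per-drive MTBF; a one-liner
to check (again assuming 8760 hours per year):

    # Expected drive failures per year in a population of n drives,
    # each with a 1,000,000-hour MTBF.
    failures_per_year = lambda n: n * 8760 / 1_000_000
    print(failures_per_year(112))  # ~0.98, about one a year
    print(failures_per_year(448))  # ~3.92, about four a year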
Since RAID 0 is not a performance win for most desktop users, it
really comes down to a personal decision. Does the convenience of
having all of your drive space collected into a single volume offset
the somewhat increased chance, in the near term, of losing, and
having to restore, all of that data?
Phil
For a more detailed explanation of hard drive availability and MTBF
(Mean Time Between Failure) see this excellent white paper:
http://www.zzyzx.com/products/whitepapers/pdf/MTBF_and_availability_primer.pdf
http://tinyurl.com/7fp7q