[mdlug] ECC RAM failure data - jre
Aaron Kulkis
akulkis00 at gmail.com
Thu Feb 26 06:53:13 EST 2009
john_re wrote:
> Do you use ECC RAM? Do you have any data about failure rates?
>
> I'm evaluating this for a system with 8GB DRAM, &
> http://en.wikipedia.org/wiki/Dynamic_random_access_memory#Errors_and_error_correction
> says
> "Tests [ecc] give widely varying error rates, but about 10^-12 upset/bit-hr
> is typical, roughly one bit error, per month, per gigabyte of memory.
>
Wow... that seems extremely high to me... for circuitry that's
supposed to be driving transistors back and forth between full
cut-off and such high saturation that the transistor behaves
like a diode... I find it amazing that an error rate this high
is tolerated within any segment of the industry.
> In most computers used for serious scientific or financial computing and
> as servers, ECC is the rule rather than the exception, as can be seen by
> examining manufacturers' specifications."
Absolutely. A single bit error can and will produce wrong
results, any of which is going to cost somebody $$$$.
>
>
> So, for that data 8GB DRAM is about 8 errors per month, ie about
> one per 3-4 days.
>
> What rates do you have?
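For what it's worth, the arithmetic in the quoted estimate can be checked with a few lines of Python. This is only a rough sketch: it takes the quoted "one bit error, per month, per gigabyte" figure at face value, and the 30-day month and the function names are my own assumptions for illustration, not measured data.

```python
# Back-of-envelope check of the quoted estimate.
# Assumption: the quoted Wikipedia figure of roughly one bit error
# per month per gigabyte is taken at face value.
ERRORS_PER_GB_MONTH = 1.0
# Assumption: a 30-day month, for rough arithmetic only.
DAYS_PER_MONTH = 30.0


def expected_errors_per_month(gigabytes, rate=ERRORS_PER_GB_MONTH):
    """Expected bit errors per month for the given amount of RAM."""
    return gigabytes * rate


def days_between_errors(gigabytes, rate=ERRORS_PER_GB_MONTH,
                        days_per_month=DAYS_PER_MONTH):
    """Average number of days between bit errors."""
    return days_per_month / expected_errors_per_month(gigabytes, rate)


if __name__ == "__main__":
    print(expected_errors_per_month(8))  # 8.0 errors per month
    print(days_between_errors(8))        # 3.75 days between errors
```

That agrees with the "about 8 errors per month, about one per 3-4 days" figure for 8GB, given those assumptions.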