by SpikeGronim
4910 days ago
If you follow the advice in this paper [1], you will be measuring media errors in your drives. That means re-reading all data every N days, even archived data. Without periodically re-reading and validating (checksumming) the data, you can't tell whether it has rotted in place. Since the distribution of errors over drives is heavily skewed (a few drives account for most of the errors), you should then proactively remove the worst drives from your system. That avoids the accumulation of errors and the sudden multiple-drive failure described here. Durability is like a diamond: it is forever.

[1] http://research.google.com/pubs/pub32774.html
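To make the scrubbing idea concrete, here is a rough sketch of what one pass could look like. It is not from the paper. The manifest format (drive id, file path, expected SHA-256) and the function names `scrub` and `worst_drives` are my own assumptions for illustration. A real system would run this on a schedule, every N days, and work at the block level rather than on files.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20):
    """Re-read the whole file from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scrub(manifest):
    """Verify every entry and count errors per drive.

    `manifest` is a hypothetical iterable of
    (drive_id, path, expected_sha256) tuples.
    """
    errors = Counter()
    for drive_id, path, expected in manifest:
        try:
            ok = sha256_of(path) == expected
        except OSError:
            # An unreadable or missing file counts as an error too.
            ok = False
        if not ok:
            errors[drive_id] += 1
    return errors


def worst_drives(errors, top_n=1):
    """Return the ids of the drives with the most errors, worst first."""
    return [drive_id for drive_id, _ in errors.most_common(top_n)]
```

You would call `scrub(manifest)` on each pass and feed the result to `worst_drives` to pick replacement candidates. How many drives to pull per pass (`top_n`) is a policy decision this sketch does not try to answer.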