| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by rcxdude 897 days ago
	This would be true if you were to optimize for the very extreme case of running an error correction code over all of your data at once. This would give you the absolute best case tradeoff between redundancy and data storage, but would be completely intractable to actually compute, which is the point they are making. In practice error correction is used over smaller fragments of data, which is tractable but also doesn't give you as good a tradeoff (i.e. you need to spend more extra space to get the same level of redundancy). From what I understand one of the appeals of the codes mentioned in the article is that it might be tractable to use them in the manner described, in which case you might only need, say 3 extra servers out of thousands in order to lose any three, as opposed to 3 extra out of 20. But it seems like it is not likely. (In practice, I would say existing error correction codes already get you very close to the theoretical limit of this tradeoff already. The fact that these 'magical' codes don't work is not so much of a loss in comparison. While they would perhaps be better, they would not be drastically better).