Hacker News
by qwertyuiop924 3565 days ago
And people with 64TB, 256 core machines don't have RAID arrays attached to their machine for this exact reason?

If it's "machines" plural, then you can do replication between the two. There's your failover in case of complete failure.

> If it's "machines" plural, then you can do replication between the two.

This is the start of a scaling path that winds down Distributed Systems Avenue, and eventually leads to a place called Hadoop.

(Replication and consensus are remarkably difficult problems that Hadoop solves.)

Fair 'nuff, but if you don't distribute compute, and you store the dataset locally on all the systems (not necessarily the results of mutations, just the datasets and the work that's being done), you'll still possibly reap massive perf gains over Hadoop in certain contexts.

> you'll still possibly reap massive perf gains over Hadoop in certain contexts.

Certainly, and unfortunately, the exact point at which Hadoop becomes the better option over big iron is an ongoing debate and a shifting target. But there's no doubt that such a point actually exists.

...I'm not sure if it does. Bain's Law still stands.

But if it does, then it's a pretty big chunk of data, and a very fast network.

Couldn't find anything for Bain's Law, do you have a reference I could follow?

I believe it's a reference to this old (but insightful) comment: https://news.ycombinator.com/item?id=8902739

As such, it is actually "Bane's Rule" which states, "you don't understand a distributed computing problem until you can get it to fit on a single machine first."

(Thanks to nekopa, who also referenced it further down in this thread.)