by spolsky 4975 days ago
Geo-redundancy is a tough engineering problem. We're building a long-term solution, but it's a lot of work and it's not in place today.

If this is the kind of problem that excites you, we're hiring :-)

2 comments

It's particularly hard to shoehorn in after the fact. Certain development models (e.g. replicated state machines) make it much easier... mix in some magic Paxos dust and it can handle machine failures as well.

Sadly, the better implementations I've used myself (or have heard about) are not publicly available. The closest thing in semi-widespread use seems to be ZooKeeper, but it's more like Oracle when you really wanted SQLite (standalone service vs. library).
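
For anyone unfamiliar with the replicated state machine model mentioned above, here is a rough sketch of the core idea: every replica applies the same ordered log of deterministic commands, so replicas that have seen the same log end up in the same state. The `KVStateMachine` class, the command tuples and the `east`/`west` names are all made up for illustration, and the log is simply hard-coded; in a real system, agreeing on that log across sites is the part a consensus protocol such as Paxos would handle.

```python
from dataclasses import dataclass, field


@dataclass
class KVStateMachine:
    """Hypothetical deterministic key-value state machine."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # index of the next log entry to apply

    def apply(self, command: tuple) -> None:
        # Commands are deterministic, so the result depends only on log order.
        op, key, *rest = command
        if op == "set":
            self.data[key] = rest[0]
        elif op == "delete":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op}")
        self.applied += 1

    def catch_up(self, log: list) -> None:
        # Apply only the entries this replica has not applied yet.
        for command in log[self.applied:]:
            self.apply(command)


# Assumption: the order of this log has already been agreed on by all sites.
log = [("set", "a", 1), ("set", "b", 2), ("delete", "a"), ("set", "b", 3)]

east = KVStateMachine()
west = KVStateMachine()

east.catch_up(log)       # east sees the whole log at once
west.catch_up(log[:2])   # west falls behind (e.g. it was unreachable)
west.catch_up(log)       # later it replays the entries it missed
```

A replica that was down only needs to replay the entries it missed, which is why this model tolerates machine failures fairly naturally.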

How tough it is depends on how you engineer your geo-redundancy. I've been doing it since 1998, and a simple active-passive solution is not as hard as some would believe, but it does cost money. Active-active is much more challenging, and multi-master is obviously the ideal and the most difficult to engineer given geo-network latency. I haven't seen a single solution that couldn't be staged to at least provide active-passive DR capability for a pretty reasonable price. You can even do it in EC2, Rackspace Cloud or Joyent if you're feeling "cloudy".
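
To make the active-passive case concrete, here is a minimal sketch of the failover decision only, assuming the active site sends periodic heartbeats and the passive site is promoted after a timeout. The `FailoverMonitor` class, the site names and the 30-second timeout are hypothetical, and timestamps are passed in by hand to keep the example deterministic. It ignores data replication, DNS or routing changes, and split-brain protection, which is where most of the real cost sits.

```python
class FailoverMonitor:
    """Hypothetical monitor that promotes the passive site after missed heartbeats."""

    def __init__(self, active: str, passive: str, timeout_s: float):
        self.active = active
        self.passive = passive
        self.timeout_s = timeout_s
        self.last_heartbeat = 0.0

    def heartbeat(self, now: float) -> None:
        # Called whenever the active site reports in.
        self.last_heartbeat = now

    def check(self, now: float) -> str:
        # Swap roles if the active site has been silent longer than the timeout.
        if now - self.last_heartbeat > self.timeout_s:
            self.active, self.passive = self.passive, self.active
            self.last_heartbeat = now  # give the new active site a fresh window
        return self.active


monitor = FailoverMonitor(active="us-east", passive="us-west", timeout_s=30.0)
monitor.heartbeat(now=10.0)
first = monitor.check(now=25.0)    # 15 s of silence: within the timeout
second = monitor.check(now=45.0)   # 35 s of silence: roles are swapped
```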