by doug_neumann 1972 days ago
Just to be clear, our solution is about failing over your entire environment to another region. So in our case, A, B, and C would all come up together from snapshots in that other region. But there are still consistency concerns, as you mentioned.

We do everything we can to snapshot servers in quick succession (as much as AWS allows), but the snapshots won't be fully consistent across nodes. We've found, though, that clustered systems like this have built-in capabilities to deal with these inconsistencies. It's kind of like losing power to all 3 systems and then bringing them back up: they might not all be at the exact same point in time, and the application would need to sort that out.
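
To make the "quick succession" point concrete, here is a rough sketch of firing snapshot requests for all nodes at nearly the same moment and measuring how far apart they land. This is only an illustration, not how the product actually does it: `take_snapshot` is a hypothetical stand-in for a real snapshot API call, and `snapshot_all` and the node names are made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def take_snapshot(node: str) -> float:
    """Hypothetical stand-in for a real snapshot API call.

    Returns the (monotonic) time at which the request for `node` was issued.
    """
    return time.monotonic()


def snapshot_all(nodes, snapshot_fn=take_snapshot):
    """Request a snapshot of every node concurrently.

    Returns a (timestamps, skew) pair, where `skew` is the gap in seconds
    between the earliest and the latest snapshot request.
    """
    if not nodes:
        return {}, 0.0
    # One worker per node so no request waits behind another.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        stamps = dict(zip(nodes, pool.map(snapshot_fn, nodes)))
    skew = max(stamps.values()) - min(stamps.values())
    return stamps, skew


stamps, skew = snapshot_all(["A", "B", "C"])
print(f"snapshot skew across nodes: {skew * 1000:.3f} ms")
```

The skew will be small but generally not zero, which is the gap the clustered application has to reconcile when the nodes come back up.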

If you tell me what application this is (is it a database platform?), I can do some quick research for you.

How do you ensure that your services will be up in the event of an AWS outage?
Our service runs active in 3 regions at once. We can survive a 2-region outage and still operate successfully.
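
As a toy illustration of the "survive a 2-region outage" claim, the check below treats the service as available as long as at least one of three regions is healthy. The region names and the `healthy_regions` / `can_serve` helpers are hypothetical, and a real deployment would also need to consider capacity and data replication, which this sketch ignores.

```python
# Hypothetical region names, for illustration only.
REGIONS = ["region-1", "region-2", "region-3"]


def healthy_regions(health: dict) -> list:
    """Return the regions currently reported healthy (missing means down)."""
    return [r for r in REGIONS if health.get(r, False)]


def can_serve(health: dict) -> bool:
    """The service stays up as long as at least one region is healthy."""
    return len(healthy_regions(health)) >= 1


# Two of three regions down: still serving from the remaining one.
two_down = {"region-1": False, "region-2": False, "region-3": True}
print(healthy_regions(two_down), can_serve(two_down))
```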