| HN Mirror


by doug_neumann 1972 days ago

Just to be clear, our solution fails over your entire environment to another region. So in our case, A, B, and C would all come up together from snapshots in that other region. But there are still consistency concerns, as you mentioned.

We do everything we can to snapshot servers in quick succession (as much as AWS allows), but the snapshots won't be fully consistent across nodes. We've found, though, that clustered systems like this have built-in capabilities to deal with these inconsistencies. It's similar to losing power to all 3 systems and then bringing them back up: they might not all be at the exact same point in time, and the application would need to sort that out.

If you tell me what application this is (is it a database platform?), I can do some quick research for you.
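
To illustrate the "quick succession" point above, here is a minimal sketch, not our actual implementation. It assumes a hypothetical `take_snapshot(node)` callable (in practice this would wrap whatever snapshot call your cloud provider offers) and fires it for all nodes concurrently, then reports how far apart the start times were. The names `snapshot_all` and `fake_snapshot` are made up for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def snapshot_all(nodes, take_snapshot):
    """Trigger take_snapshot(node) for every node at roughly the same time.

    Returns (results, skew_seconds), where results maps each node to
    (start_time, snapshot_id) and skew_seconds is the gap between the
    earliest and latest snapshot start.
    """
    def run(node):
        started = time.monotonic()
        return node, started, take_snapshot(node)

    # One worker per node so no snapshot waits behind another.
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        rows = list(pool.map(run, nodes))

    results = {node: (started, snap_id) for node, started, snap_id in rows}
    starts = [started for _, started, _ in rows]
    skew = max(starts) - min(starts) if starts else 0.0
    return results, skew


def fake_snapshot(node):
    # Hypothetical stand-in for a real provider call; it only sleeps
    # briefly and returns a made-up snapshot id.
    time.sleep(0.05)
    return f"snap-{node}"


if __name__ == "__main__":
    results, skew = snapshot_all(["A", "B", "C"], fake_snapshot)
    print(sorted(snap_id for _, snap_id in results.values()))
    print(f"start skew: {skew:.4f}s")
```

Even with concurrent triggering the skew is never exactly zero, which is why the clustered application still has to reconcile state when the nodes come back up.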

1 comments

treis 1971 days ago

How do you ensure that your services will be up in the event of an AWS outage?


doug_neumann 1971 days ago

Our service runs multi-region active across 3 regions. We can survive a 2-region outage and still operate successfully.
