by mh- 1124 days ago
Events that are enriched off of other data in the same DB.

The system design predates me, but it is solid (albeit difficult to operate at this scale - usual stuff like schema changes, replication bootstrapping).

1 comment

What is your mean time to recovery like when you have to restart your system?
We keep hot masters (typical MySQL master-master-slave(s) setup) on standby.

Bootstrapping one from scratch (like if we need to stand up a new replica)? We restore a disk image on GCP from point-in-time snapshots, then let it catch up on replication. So how long that takes depends on how far behind it is when it comes online.

TLDR: a few hours in the worst case. ~Zero downtime in practice.
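
For a rough sense of why the catch-up time depends on how far behind the restored replica is, here is a back-of-the-envelope sketch. It is not the setup described above, just arithmetic under stated assumptions: the function name, the parameters, and all numbers are hypothetical, and it assumes the replica replays changes at a constant multiple of real time while the primary keeps writing.

```python
def estimate_bootstrap_hours(snapshot_age_h: float,
                             restore_h: float,
                             replay_speed: float) -> float:
    """Rough estimate of hours until a restored replica is caught up.

    snapshot_age_h: how old the snapshot is when the restore starts (hypothetical input).
    restore_h:      how long restoring the disk image takes (hypothetical input).
    replay_speed:   hours of primary changes the replica applies per wall-clock
                    hour (assumed constant; 1.0 means it never gains ground).
    """
    # When the replica comes online it is behind by the snapshot's age
    # plus however long the restore itself took.
    lag_h = snapshot_age_h + restore_h

    # The primary keeps producing 1 hour of changes per hour, so the replica
    # only closes the gap at (replay_speed - 1) hours per hour.
    if replay_speed <= 1.0:
        return float("inf")
    catch_up_h = lag_h / (replay_speed - 1.0)

    return restore_h + catch_up_h


if __name__ == "__main__":
    # Illustrative numbers only: 6h-old snapshot, 1h restore, 4x replay speed.
    print(round(estimate_bootstrap_hours(6.0, 1.0, 4.0), 2))
```

The point of the sketch is that a fresher snapshot or a faster replay rate shrinks the total, and a replica that cannot replay faster than the primary writes never catches up.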