by melor 3395 days ago
We host a number of our customers' database systems on us-east-1.

What worked well for us (https://aiven.io):

- Architecturally relying on only a few cloud provider services (we only need VMs, disks, and object storage)

- Upfront investment in being able to move services from one region to another without downtime

- Pre-existing tooling for easily (manually) reconfiguring backup destinations on the fly

- Not running everything on just AWS

What did not work so well:

- Backups should automatically reroute to a secondary backup site after N consecutive failures

- Alert spam, need more aggregation

- New failure mode: extremely slow EBS access. Some affected VMs were kinda working, but very slowly; we need to create a separate alert trigger for this
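
For the backup rerouting point, here is a rough sketch of what "fail over after N consecutive failures" could look like. This is not our actual tooling: `BackupRouter`, `max_failures`, and the destination names are all made up for illustration, and `send` stands in for whatever function really uploads a backup.

```python
from typing import Callable, Sequence


class BackupRouter:
    """Hypothetical helper: switch backup site after N consecutive failures."""

    def __init__(self, destinations: Sequence[str], max_failures: int = 3) -> None:
        if not destinations:
            raise ValueError("at least one destination is required")
        self.destinations = list(destinations)
        self.max_failures = max_failures  # the "N" from the bullet above
        self.active = 0                   # index of the destination in use
        self.failures = 0                 # consecutive failures on that destination

    @property
    def current(self) -> str:
        return self.destinations[self.active]

    def upload(self, send: Callable[[str], None]) -> bool:
        """Try one upload; return True on success, False on failure."""
        try:
            send(self.current)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Too many failures in a row: move to the next site.
                self.active = (self.active + 1) % len(self.destinations)
                self.failures = 0
            return False
        # Any success resets the streak, so only consecutive failures count.
        self.failures = 0
        return True
```

Usage would be something like `router = BackupRouter(["primary-site", "secondary-site"])` and then calling `router.upload(send)` from the backup loop. A real version would probably also want to retry the failed upload against the new site right away and to probe the primary before switching back, which this sketch leaves out.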