by zzzcpan
2980 days ago
Well, you can't really solve the high availability problem by testing how long replication and restoration take. It's a lot more nuanced than that and requires real expertise in distributed systems. Usually it means getting rid of PostgreSQL/MySQL completely in favor of distributed solutions, as that's cheaper and a better investment in the infrastructure than attempting to build high availability on top of them.
I agree with you in principle, but for most systems it's total overkill. It wouldn't be overkill if distributed solutions were easy to set up and free of tradeoffs, but we're nowhere near that point.
In most cases, then, restoration time is the biggest barrier to getting "high-enough" availability without re-engineering everything around a totally different system. Often you can keep it from becoming an issue by siloing functionality into separate databases (offloading logs and analytics, for example), or by buying faster SSDs for your DB servers. There are many approaches depending on the size of your dataset, and most people never outgrow those options.
To put it this way: GitLab.com's database is small enough that fitting it in RAM on a commodity server is easily doable. They'd still need snapshots on disk, but at that point beating the restore speeds they're reporting would be trivial.
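For a rough feel of how dataset size and storage speed trade off, here is a back-of-the-envelope sketch. It assumes restore time is dominated by sequentially copying the snapshot, so it ignores WAL replay, index rebuilds, and network transfer. The function name, the dataset size, and the throughput figures are all made-up placeholders for illustration, not GitLab's numbers or measurements of any real hardware.

```python
def restore_seconds(dataset_gb: float, throughput_mb_s: float) -> float:
    """Lower bound on the time to copy a snapshot back into place.

    Assumes a purely sequential copy at a constant throughput;
    real restores will take longer than this.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    return dataset_gb * 1000 / throughput_mb_s  # decimal GB -> MB


# Hypothetical dataset size and storage throughputs, for illustration only.
DATASET_GB = 300
STORAGE_MB_S = [("spinning disk", 150), ("SATA SSD", 500), ("NVMe SSD", 2000)]

for label, mb_s in STORAGE_MB_S:
    minutes = restore_seconds(DATASET_GB, mb_s) / 60
    print(f"{label:>13}: {minutes:6.1f} min")
```

Under this model the estimate scales linearly both ways: moving logs and analytics out so the main database is half the size halves the restore time, and so does doubling the storage throughput.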