by eloff 2979 days ago
I'm looking around for more details on these "regional" disks that replicate between two zones at the block level. Is that just a fancy term for OS-level mirrored disks built on the cloud persistent disks?

There's a previous blog post[1] about HA replication. It uses block-level replication managed by PD infrastructure.

[1] https://cloudplatform.googleblog.com/2017/11/Cloud-SQL-for-P...

Block-device-based replication for Postgres seems a bit unconventional, given that Postgres has native synchronous replication support with WAL streaming.

Intuition tells me that you might get better performance if you let the DB itself do the replication, but I can't really justify that without a real review of what happens.

The postgres docs (https://www.postgresql.org/docs/10/static/different-replicat...) say that the WAL solution has no "Master server overhead" in contrast to the File System Replication solution, but it's not explained and I'm not sure what is meant by that.

I guess with a block-device-based solution, recovery takes longer, because failover means you have to actually mount the block device (as no two machines can mount it read-write at the same time) and then start the DB (or, in a more basic implementation, just boot the entire second machine as part of failover), while with WAL streaming both Postgres instances would already be running. So failover would be faster with WAL streaming?
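To make that comparison concrete, here is a rough sketch that only lists the steps each failover path would need under the assumptions above. The step names, `failover_time`, and the `durations` mapping are all hypothetical; the durations would have to be measured, and nothing here reflects how Cloud SQL actually performs failover.

```python
# Steps as guessed at in the comment above; assumptions, not documented behaviour.
BLOCK_DEVICE_FAILOVER = [
    "mount the block device",   # only one machine may mount it read-write
    "start postgres",           # the standby instance is not running beforehand
]
WAL_STREAMING_FAILOVER = [
    "promote the running standby",  # the standby instance is already up
]


def failover_time(steps, durations):
    """Return the total time for a failover whose steps run one after another.

    `durations` maps each step name to a measured duration in seconds.
    """
    return sum(durations[step] for step in steps)
```

With measured durations plugged in, comparing `failover_time(BLOCK_DEVICE_FAILOVER, d)` against `failover_time(WAL_STREAMING_FAILOVER, d)` would answer the question for a given setup.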

It would be great if somebody from GCP could elaborate on what the tradeoffs are here, how long failover takes, and whether we can expect similar performance and behaviour as with WAL shipping.

Amazon's Aurora Postgres database does a similar thing: your master in one zone replicates to a disk that is in all the other zones. Unlike a normal Postgres RDS instance, it also auto-scales storage to what you use.

Amazon claims better scaling than ordinary Postgres for this.

Just speculating, but it’s possible the block level is faster because it’s replicated over a dedicated and optimized SAN rather than (potentially) contending with normal network traffic. I assume the database state would only be crash-consistent, though.

I read that, but it doesn't answer the question. What are these regional disks, and can they be used directly?

A regional disk is a logical disk that is synchronously replicated at the block level across exactly two zones within the same region [1]. Since the disks are always identical, with no replication lag, the HA control plane can seamlessly fail over the whole database to a new master that plugs into the same disk. It's all in the article.
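As a toy illustration of why synchronous block-level replication leaves nothing to catch up on at failover, here is a minimal in-memory sketch. `ZoneReplica` and `RegionalDisk` are hypothetical names invented for this example, and rejecting writes while a zone is down is a simplification of mine; this is not how Persistent Disk is implemented.

```python
class ZoneReplica:
    """One zone's copy of the disk: a map from block number to bytes."""

    def __init__(self, zone):
        self.zone = zone
        self.blocks = {}
        self.available = True


class RegionalDisk:
    """Toy logical disk replicated synchronously across exactly two zones."""

    def __init__(self, zone_a, zone_b):
        self.replicas = [ZoneReplica(zone_a), ZoneReplica(zone_b)]

    def write(self, block_no, data):
        # Simplification: refuse the write unless both zones can take it,
        # so the two copies can never diverge.
        for replica in self.replicas:
            if not replica.available:
                raise IOError(f"zone {replica.zone} is unavailable")
        # Synchronous replication: acknowledge only after every replica
        # has stored the block.
        for replica in self.replicas:
            replica.blocks[block_no] = data
        return True

    def read_from(self, zone, block_no):
        """Read a block from the copy held in the given zone."""
        for replica in self.replicas:
            if replica.zone == zone:
                return replica.blocks.get(block_no)
        raise KeyError(zone)


disk = RegionalDisk("zone-a", "zone-b")
disk.write(0, b"committed page")
# After a failover, a new master in the other zone reads the same block.
print(disk.read_from("zone-b", 0))
```

Because a write is acknowledged only once both copies hold it, whichever zone the new master lands in sees every acknowledged block, which is the "no replication lag" property described above.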

Regional disks aren't publicly available yet, but they are in alpha [2]. Like normal persistent disks, everything is backed by Google's internal Colossus system [3].

[1] https://cloudplatform.googleblog.com/2017/11/Cloud-SQL-for-P...

[2] https://cloud.google.com/sdk/gcloud/reference/alpha/compute/...

[3] https://cloud.google.com/files/storage_architecture_and_chal...