by scurvy 1467 days ago
I've yet to make it out of the lab. The failover solutions seem to be A-B rather than A-B-DR. That is, you can have a primary and a replica (probably in the same DC/AZ), but you can't do a primary, a replica, and another replica in another datacenter that can function on its own (in a DR capacity).

Citus uses physical replication for worker and coordinator nodes, and things start getting complicated when you're trying to monitor the status of all the replicas. With vanilla PostgreSQL, you don't have this issue. I'm guessing that they solve this in Azure with block-device primitives at the storage level or something along those lines? It's probably not insurmountable to do it yourself, but we're not yet at the grip-n-rip enterprise offering.
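To make the monitoring problem concrete, here is a rough Python sketch of what "check every replica" could look like when each coordinator and worker is its own primary. It is not Citus tooling. The node names, the 16 MB lag threshold, and the `ReplicaStatus` and `find_problems` names are all made up for illustration. The only real PostgreSQL pieces are the `pg_stat_replication` view and the `pg_wal_lsn_diff` / `pg_current_wal_lsn` functions (PostgreSQL 10+), and the sketch assumes you run that query on each primary yourself and feed the rows in.

```python
from dataclasses import dataclass
from typing import Optional

# Query to run on every primary (coordinator and each worker).
# pg_stat_replication and pg_wal_lsn_diff are standard PostgreSQL;
# how you connect and collect the rows is left out of this sketch.
REPLICATION_SQL = """
SELECT application_name,
       state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication
"""


@dataclass
class ReplicaStatus:
    node: str                        # hypothetical label of the primary queried
    replica: str                     # application_name reported by the replica
    state: str                       # e.g. "streaming", "catchup"
    replay_lag_bytes: Optional[int]  # None if replay_lsn is not reported


def find_problems(statuses, expected_replicas, max_lag_bytes=16 * 1024 * 1024):
    """Return human-readable problems across all nodes.

    expected_replicas maps each primary's label to how many replicas
    it should have (an assumption of this sketch, not a Citus setting).
    """
    problems = []
    seen = {node: 0 for node in expected_replicas}
    for s in statuses:
        seen[s.node] = seen.get(s.node, 0) + 1
        if s.state != "streaming":
            problems.append(f"{s.node}: replica {s.replica} is in state {s.state}")
        elif s.replay_lag_bytes is None or s.replay_lag_bytes > max_lag_bytes:
            problems.append(
                f"{s.node}: replica {s.replica} lag is {s.replay_lag_bytes} bytes"
            )
    # A replica that is down does not show up in pg_stat_replication at all,
    # so missing rows have to be detected by counting.
    for node, want in expected_replicas.items():
        if seen[node] < want:
            problems.append(f"{node}: {seen[node]} of {want} replicas connected")
    return problems


if __name__ == "__main__":
    rows = [
        ReplicaStatus("coordinator", "coord-replica", "streaming", 0),
        ReplicaStatus("worker-1", "w1-replica", "catchup", 500),
    ]
    expected = {"coordinator": 1, "worker-1": 1, "worker-2": 1}
    for line in find_problems(rows, expected):
        print(line)
```

The per-node expected count is there because a dead replica simply vanishes from the view, which is part of why the bookkeeping multiplies with every worker you add.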

1 comment

Making HA easier while self-hosting is definitely high on our list of future improvements. I wouldn't be surprised if there are some announcements about this in the coming year. One of the main issues (imo) is that configuring HA for PG isn't easy, even with regular PG. It becomes even more complicated with Citus, because you're effectively running multiple PG servers, and you need to configure HA for every one of them.

To help me understand the multi-region DR capability you would want: do you want to be able to write to the DR region all the time, or do you only want to be able to switch over to the DR region if the main DC is down?