by danielmewes
3997 days ago
That's a great question. In our current architecture, we perform failover by selecting a new primary from among the existing replicas of a given shard. We do not, however, add new servers to the replica list during a failover.
Therefore, if you have a table with two replicas per shard and one node fails, you would have only a single replica left. Currently we do not automatically fail over in that case: we always require a majority of a shard's replicas to be present before failing over that shard's primary. I believe we could relax this constraint in this specific case and allow failing over with only a single replica left, as long as we still have a majority for the table overall. (I'm going to double-check this with one of our core developers...)
Even if we did perform the failover, that would not restore write availability for the table, though, since there wouldn't be a majority to acknowledge a write (unless you set write_acks to "single").
It could still be useful for restoring availability of up-to-date read queries.
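To make the majority rule concrete, here is a minimal Python sketch. It is not RethinkDB's actual code: the function names are hypothetical, and it assumes "majority" means strictly more than half of a shard's replicas and that write_acks is either "majority" or "single". It models only the current per-shard rule, not the possible table-wide relaxation.

```python
def has_majority(available: int, total: int) -> bool:
    """True if strictly more than half of the replicas are reachable."""
    return 2 * available > total


def can_fail_over(available: int, total: int) -> bool:
    """Current rule (as described above): a shard's primary is only
    failed over while a majority of that shard's replicas is present."""
    return has_majority(available, total)


def can_acknowledge_write(available: int, total: int,
                          write_acks: str = "majority") -> bool:
    """Whether a write could be acknowledged under the given write_acks
    setting. "single" needs one reachable replica; "majority" (assumed
    here to be the other option) needs more than half."""
    if write_acks == "single":
        return available >= 1
    return has_majority(available, total)


# Two replicas per shard, one node down: no failover, no majority writes.
print(can_fail_over(1, 2))                    # False
print(can_acknowledge_write(1, 2))            # False
print(can_acknowledge_write(1, 2, "single"))  # True
# Three replicas per shard, one node down: a majority remains.
print(can_fail_over(2, 3))                    # True
```

With two replicas, losing one leaves exactly half, which is not a majority, so neither automatic failover nor majority-acknowledged writes are possible. With three replicas, the same single failure still leaves a majority.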
We might add support for automatic failover in this scenario in the future.