by jlokier
2418 days ago
If I understood correctly, the extra round trip on the reader only occurs with left-over intents, which are the product of an earlier failure. So:
- Writes are faster due to fewer round trips in normal running.
- Reads are the same speed in normal running.
- After coordinator failure events, the new coordinator starts to clean up left-over intents asynchronously (nothing specific is waiting for it to finish).
- Reads are slowed by extra round trips in the short time after a failure event, until the left-over intents are cleaned up. But only in that time period, and only for those ranges touched by transactions during the failure event.
- The cost of extra round trips done by the reader just during recovery is much less important than the round trips that happened on every cross-region write with the older algorithm.
- But if you care about read latency being consistent all the time, including during recovery from coordinator failure events, maybe you need a more sophisticated high-availability configuration for the logical coordinator.

Left-over intents are also visible while a txn is still ongoing, before those intents are resolved. But there's no added latency in the read path; I've commented elsewhere in the thread to explain how.
> After coordinator failure events, the new coordinator starts to clean up left-over intents asynchronously (nothing specific is waiting for it to finish).
Cleanup is done by readers on demand; there's no separate global coordinator scanning the keyspace and resolving old write intents.
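
To make the on-demand cleanup concrete, here is a rough toy sketch in Python. Everything in it (`Store`, `Intent`, `lookup_txn_status`, the `round_trips` counter) is a hypothetical name invented for illustration, not any real database's API, and it assumes the owning txn's outcome is already decided by the time the reader shows up. The reader that trips over a left-over intent pays one extra lookup, resolves the intent, and later reads of that key are back to a single round trip.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Intent:
    """A provisional write left behind by a transaction (hypothetical model)."""
    txn_id: str
    value: str


@dataclass
class Store:
    """Toy single-node stand-in for a range of keys; not a real API."""
    committed: Dict[str, str] = field(default_factory=dict)    # resolved values
    intents: Dict[str, Intent] = field(default_factory=dict)   # unresolved intents
    txn_status: Dict[str, str] = field(default_factory=dict)   # "committed" or "aborted"
    round_trips: int = 0                                        # simulated network hops

    def lookup_txn_status(self, txn_id: str) -> str:
        # The extra round trip: ask the transaction record for the outcome.
        self.round_trips += 1
        return self.txn_status[txn_id]

    def read(self, key: str) -> Optional[str]:
        self.round_trips += 1  # the ordinary read round trip
        intent = self.intents.get(key)
        if intent is not None:
            # Assumption: the owning txn has already committed or aborted.
            status = self.lookup_txn_status(intent.txn_id)
            # The reader resolves the intent itself, on demand.
            del self.intents[key]
            if status == "committed":
                self.committed[key] = intent.value
        return self.committed.get(key)


store = Store(
    committed={"a": "old"},
    intents={"a": Intent(txn_id="t1", value="new")},
    txn_status={"t1": "committed"},
)
first = store.read("a")    # hits the left-over intent and resolves it
second = store.read("a")   # intent is gone, so no extra lookup this time
print(first, second, store.round_trips)
```

Only keys that actually carry a left-over intent cost the extra hop, and only for the first reader to touch them, which matches the point above that nothing scans the keyspace in the background.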