by valenterry 1883 days ago
Here's what I take away from this post:

> We built automation that can respond rapidly to load concentration and individual server failure. Because the consistency witness tracks minimal state and only in-memory, we are able to replace them quickly without waiting for lengthy state transfers.

So this means that the "system" that contains the witness(es) is a single point of truth and of failure (otherwise we would lose consistency again), but because it stores very little information, it can be kept in memory and replaced quickly when it fails.

Or in other words: identify the minimum amount of information that is strictly necessary to keep a system consistent, then put that part into its own in-memory system with quick failover, and that system then sets the bar for high availability.

Is that what they did?
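
To make my reading concrete, here is a rough Python sketch. It is only my interpretation of the quoted paragraph, not AWS's actual design: `Witness`, `record_write`, `is_fresh`, and `read` are hypothetical names, and I am assuming the witness tracks nothing but a version number per key. The point is that a freshly replaced witness can start empty, because "unknown" is simply treated as "not provably fresh" and the reader falls back to the authoritative store.

```python
class Witness:
    """Hypothetical in-memory witness: tracks only the latest version per key."""

    def __init__(self):
        self.latest = {}  # key -> latest committed version number

    def record_write(self, key, version):
        # Keep the highest version seen so far for this key.
        if version > self.latest.get(key, -1):
            self.latest[key] = version

    def is_fresh(self, key, cached_version):
        # Assumption: unknown keys are treated as "not provably fresh",
        # so a just-replaced (empty) witness is safe without a state transfer.
        known = self.latest.get(key)
        return known is not None and cached_version >= known


def read(key, cache, store, witness):
    """Serve from cache only if the witness confirms the cached version is current."""
    entry = cache.get(key)
    if entry is not None and witness.is_fresh(key, entry[0]):
        return entry[1]
    version, value = store[key]          # fall back to the authoritative store
    cache[key] = (version, value)
    witness.record_write(key, version)   # a replaced witness relearns as reads happen
    return value
```

Everything durable lives in `store`; the witness holds one integer per key, which is why losing it costs a slower read path rather than lost data.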

1 comment

They've basically bolted on causal consistency.

It's a great change.

Thank you for dropping the "causal consistency" term. I read the Wikipedia article.

So, causal consistency in this context means, first, that it does not matter whether a write to object A or a write to object B came first, because they are treated as "unrelated". This obviously allows for performance improvements over general "strong consistency" while still offering more guarantees than eventual consistency.

Second, for every "writer" (which would map 1:1 to the S3 objects) some metadata needs to be kept in memory for cases where access to an object _might_ lead to an outdated read, or where an "older" write could overwrite a "newer" write.
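
A tiny Python sketch of the "older write must not overwrite a newer write" part, as I understand it. The function `apply_write` and the per-object sequence numbers are my own assumptions for illustration; I don't know how S3 actually orders writes. Note that each key is checked only against its own metadata, so writes to different objects never constrain each other.

```python
def apply_write(store, meta, key, seq, value):
    """Apply a write only if its sequence number is newer than the recorded one.

    `meta` is the small per-object metadata (key -> last applied sequence number);
    the sequence numbers themselves are assumed to be assigned elsewhere.
    """
    if seq <= meta.get(key, -1):
        return False  # stale write: it would clobber a newer value, so drop it
    store[key] = value
    meta[key] = seq
    return True
```

For example, applying sequence 2 and then sequence 1 to the same key leaves the value from sequence 2 in place, while a write to some other key goes through regardless.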

All that being said, there is still a single instance that, if it goes down, makes the whole system unavailable for the S3 objects it manages until it is replaced. That means lower availability than a solution that uses eventual consistency (like Cassandra).

Would this be equivalent to using an in-memory SQL database to store the metadata for some other system (such as Cassandra) in order to strengthen its consistency guarantees, with quick failover but still a single point of failure, just more optimized/customized so that it can work with a huge system like S3?