Hacker News
by rkarthik007 1590 days ago
Spanner gets both correctness and low latency from tight clock synchronization. It does a COMMIT_WAIT, meaning it waits for the max clock skew to pass. But the max clock skew without TrueTime will be around 500ms, which is impractical to wait out. So it's not just that 7ms is lower latency than 500ms; waiting 500ms is impractical, not merely slower. And any other technique to drop latency (without HLC) will violate correctness.

Disclosure: co-founder/CTO of YugabyteDB project
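
To make the commit-wait point above concrete, here is a minimal sketch of the idea as described in the comment: pick a commit timestamp, then hold the commit until the max clock skew has passed. The function name `commit_wait` and its `clock`/`sleep` parameters are hypothetical, chosen for illustration; this is not Spanner's or YugabyteDB's actual code.

```python
import time
from typing import Callable


def commit_wait(
    max_skew_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Pick a commit timestamp, then block until max_skew_s has elapsed.

    Illustrative assumption: once the wait is over, any node whose clock is
    within max_skew_s of ours will read a time later than the commit timestamp.
    """
    commit_ts = clock()
    deadline = commit_ts + max_skew_s
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            break
        # Sleep may return early, so loop until the deadline has really passed.
        sleep(remaining)
    return commit_ts
```

Calling `commit_wait(0.007)` adds roughly 7ms to every commit, while `commit_wait(0.5)` adds roughly half a second to every commit, which is the impractical case the comment refers to.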

3 comments

fwiw i think you’re both saying something very similar. TrueTime has to be correct about max skew in order not to break the assumption spanner is built upon. you could also use a looser time bound and still have correctness, but end up with a database that is useless to most/all customers
Interesting. How does this differ from HLC in practice - in the article you say you use a 500ms max skew for HLC?
The difference is in how the regular path (exercised most of the time) behaves versus the edge case where there is a conflict (typically in larger clusters with a pathological access pattern). With TrueTime, the latency is always 7ms, and the pathological cases cause no issues. With HLC, the latency is lower in most cases but high in the pathological cases (where it can be 500ms); these should not matter for many use cases.
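
For readers who haven't seen HLC, a rough sketch of the textbook hybrid logical clock follows. It shows why the common path needs no waiting: a timestamp is a (physical, logical) pair that is bumped on local events and merged on message receipt. The class and method names are made up for illustration, and this is the generic algorithm, not necessarily how YugabyteDB implements it; the conflict handling and the 500ms skew bound discussed above are not modeled here.

```python
import time
from typing import Callable, Tuple

Timestamp = Tuple[int, int]  # (physical component, logical counter)


class HybridLogicalClock:
    def __init__(
        self,
        physical_clock: Callable[[], int] = lambda: time.time_ns() // 1000,
    ) -> None:
        self._clock = physical_clock  # microseconds by default (an assumption)
        self._l = 0  # largest physical time seen so far
        self._c = 0  # logical counter, breaks ties within the same _l

    def now(self) -> Timestamp:
        """Timestamp a local or send event; never goes backwards."""
        pt = self._clock()
        if pt > self._l:
            self._l, self._c = pt, 0
        else:
            self._c += 1
        return (self._l, self._c)

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp received from another node."""
        rl, rc = remote
        pt = self._clock()
        new_l = max(self._l, rl, pt)
        if new_l == self._l and new_l == rl:
            new_c = max(self._c, rc) + 1
        elif new_l == self._l:
            new_c = self._c + 1
        elif new_l == rl:
            new_c = rc + 1
        else:
            new_c = 0
        self._l, self._c = new_l, new_c
        return (new_l, new_c)
```

Timestamps compare as ordinary tuples, so a node that receives a message from a peer with a faster clock simply jumps forward instead of waiting out the skew.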
What do you think of RIFL/TAPIR?
Thanks for pointing this out, was not aware of TAPIR. Will take a look, seems pretty interesting.