> Each client in the Jepsen test harness is (independently) scheduling n writes per second.

Umm... that's kind of an important detail. I'm betting that your mechanism for achieving this effectively synchronizes your clients to act in concert, or at least as close to "in concert" as your clock can measure. That explains the probability.

> There's an interesting probability anecdote called the Birthday Paradox

Yeah, I thought of the Birthday Paradox with this problem, but this is a different variant. The chance that two people in the room share a birthday and no one else in the room has a birthday later in the year follows a different distribution.

Try writing a program that spawns 5000 threads and has each one get the current time in microseconds and write it to a file. You won't have any collisions unless you do some kind of precise coordination between them. In fact, you likely only have a shot at getting the same timestamp if you call from different threads, because just executing the instructions to read the current time takes long enough that two calls in a row will get different values.

> TL;DR: microsecond timestamps do not provide sufficient entropy for uniqueness constraints over common workloads.

See, that's the part I have a problem with, because I've had quite the opposite experience (without even having Cassandra involved).
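For anyone who wants to try the thread experiment described above, here is a rough Python sketch. The function names and the `timestamps.txt` output file are made up for illustration. Two caveats: CPython's GIL largely serializes these threads, and the effective clock resolution depends on the OS, so the count you see on your machine may differ from what a C or Java version would report. Treat it as a starting point, not a benchmark.

```python
import threading
import time
from collections import Counter


def collect_timestamps(n_threads: int) -> list[int]:
    """Start n_threads uncoordinated threads; each reads the wall clock once."""
    stamps = [0] * n_threads

    def worker(i: int) -> None:
        # time.time_ns() returns integer nanoseconds; truncate to microseconds.
        stamps[i] = time.time_ns() // 1_000

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()  # no barrier or other coordination, on purpose
    for t in threads:
        t.join()
    return stamps


def count_collisions(stamps: list[int]) -> int:
    """Count readings that repeat a value already seen in the list."""
    return sum(c - 1 for c in Counter(stamps).values() if c > 1)


if __name__ == "__main__":
    stamps = collect_timestamps(5000)
    # "timestamps.txt" is a hypothetical output path for this sketch.
    with open("timestamps.txt", "w") as f:
        f.write("\n".join(map(str, stamps)))
    print(f"{count_collisions(stamps)} duplicate readings out of {len(stamps)}")
```

To see how much deliberate coordination changes things, one option is to have every worker wait on a shared `threading.Barrier` before reading the clock and compare the two counts.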