Hacker News
by Zambyte 304 days ago
> isn't the time making it more collision resistant?

That seems to depend heavily on the pattern in which your application generates UUIDs. If you generate them at a steady rate over time, sure. If you generate a lot of them in bursts, collisions seem far more likely.


You have to generate on the order of 2^37 (137,438,953,472) UUIDv7s in the exact same millisecond to have a roughly 50% chance of collision.
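
A minimal sketch of the birthday-bound arithmetic behind that figure, assuming all 74 non-timestamp, non-version/variant bits of a UUIDv7 are random (an implementation may spend some of them on a counter instead). The name `collision_probability` is illustrative. Under this approximation, 2^37 IDs in one millisecond gives roughly a 39% chance of at least one collision, and the 50% point sits about 18% higher.

```python
import math

# Assumption: rand_a (12 bits) and rand_b (62 bits) are both fully random.
RANDOM_BITS_V7 = 74


def collision_probability(n: int, bits: int) -> float:
    """Birthday approximation of P(at least one collision) among
    n uniformly random values of `bits` bits each."""
    # 1 - exp(-n(n-1) / 2^(bits+1)); expm1 keeps precision for tiny results.
    return -math.expm1(-n * (n - 1) / (2 * 2**bits))


# Probability when 2^37 IDs share a single millisecond timestamp.
p_at_2_37 = collision_probability(2**37, RANDOM_BITS_V7)

# Number of IDs in one millisecond needed to reach a 50% chance.
n_for_half = math.sqrt(2 * math.log(2) * 2**RANDOM_BITS_V7)

print(f"P(collision) at 2^37 IDs: {p_at_2_37:.3f}")
print(f"IDs for a 50% chance:     {n_for_half:.3e}")
```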

(Not disagreeing with you, just adding perspective.)

The math is interesting here as you'll probably want to run your system for several years, not just a single millisecond. So it's a repeated trials problem. I spent some time trying to figure out the ID generation rate that would be a "break even point" between UUIDv4 vs UUIDv7, but I didn't trust the answer I got.

(Agreeing with both parents)

Good observation. Could you share the math even if you don't trust it? I don't have pen and paper here and I'm curious.

After thinking about it more, I have the feeling (against my initial intuition) that v4 might dominate either way unless you consistently generate tons of UUIDs for an impractical number of years.
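
One way to sanity-check that intuition is a rough sketch under loud assumptions: a constant generation rate (no bursts), 74 random bits per UUIDv7 and 122 per UUIDv4, v7 collisions only possible within the same millisecond, and the small-probability approximation `n**2 / 2**(bits + 1)`. The helper names `p_v7` and `p_v4` are made up for illustration. Under these assumptions the rate cancels out of the ratio, and the break-even depends only on how long the system runs.

```python
V7_RANDOM_BITS = 74    # assumption: no bits spent on a counter
V4_RANDOM_BITS = 122
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000


def p_v7(rate_per_ms: float, duration_ms: float) -> float:
    """Approximate collision probability for UUIDv7: one independent
    birthday problem per millisecond, summed over all milliseconds."""
    return duration_ms * rate_per_ms**2 / (2 * 2**V7_RANDOM_BITS)


def p_v4(rate_per_ms: float, duration_ms: float) -> float:
    """Approximate collision probability for UUIDv4: a single birthday
    problem over every ID ever generated."""
    total_ids = rate_per_ms * duration_ms
    return total_ids**2 / (2 * 2**V4_RANDOM_BITS)


# p_v4 / p_v7 simplifies to duration_ms / 2**48, so the two are equal
# once the system has run for 2**48 milliseconds, whatever the rate.
break_even_ms = 2 ** (V4_RANDOM_BITS - V7_RANDOM_BITS)
break_even_years = break_even_ms / MS_PER_YEAR

print(f"break-even after about {break_even_years:,.0f} years")
```

Only valid while both probabilities are small. Bursty traffic raises the per-millisecond count for v7 and pushes the break-even further out.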

I ran some numbers by GPT-5[0], and for the scenario of generating 10k UUIDs in one ms every 10ms, over three years, it came up with a 0.0025% chance of collision for UUIDv7, and a 0.000000084% chance of collision for UUIDv4.

[0] https://kagi.com/assistant/dd7d8c48-44e4-499b-9f2f-33663d125...

I checked against my notes, and I see about the same numbers using the `n**2` Taylor series approximation. I missed that the probability of `>=1` collision is about the same as exactly one collision, but I suspect that's quite reasonable at this scale.
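
For anyone who wants to reproduce the burst scenario quoted above (10k UUIDs in one ms, every 10 ms, for three years), here is a sketch using the `n**2` approximation. It assumes 74 random bits for UUIDv7 and 122 for UUIDv4, a 365.25-day year, and that v7 bursts are independent trials whose small probabilities can simply be added. The constant and function names are illustrative.

```python
V7_RANDOM_BITS = 74    # assumption: no bits spent on a counter
V4_RANDOM_BITS = 122

BURST_SIZE = 10_000          # UUIDs generated within a single millisecond
BURST_INTERVAL_MS = 10       # one burst every 10 ms
YEARS = 3
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000


def approx_collision_probability(n: float, bits: int) -> float:
    """First Taylor term of 1 - exp(-n**2 / 2**(bits + 1)).
    Only meaningful while the result is much smaller than 1."""
    return n**2 / (2 * 2**bits)


bursts = YEARS * MS_PER_YEAR / BURST_INTERVAL_MS

# UUIDv7: collisions only within a burst; add up the per-burst probabilities.
p_v7 = bursts * approx_collision_probability(BURST_SIZE, V7_RANDOM_BITS)

# UUIDv4: every ID can collide with every other ID ever generated.
p_v4 = approx_collision_probability(bursts * BURST_SIZE, V4_RANDOM_BITS)

print(f"UUIDv7: {p_v7 * 100:.4f}%")
print(f"UUIDv4: {p_v4 * 100:.9f}%")
```

Under these assumptions the results land close to the 0.0025% and 0.000000084% figures quoted in the thread.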