by nitsky 35 days ago
True, but it makes the specific collision the post observed completely impossible.

I left a more detailed comment on the parent, but it's definitely not impossible!
The scenario in this post is that the first UUID was created one year before the duplicate UUID. That isn’t possible with v7.
You're leaning heavily on "collision like this" referring to the exact timestamps for your statement to be true.

It's equally possible to read "like this" as referring to the collision itself, without a focus on the one-year distance between the creation dates.

So I guess both views are valid.

The inclusion of a timestamp in v7 makes collisions impossible unless the generating systems think that the time is the same down to the millisecond, which makes the temporal distance quite relevant.
Plenty of systems end up generating multiple UUIDs in a single millisecond.

The issue with UUIDv7 is that you also have significantly less entropy, since you only have 62 bits (sometimes less, depending on implementation) of "random" data. So while the time aspect of the format lowers the chances of collisions, two UUIDv7s generated in the same millisecond have (depending on implementation) a significantly higher chance of colliding than two UUIDv4s.

It's still incredibly unlikely, but generating two matching UUIDv4s is also incredibly unlikely, and it does happen.

TL;DR: It's possible to generate matching UUIDv7s; don't assume otherwise.
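
For anyone who wants to see where the bits go, here is a minimal sketch of the UUIDv7 layout as I understand it from RFC 9562: 48 bits of Unix millisecond timestamp, 4 version bits, 12 bits of rand_a, 2 variant bits, 62 bits of rand_b. The helper name `uuid7_sketch` is made up for illustration, and filling rand_a with random data is an assumption; some implementations put a counter or sub-millisecond time there instead, which is why the "random" part is quoted as either 62 or 74 bits.

```python
import secrets
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-style value (hypothetical helper, not a library API)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000  # current Unix time in ms

    rand_a = secrets.randbits(12)  # assumption: rand_a is purely random
    rand_b = secrets.randbits(62)

    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                         # bits 79..76: version 7
    value |= rand_a << 64                      # bits 75..64: rand_a
    value |= 0b10 << 62                        # bits 63..62: RFC variant
    value |= rand_b                            # bits 61..0: rand_b
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier, later)
```

Two values built from different millisecond timestamps always differ in their top 48 bits. Two values built in the same millisecond match only if all the random bits match: probability 2^-74 for a given pair under this layout, 2^-62 if rand_a is not random, versus 2^-122 for a given pair of v4s.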

I answered this in another HN topic just the other day: https://news.ycombinator.com/item?id=48061098

But essentially, using UUID v7 you actually have less risk of collisions than with UUID v4.

Because of the birthday paradox, if you have N bits of randomness, you can expect a collision approximately after (2^((N/2)-1)) random numbers.

With v4, you have 122 bits of entropy over all time, so you will see a collision after about 2^60 allocations, approx 1.2 x 10^18.

With v7, you sacrifice 48 bits of entropy to give you 74 bits of entropy every millisecond, so you will see a collision after approximately 2^36 allocations per millisecond, approx 6.8 x 10^10 per millisecond.

You could argue that the risk of a collision is too high per millisecond because it's likely that 68 billion UUIDs are generated every millisecond. And maybe I'd agree. But the counter argument is that with v4 you'd expect a collision after 2^24 milliseconds, or 280 minutes, allocating at the same rate of 68 billion UUIDs per millisecond.

Obviously "all time" is longer than "280 minutes", so v7 is actually statistically less likely to cause collisions than v4, even though it seems counter-intuitive because it has a smaller space devoted to entropy. The key insight is that the time provides bits that are guaranteed to be unique, so only collisions within the same timestamp are significant, and every bit used to provide known-unique values is worth 2 bits of entropy.
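
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers in this comment: `rule_of_thumb` encodes the 2^((N/2)-1) estimate used here, and `fifty_percent_point` is the usual birthday approximation n ≈ sqrt(2 ln 2 · 2^N) for a 50% chance of at least one collision, added for comparison. Both function names are made up for illustration.

```python
import math


def rule_of_thumb(bits):
    """The estimate used above: a collision after about 2^((N/2)-1) values."""
    return 2 ** (bits // 2 - 1)


def fifty_percent_point(bits):
    """Usual birthday approximation for a 50% collision chance."""
    return math.sqrt(2 * math.log(2)) * 2 ** (bits / 2)


v4_total = rule_of_thumb(122)    # 2^60 allocations over all time
v7_per_ms = rule_of_thumb(74)    # 2^36 allocations within one millisecond

# Time for v4 to reach its threshold when allocating at the v7 per-ms rate.
ms_until_v4_collision = v4_total // v7_per_ms   # 2^24 milliseconds
minutes = ms_until_v4_collision / 1000 / 60

print(f"v4 threshold: {v4_total:.3e}")
print(f"v7 threshold per ms: {v7_per_ms:.3e}")
print(f"v4 at that rate: {minutes:.1f} minutes")
print(f"50% point for 122 bits: {fifty_percent_point(122):.3e}")
```

The rule of thumb comes out a little over a factor of two below the 50% point, so it is a conservative estimate, and the v4-versus-v7 comparison does not depend on which of the two you use.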

Surely the scenario where he generates the same number of items as he did between 2025 and now, but in a single tick of v7 UUIDs, also runs into it?
The scenario being the collision itself, the time period isn’t particularly relevant aside from it occurring much quicker than expected.