by j-pb 1842 days ago
They almost got it right. A better implementation would overflow regularly to make use of the entire key space and would, counterintuitively, be more resistant to overflows.

Clocks aren't reliable enough for timestamps anyways so garbage collection is the only thing you kinda wanna rely on them for.

A good sweet spot seems to be 32 bits of milliseconds + 96 bits of entropy. This overflows approximately every 50 days (2^32 ms is about 49.7 days), allowing for 50-day rolling data retention.
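A minimal sketch of the layout described above, assuming a big-endian 32-bit millisecond prefix followed by 12 random bytes. The name `rolling_id` and the byte order are illustrative assumptions, not part of any ULID or UUID specification.

```python
import os
import time

WRAP_MS = 2 ** 32  # a 32-bit millisecond counter wraps after this many ms


def rolling_id(now_ms=None):
    """Return 16 bytes: wrapped 32-bit ms timestamp + 96 bits of entropy.

    Hypothetical helper for illustration only.
    """
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # current Unix time in ms
    prefix = (now_ms % WRAP_MS).to_bytes(4, "big")  # wraps instead of overflowing
    return prefix + os.urandom(12)  # 12 bytes = 96 random bits


# How often the timestamp prefix wraps around, in days.
wrap_days = WRAP_MS / (1000 * 60 * 60 * 24)
print(f"prefix wraps every {wrap_days:.2f} days")  # prints 49.71
print(rolling_id().hex())
```

Because the prefix wraps, keys only sort by time within one roughly 50-day window, which is what makes it suited to rolling retention rather than long-term ordering.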


Not the worst idea—50 days is a nice sweet spot between infrequent enough to have some indexing benefit and frequent enough that potential downsides will be discovered early in the product’s life cycle.

Personally I wouldn’t do this. A scenario where for each individual millisecond of elapsed time, 96 bits of entropy is an upgrade over 80 bits of entropy, is fairly extreme. I don't think there are many databases in the world which would ever need more collision mitigation than that.

> I don't think there are many databases in the world which would ever need more collision mitigation than that.

Individual instances? Maybe not. But for those an autoincrement key would also work. That is not the scenario that ULIDs and GUIDs are advertised for.

The goal is to have a universally/globally unique ID. So whenever you encounter two IDs you can be (reasonably, probability-wise) sure that they won't collide.

Any such scheme thus must, by definition, serve every single use case now and forever everywhere. That's a tough one.

Also, it's not really 80 bits vs. 96 bits (which, due to the birthday paradox, is already a huge difference) but more like 80 bits vs. 128 bits, as the timestamp is recycled with sufficient usage.
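To put rough numbers on the birthday-paradox point, here is a sketch using the standard approximation p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)). The sample size of 10^9 IDs is an arbitrary assumption, and treating all 128 bits as uniformly random is an idealization of the "recycled timestamp" argument above.

```python
import math


def collision_probability(n, bits):
    """Birthday approximation: chance that n uniformly random values
    of `bits` bits contain at least one duplicate."""
    return -math.expm1(-n * (n - 1) / (2 * 2 ** bits))


def ids_for_even_odds(bits):
    """Approximate number of random values needed for a 50% collision chance."""
    return math.sqrt(2 * math.log(2) * 2 ** bits)


n = 10 ** 9  # illustrative assumption: one billion IDs sharing the same prefix
for bits in (80, 96, 128):
    p = collision_probability(n, bits)
    print(f"{bits:3d} bits: p(collision) ~ {p:.3e}, 50% point ~ {ids_for_even_odds(bits):.3e} IDs")
```

Every extra 16 bits multiplies the 50% point by 2^8 = 256 and, for small probabilities, divides the collision chance at a fixed n by about 2^16.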

I'm actually concerned that 96 bits isn't enough, as it relies on the assumption that you'll use this scheme for data spanning years, in order to properly use the timestamp as entropy.