by baix777 1939 days ago
The creation of integers doesn't scale for large data volumes. Plus you need a place to create these integers, and a failover location, which adds complexity. Multiple machines can each be creating their own guids in a very simple manner.

And the odds of a guid collision are extremely low, which is acceptable for most applications. Having worked with petabytes of data, I've found guid performance isn't really an issue, as there are more important factors to worry about.
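To put a rough number on "extremely low", here is a minimal sketch using the standard birthday-bound approximation. It assumes random (version 4) UUIDs with 122 random bits; the function name `uuid4_collision_probability` is made up for illustration.

```python
import math

def uuid4_collision_probability(n_ids: int, random_bits: int = 122) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** random_bits
    exponent = -(n_ids * (n_ids - 1)) / (2 * space)
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(exponent)

# Approximate chance of at least one collision among a trillion random ids.
p_trillion = uuid4_collision_probability(10**12)
print(f"{p_trillion:.3e}")
```

This only covers ids drawn uniformly at random; a weak random source or a different UUID version changes the picture.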


> The creation of integers doesn't scale for large data volumes.

At which specific integer does the scaling start to slow down?

One concurrent insertion, or one network partition.

UUIDs can be generated on many machines with no awareness of each other and merged later.
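A minimal sketch of that point, assuming Python's standard `uuid` module and random (version 4) UUIDs. The two "nodes" here are just two independent calls in one process, and `generate_on_node` is a hypothetical helper, but nothing is shared between them.

```python
import uuid

def generate_on_node(count: int) -> list[uuid.UUID]:
    """Stand-in for one machine minting ids with no coordination."""
    return [uuid.uuid4() for _ in range(count)]

# Each node produces its ids without knowing the other exists.
node_a = generate_on_node(1000)
node_b = generate_on_node(1000)

# Merging later is a plain set union; no renumbering step is needed.
merged = set(node_a) | set(node_b)
print(len(merged))
```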

I would recommend reviewing my prior comments on this, as I address the concerns of multiple nodes needing to be able to independently produce identities without collisions or coordination.

If you know beforehand the maximum number of participants in your system, you can divide the keyspace across that quantity. If you are using BigInteger or equivalent, you have an effectively unlimited number of these things to work with, so it doesn't really matter if you wind up skipping trillions of identities at first. The original article even advocates for this as its first point, but without as much practical justification.
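One way to read "divide the keyspace" is a strided scheme where node k of N hands out k, k+N, k+2N, and so on. This is a sketch of that interpretation only (contiguous ranges per node would be another); `strided_ids`, `node_index` and `max_nodes` are illustrative names, not anything from the article.

```python
from itertools import islice
from typing import Iterator

def strided_ids(node_index: int, max_nodes: int) -> Iterator[int]:
    """Yield ids for one node; ids from different nodes never overlap."""
    if not 0 <= node_index < max_nodes:
        raise ValueError("node_index must be in [0, max_nodes)")
    next_id = node_index
    while True:
        yield next_id
        # Python ints are arbitrary precision, so this never overflows.
        next_id += max_nodes

# Two of four possible nodes, each taking its first three ids.
node0 = list(islice(strided_ids(0, 4), 3))
node3 = list(islice(strided_ids(3, 4), 3))
print(node0, node3)
```

The catch is that `max_nodes` has to be fixed up front and every node needs a unique index assigned to it, which is itself a small coordination step.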

If you're going to use a key space in the trillions, and partitioned and sparsely populated rather than sequential... isn't that just reinventing what the UUID already is?

A UUID is just a 128-bit integer, with creation algorithms designed to partition that space by things that already have enough entropy to need no further synchronization.
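The "just a 128-bit integer" part is easy to check with Python's standard `uuid` module. A small sketch; the variable names are arbitrary.

```python
import uuid

u = uuid.uuid4()

# The same value viewed as a plain integer.
as_int = u.int

# Rebuilding the UUID from that integer gives back the original.
round_tripped = uuid.UUID(int=as_int)

print(u, as_int.bit_length(), u.version)
```

A few of those 128 bits are reserved for the version and variant fields, which is why a version 4 UUID carries 122 random bits rather than 128.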

What you're proposing sounds like roll-your-own-UUID, which might be as inadvisable as roll-your-own-crypto.

Don't try to compare this to rolling your own crypto. The stakes are nowhere near the same for IDs as for crypto, and the stakes are the defining feature of the "don't roll your own crypto" meme.

Yes I'm comparing it and you don't get to tell me I can't.

It's not about the stakes. It's the idea that in rolling your own, you're going to get it wrong, or otherwise do worse than existing ways that have already solved the same problem.