by corytheboyd 1310 days ago
Yeah, I don't really get the point of this article. If you need random values of a specific size, don't use a UUID; it's literally specified to be one exact length and format.
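For what it's worth, here is a minimal sketch of "random values of a specific size" without a UUID, assuming Python and its standard `secrets` module. The helper name `random_token` is made up for illustration.

```python
import secrets


def random_token(num_bytes: int) -> str:
    """Return a hex string backed by num_bytes of OS-provided randomness."""
    # token_hex encodes each random byte as two hex characters.
    return secrets.token_hex(num_bytes)


# 32 bytes = 256 bits of randomness, hex-encoded to 64 characters.
token = random_token(32)
print(len(token), token)
```

The size is a parameter here, which is exactly what a UUID's fixed 128-bit layout does not give you.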
6 comments

>>Yeah I don't really get the point of this article,

To get clicks?

You're not wrong lol
The number of comments saying "using UUIDs for secrets isn't that bad" suggests this article needs to be written...
one exact length and five "versions" of the format (so far)

https://en.wikipedia.org/wiki/Universally_unique_identifier#...
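A quick way to see "one length, several versions", sketched with Python's standard `uuid` module (which only ships generators for versions 1, 3, 4 and 5 in most releases). The name `samples` and the `example.com` input are just illustrative.

```python
import uuid

# One UUID per version that the standard library can generate.
samples = {
    "v1": uuid.uuid1(),                                   # timestamp + node
    "v3": uuid.uuid3(uuid.NAMESPACE_DNS, "example.com"),  # MD5 of namespace + name
    "v4": uuid.uuid4(),                                   # random
    "v5": uuid.uuid5(uuid.NAMESPACE_DNS, "example.com"),  # SHA-1 of namespace + name
}

for name, value in samples.items():
    # The canonical text form is always 36 characters; only the version differs.
    print(name, value.version, len(str(value)), value)
```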

I made a comparison list of the best-known UUIDs out there a couple of days ago. It was quite fun discovering all the different kinds of UID and their pros/cons.

https://adileo.github.io/awesome-identifiers/

KSUIDs are fairly popular and missing from your list:

https://github.com/segmentio/ksuid

What's the resolution on those? 32 bits, 100 years... that's seconds, right? Doesn't sound excellent for time ordering. 100 years also seems a little short, but at least I'll be dead.
Don't look at it as being your problem in 100 years, but as helping employment in 100 years and helping the economy ;)
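On the resolution question, a back-of-the-envelope check, assuming the 32-bit field counts whole seconds as guessed above. The function name and the Julian-year constant are my own choices, not anything from the KSUID code.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year: 31,557,600 seconds


def timestamp_range_years(bits: int) -> float:
    """Years covered by an unsigned counter of whole seconds that is `bits` wide."""
    return (2 ** bits) / SECONDS_PER_YEAR


print(round(timestamp_range_years(32), 1))  # 136.1
```

So an unsigned 32-bit seconds counter runs for roughly 136 years from whatever epoch it starts at, which is consistent with "about 100 years" and with one-second ordering granularity.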
ULID example should be in uppercase.

Love this chart tho.

Also most well-designed systems only use the UUID as the representation format and use raw bits in performance-critical parts.
The raw bits are the UUID, the hex string is just a human-readable representation that also plays nicely with JSON.
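To make that concrete, a small sketch with Python's standard `uuid` module. The UUID value is an arbitrary example, and the variable names are illustrative.

```python
import uuid

u = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")

raw = u.bytes    # 16 raw bytes, what a binary column or wire format would hold
text = str(u)    # 36-character hex-and-hyphen form for JSON and humans
as_int = u.int   # the same value viewed as a 128-bit integer

# All three forms round-trip to the same UUID.
assert uuid.UUID(bytes=raw) == uuid.UUID(text) == uuid.UUID(int=as_int)

print(len(raw), len(text))  # 16 36
```

The text form is more than twice the size of the raw bytes, which is where the storage and comparison cost of string columns comes from.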
Tell that to Django (well, 5 years ago anyway iirc, don't know what it does now). Pretty sure it used to store UUIDs as string columns in your SQL.
I suppose Django wouldn't consider the speed gains of using raw integers in the database worth the hassle of dealing with binary data when you have to manually deal with the database somehow. I usually use string columns for UUIDs myself for the same reason.

It's also not a given that it'll be a performance benefit: you probably receive UUIDs as strings from some client and probably want to return UUIDs as strings to the client, and that conversion isn't free.

Yep, looks like it does the right thing in PostgreSQL but not anywhere else [0].

https://docs.djangoproject.com/en/4.1/ref/models/fields/#uui...

I feel like it did strings in Postgres too, not too long ago, and I had a <brain explode> moment when I worked on a codebase and had to figure out why queries were terrible.
Supposedly the behavior hasn't changed since at least version 1.8:

https://docs.djangoproject.com/en/1.8/ref/models/fields/#uui...

It may not have worked correctly on your project for some reason?

Or to PowerBI, which will cast any UUID to a string, even in joins. That cast + string comparisons + killing of indexes is not conducive to performant queries...
It’s a 128-bit integer - the serialization format does not change that fact.
Use uint128_t instead.
It is also highly recommended that you include a check digit into it, to minimize the chance of a collision. I've used https://arthurdejong.org/python-stdnum for that purpose.
I don't see how a check digit minimizes the chance of collision. (Here, I'm assuming that a check digit is calculated from the other digits. What am I thinking about incorrectly?)
Looking at the docs for the library linked, it appears to be a Verhoeff algorithm check digit... so yeah, you're correct.

This is effectively a simplistic stand-in for a CRC type system -- useful to detect if the data has been corrupted, but not useful to avoid collisions.

And if someone is worried about UUID collisions, they need to rethink their priorities in life.
You are correct, this should teach me not to write comments when I'm too tired. :/

The check digit wouldn't really help with collisions, since if the strings are the same the digit will be too. They are primarily useful when we need to ensure correctness on human input.
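A small sketch of that point. Note this uses the Luhn algorithm as a simpler stand-in, not the Verhoeff algorithm mentioned above, and the function names are made up for illustration.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk from the rightmost payload digit, doubling every other digit
    # starting with the first one visited.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of d
        total += d
    return (10 - total % 10) % 10


def with_check_digit(payload: str) -> str:
    """Append the check digit to the payload."""
    return payload + str(luhn_check_digit(payload))


def is_valid(number: str) -> bool:
    """True if the last digit matches the check digit of the rest."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


print(with_check_digit("7992739871"))  # 79927398713
print(is_valid("79927398713"))         # True
print(is_valid("79927298713"))         # False: one mistyped digit is caught
```

The digit is a pure function of the payload, so two identical payloads still get identical digits: it catches typos, it does nothing about collisions.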