by sgarland 664 days ago
They are, actually, you just have to coordinate the ranges each node has.

What if you don't know how many nodes you have? UUIDs can also be generated on the client side (in cases where you can trust the client).
> UUIDs can also be generated on the client side (in cases where you can trust the client).

I'm fairly certain the first rule of websec is you never trust the client. I definitely would not trust a user's browser to directly insert a value into a DB.

> What if you don't know how many nodes you have?

Shouldn't matter; you have a centralized system that hands out chunks of IDs on demand (and has its own mechanism to ensure no repeats), so nodes never need to know about each other. This is similar to what Vitess [0] does.

[0]: https://vitess.io/docs/20.0/reference/features/vitess-sequen...
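
A minimal sketch of the chunk-allocation idea, not how Vitess implements it. The names `BlockAllocator` and `Node` are hypothetical, and the allocator here is an in-process object with a lock; a real one would need durable state.

```python
import threading


class BlockAllocator:
    """Hypothetical central service: hands out non-overlapping ID ranges."""

    def __init__(self, block_size=1000, start=1):
        self._next = start
        self._block_size = block_size
        self._lock = threading.Lock()

    def reserve(self):
        # The lock is the "no repeats" mechanism in this sketch.
        with self._lock:
            lo = self._next
            self._next += self._block_size
        return lo, lo + self._block_size  # half-open range [lo, hi)


class Node:
    """Hypothetical worker: issues IDs locally, refills when its block runs out."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._cur = 0
        self._end = 0

    def next_id(self):
        if self._cur >= self._end:
            self._cur, self._end = self._allocator.reserve()
        value = self._cur
        self._cur += 1
        return value


allocator = BlockAllocator(block_size=3)
a, b = Node(allocator), Node(allocator)
ids_a = [a.next_id() for _ in range(4)]  # uses one full block, then part of a second
ids_b = [b.next_id() for _ in range(2)]  # gets its own block
```

Nodes can join or leave at any time because each one only talks to the allocator. The trade-off is gaps: IDs left in a block when a node dies are never issued.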

> I'm fairly certain the first rule of websec is you never trust the client.

Not every piece of information is confidential in every system. Sometimes a UUID is just that, a UUID.

> you have a centralized system that hands out chunks of IDs on-demand

I don't follow. If your system requires a central node that can reliably generate unique auto-incrementing integer IDs, why bother with UUIDs at all? Just base-64 encode the integer ID, or hash it with a salt to protect against enumeration attacks, if you want.
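
Roughly what I mean, as a sketch. `SECRET_SALT` and the function names are made up for illustration. Note that plain base-64 is reversible and does nothing against enumeration; only the keyed hash hides the sequence, and since it is one-way you would have to store it alongside the row to look things up.

```python
import base64
import hashlib
import hmac

SECRET_SALT = b"example-secret"  # hypothetical; keep the real one out of source control


def encode_id(n):
    """Reversible, compact public form of an integer ID (no secrecy)."""
    raw = n.to_bytes(8, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_id(token):
    """Inverse of encode_id: restore the padding, then decode."""
    padded = token + "=" * (-len(token) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")


def opaque_id(n):
    """One-way keyed hash of the ID, truncated for brevity."""
    digest = hmac.new(SECRET_SALT, n.to_bytes(8, "big"), hashlib.sha256)
    return digest.hexdigest()[:22]
```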

If you don't want the dependency on a centralised system, just use UUIDv7, which is essentially a millisecond timestamp followed by random bits, or implement a shorter version of it. There is no need to overengineer.
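
For illustration, a hand-rolled sketch of the UUIDv7 layout as I understand it from RFC 9562: 48 bits of Unix milliseconds, a 4-bit version, 12 random bits, a 2-bit variant, then 62 random bits. `uuid7_sketch` is a made-up name, and it skips the optional monotonic counter, so IDs created within the same millisecond are not ordered relative to each other.

```python
import os
import time
import uuid


def uuid7_sketch(ts_ms=None):
    """Build a UUIDv7-shaped value: timestamp first, random bits after."""
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                # top 12 bits
    rand_b = rand & ((1 << 62) - 1)    # low 62 bits

    value = (ts_ms & ((1 << 48) - 1)) << 80  # 48-bit millisecond timestamp
    value |= 0x7 << 76                       # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                      # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


earlier = uuid7_sketch(ts_ms=1_700_000_000_000)
later = uuid7_sketch(ts_ms=1_700_000_000_001)
```

Because the timestamp occupies the most significant bits, `earlier < later` holds regardless of the random part, which is what keeps inserts clustered at the right-hand edge of an index.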

> I don't follow. If your system requires a central node that can reliably generate unique auto-incrementing integer IDs, why bother with UUIDs at all?

I also don’t follow. I thought your initial assertion was that auto-incrementing integer IDs weren’t always possible, thus the need for UUIDs.

Monotonic ints, or more broadly anything k-sortable, are generally optimal for RDBMS indices due to most indices being B+trees. That’s why there’s such enormous effort towards NOT using UUIDv4.

> just use UUIDv7

Indeed; this is my recommendation when devs insist they can’t possibly use integers. Personally, I maintain that most places can use ints; it’s just that they’ve hideously over-complicated things to the point that switching would be far too much work.

> your initial assertion was that auto-incrementing integer IDs weren’t always possible, thus the need for UUIDs.

You suggested timestamp+autoinc, and my initial assertion was that auto-incrementing integer IDs weren’t always possible, thus the need for the random part after the timestamp (a la UUIDv7). I see that we have actually been on the same page.

When the competition is a random 128-bit number, you can assume a 32-bit node count (4.3 billion nodes) and a 45-bit millisecond count (about 1,100 years), and you've still got 51 bits, letting each node generate about 2 quadrillion IDs per millisecond.
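
A quick sketch to check that arithmetic and show one way to pack the three fields. Putting the timestamp in the high bits is my own assumption (it makes the IDs sort by time); the 32/45/51 split above doesn't dictate an order. `pack_id` and `unpack_id` are hypothetical names.

```python
NODE_BITS, TIME_BITS, SEQ_BITS = 32, 45, 51  # 32 + 45 + 51 = 128


def pack_id(ts_ms, node, seq):
    """Pack timestamp | node | per-node counter into one 128-bit integer."""
    if not 0 <= ts_ms < (1 << TIME_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= node < (1 << NODE_BITS):
        raise ValueError("node out of range")
    if not 0 <= seq < (1 << SEQ_BITS):
        raise ValueError("sequence out of range")
    return (ts_ms << (NODE_BITS + SEQ_BITS)) | (node << SEQ_BITS) | seq


def unpack_id(value):
    """Inverse of pack_id: returns (ts_ms, node, seq)."""
    seq = value & ((1 << SEQ_BITS) - 1)
    node = (value >> SEQ_BITS) & ((1 << NODE_BITS) - 1)
    ts_ms = value >> (NODE_BITS + SEQ_BITS)
    return ts_ms, node, seq


MAX_NODES = 1 << NODE_BITS                                # 4,294,967,296
YEARS_OF_MS = (1 << TIME_BITS) / 1000 / 86400 / 365.25    # roughly 1,115
IDS_PER_MS_PER_NODE = 1 << SEQ_BITS                       # 2,251,799,813,685,248
```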

The real benefit of UUIDs is the 'consistency' of the one-size-fits-most approach. If you can do without IDs that humans can read out, readable plain-text logs, compressibility, and recognisable formats for different types of ID, then UUIDs can be used for anything from customer orders to web requests to log lines.