| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Alupis 1312 days ago
	There's probably a non-trivial amount of folks that equate a UUID with "unguessable" given their appearance. They are, after all, not sequential and using them to obscure things like number of users (using a UUID in place of an incrementing number) seems like a natural fit. Given how easy it is to generate a UUID in most languages, and given the low likelihood of a collision within a system - it wouldn't be a huge leap to think UUID's could replace homebrewed random string generators for things like password reset tokens, etc.

2 comments

bigiain 1312 days ago

> There's probably a non-trivial amount of folks that equate a UUID with "unguessable" given their appearance.

That's near enough to true for anyone not operating at "web scale".

FAANG/BAT engineers need to care. My systems with 10s or 100s of thousands of users (or, you know, a few thousand users tops) are without doubt going to be re-written (probably several times) well before I have to worry about having so many UUIDs in the wild that this becomes a reasonable thing to worry about.

For me, at the scale of systems I run (or will conceivably run in the medium term future), I think the simplicity/understandability of code that uses native language UUID functions is "the right thing". Whoever does the next big rewrite to support a few million MAU will be thankful they don't have to work out WTF I was thinking when I decided to roll my own random access tokens.

ndriscoll 1312 days ago

I doubt FAANG engineers need to care either. Ignoring that the author imagines 8k IoT devices per living human for one service, 2^64 requests per second is an absurd number to use. Assuming one server can do 10M RPS, you'd need 1.8 trillion servers to handle that load. You'd also need over 2 billion Tb/s of bandwidth to receive just the UUIDs with no overhead.

It doesn't matter what computing resources your attacker has; the limit is how much your infrastructure can handle, and the author casually overestimates that by about 10 orders of magnitude. So replace 35 minutes with 350 billion minutes, or about 660,000 years.

doctor_eval 1312 days ago

Thanks for this. I thought I must be missing something because this seems like such an obvious point.

I find it hard to believe that there is a problem with a (cryptographically random) 122 bit session key considering that a brute force attack on it will result in a DDoS, which is obviously self limiting.

Lots of people here are saying “never use a uuid for a session key”, but I don’t understand this. What’s the accepted entropy for a session key?

smaudet 1312 days ago

I think the even more absurd rec is to use 160 bits as a "sweet spot"? Why? Who said that? Which real world scenarios? Why not 159 or 161...

Then you realize the author is just talking out their rear end with no thought...

"Yes I often find my cracking buddies with their super computers just give up hacking my online user service when I bumped my user token length from 159 to 160 length", said nobody, ever.

sidpatil 1312 days ago

> "Yes I often find my cracking buddies with their super computers just give up hacking my online user service when I bumped my user token length from 159 to 160 length", said nobody, ever.

Reminds me of this sketch: https://youtu.be/IHfiMoJUDVQ

dpcx 1312 days ago

Even they shouldn't need to be concerned much with collisions. Wikipedia suggests[0] "generating 1 billion UUIDs per second for about 85 years". Is it possible? Sure. Is it likely? Not really.

[0]: https://en.wikipedia.org/wiki/Universally_unique_identifier#...

bigiain 1312 days ago

I guess from the article it's not just collisions, but the (significantly more likely) problem of guessing a UUID that's valid (out of all the issued tokens).

But yeah, even that is very very low risk. The article had to make some outrageously pessimistic assumptions to get it's "38 minutes!" number. Issuing a million tokens a second with two year validity, and getting attacked with the entire hash rate of the bitcoin mining community. And having both enough backend capacity to handle all those requests while at the same time having no observability or rate limiting to mitigate a brute force attack.

Dylan16807 1312 days ago

> I guess from the article it's not just collisions, but the (significantly more likely) problem of guessing a UUID that's valid (out of all the issued tokens).

Assuming random UUIDs:

If you're counting all the UUIDs anyone makes, then valid<->attacker matches are a subset of all possible collisions and therefore less likely.

If your baseline is only the collisions between valid UUIDs, then whether an attacker is more or less likely to collide depends on whether they're generating UUIDs at least half as fast as the system they're attacking.

jsjohnst 1312 days ago

> That's near enough to true for anyone not operating at "web scale". FAANG/BAT engineers need to care.

I’d argue even then it’s really not much a concern. You’d need to generate 1 billion UUID v4’s per second for over 75 years to have a 50% chance of there being a single collision.

withinboredom 1312 days ago

You can generate sequential UUIDs, IIRC, that’s the best way to store them in a db and still have good partitioning/indexing. I don’t use UUIDs often, but I vaguely remember researching this problem space at some point.

Alupis 1312 days ago

I think most languages let you chose which version of UUID you want - with most defaulting to the random version (I think 4?) by default.

There are other versions that are sequential/time-based though, but using these could open the door to de-obfuscating whatever data you wanted to protect via UUID's in the first place (like how many sales orders you receive per hour, etc).

withinboredom 1312 days ago

I don’t think uuids are designed for obfuscation, though they certainly help with that as a side effect. I could be wrong though, I’ve never looked into it.

Alupis 1312 days ago

They (randomized type 4 UUID's) obfuscate as a side effect because they are much more difficult to guess due to their randomness. As the article points out though, they are not impossible to guess... but it will come down to your risk tolerance and what the UUID's are "protecting".

People like to reach for UUID's when obfuscation is needed because inventing your own duplicate-aware random string algorithm isn't what most folks want to spend their time thinking about. Plus, these days, many databases come with UUID-aware data types that make using UUID's fairly straight forward.

edgyquant 1312 days ago

UUIDs are a vast improvement over integers for preventing simple attacks like +/-ing the id and seeing what happens.

mrkeen 1312 days ago

But then you're back to collisions, and you may as well be using longs.

withinboredom 1312 days ago

I think v7 uses microseconds since epoch + random data. The odds of a collision should be practically 0, or more likely to find a sha256 collision.

morelisp 1312 days ago

> more likely to find a sha256 collision.

This is obviously, and egregiously, false.

withinboredom 1312 days ago

I don't know. You'd need quite a number of threads + machines generating uuids in the exact same microsecond to get an opportunity for a collision. It doesn't seem obviously false.

morelisp 1312 days ago

I didn't say a collision is easy, I said it's obviously false it's harder than colliding a sha256, a space roughly 95780971304118053647396689196894323976171195136475136 times larger.