Hacker News
by inopinatus 1843 days ago
Caveat programmer: this could be problematic, not in the sense it doesn't work, but in the sense that someone working on backend code may have a preconceived expectation that UUIDs are also effectively a keyspace i.e. they're hard to guess. The validity of that is already challenged by variants defining temporal or logical order, and evaporates completely if you let clients declare their own that you accept at face value. Applications may have potentially guessable/gameable object identifiers sloshing around inside as a consequence, which is modestly ironic given that one benefit many folks expect from adopting UUIDs in the first place is hardening up the attack surface of trivially enumerable sequences.

There are a few mitigations but my favourite is the "casino chips" approach: pregenerate them server side, and allocate to clients on demand, including en masse if need be ("here kid, have a few million UUIDs to get you started"). Verify with whatever simple signature scheme comes with your application server framework, or at small scale just toss them in a crude LRU store.
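A minimal sketch of the "casino chips" idea in Python, assuming an HMAC tag stands in for "whatever simple signature scheme comes with your application server framework". The names `SERVER_KEY`, `issue_chips` and `redeem_chip` are hypothetical, and the key is generated in-process purely for illustration:

```python
import hashlib
import hmac
import secrets
import uuid
from typing import List, Optional

# Hypothetical server-side secret; a real deployment would load this from config.
SERVER_KEY = secrets.token_bytes(32)


def _tag(identifier: uuid.UUID) -> str:
    """HMAC-SHA256 tag over the 16 raw bytes of the UUID."""
    return hmac.new(SERVER_KEY, identifier.bytes, hashlib.sha256).hexdigest()


def issue_chips(count: int) -> List[str]:
    """Pregenerate random UUIDs server side and hand them out as 'uuid.tag'."""
    chips = []
    for _ in range(count):
        identifier = uuid.uuid4()
        chips.append(f"{identifier}.{_tag(identifier)}")
    return chips


def redeem_chip(chip: str) -> Optional[uuid.UUID]:
    """Return the UUID if its tag verifies, otherwise None."""
    identifier_text, _, tag = chip.partition(".")
    try:
        identifier = uuid.UUID(identifier_text)
    except ValueError:
        return None
    # Compare as bytes in constant time; a forged tag fails here.
    if not hmac.compare_digest(tag.encode(), _tag(identifier).encode()):
        return None
    return identifier
```

This only proves the server minted the UUID. It does not stop the same chip being presented twice, so a uniqueness check on insert is still needed.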

Or, remember where the UUID came from, and apply their organisational scope to any trust you place upon it. This might work particularly for multi-tenanted SaaS. However it requires that all usage is tenant-bounded end-to-end throughout your application. This may be in conflict with a) your framework, b) your zenlike contemplation of simplicity in data management, or c) programmers in a hurry forgetting to scope their queries properly.
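Roughly what tenant-bounded usage looks like, as a toy Python sketch. `TenantScopedStore` is a hypothetical in-memory stand-in for a real database; the point is only that the tenant is part of every key, so a client-declared UUID can never reach another tenant's rows:

```python
import uuid
from typing import Dict, Optional, Tuple


class TenantScopedStore:
    """Toy store where every read and write is keyed by (tenant, UUID)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, uuid.UUID], dict] = {}

    def insert(self, tenant_id: str, object_id: uuid.UUID, row: dict) -> None:
        key = (tenant_id, object_id)
        if key in self._rows:
            # A repeated ID inside one tenant is rejected, not overwritten.
            raise KeyError(f"duplicate id {object_id} for tenant {tenant_id}")
        self._rows[key] = row

    def get(self, tenant_id: str, object_id: uuid.UUID) -> Optional[dict]:
        # There is deliberately no lookup by object_id alone.
        return self._rows.get((tenant_id, object_id))
```

The absence of an unscoped `get` is the design choice here: it removes the option for a programmer in a hurry to forget the scope.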

Ultimately, relying on UUIDs as intrinsically unguessable security tokens is probably not a great idea, but it's one that remains thoroughly embedded in the programming zeitgeist. As ever, nothing useful comes without a compromise. Keep your eyes open to the systemic consequences of design choices, and don't leave traps for your fellow developers.
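One way to avoid leaning on the UUID itself, sketched in Python under the assumption that the identifier and the secret are stored as separate columns. `new_share_link` and `token_matches` are hypothetical names; the token comes from the standard `secrets` module and only its digest would be persisted:

```python
import hashlib
import hmac
import secrets
import uuid
from typing import Tuple


def new_share_link() -> Tuple[uuid.UUID, str, str]:
    """Return (object id, secret token for the client, digest to store)."""
    object_id = uuid.uuid4()             # identifier only, not treated as secret
    token = secrets.token_urlsafe(32)    # the actual unguessable part
    token_digest = hashlib.sha256(token.encode()).hexdigest()
    return object_id, token, token_digest


def token_matches(presented: str, stored_digest: str) -> bool:
    """Constant-time check of a presented token against the stored digest."""
    presented_digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(presented_digest, stored_digest)
```

With this split, it stops mattering how the UUID was generated or by whom, because nothing security-relevant depends on it being hard to guess.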

3 comments

He's not saying clients can create their own ids. The applications can.

The concepts he's talking about are required for CQRS, which is a popular pattern applied mostly with DDD or microservices.

There definitely are people out there in this thread proposing clients be able to provide UUIDs. I’ve seen it elsewhere too.

I’ve also personally experienced UUID collisions due to badly set up VM environments under Windows. It isn’t a good idea to blindly trust any value - and that includes supposedly ‘never collide’ IDs like UUIDs.

For what it’s worth, I also had the joy of debugging someone’s distributed hash table that was using MD5 as the hash bucket key (this was... 2 decades ago?) and had no way to handle collisions because obviously that is impossible.

This seems more an issue with the library's random generator used to form UUIDs.

E.g. I use GUIDs (.NET) and I have never seen an issue.

I getcha, but these days the ambit of "application" extends to JavaScript executing client-side in an environment that's basically overrun with lions/tigers/bears, and I'll suggest that's particularly a consideration when the front-end is a SPA participating in a CQRS/event-sourced overall application architecture.
For perspective, the npm uuid package is now being downloaded ~50M/week. Its usage is ubiquitous at this point, on any platform JS is running.

https://www.npmjs.com/browse/depended/uuid

A little bit late to reply.

Unfortunately that doesn't mean much.

Node.js runs server-side and uses that package too, so it's not "solely" for JS/SPAs.

Should you ever use a plain token (where you just check if it exists in some authed_users table) vs, I dunno, some sort of signed/HMAC type thing, where you have to call some function on it? I genuinely don't know but I know enough to generally leave authentication up to those that do know.

Maybe I'm just thinking of OAuth where there are multiple hops involved?

The comparison to OAuth is quite reasonable. Perhaps the most obvious parallel is the use of a state parameter during the three-legged exchange, without which it's exposed to a CSRF attack.
Right. Maybe it's paranoid, but it seems like a bearer token has potential avenues for forgery (CSRF or others), replay attacks, add-on jacking, etc. Also harder to coordinate with distributed apps. I think the Captain Tightpants approach would be to initialize some client-side private key/cert, use that to sign each request and verify based on that cert.

That should also make it easier to, say, verify and unwrap the request at the gateway to the server, before sending it to the rest of the application-proper.

In Java there is a UUID generator based on SecureRandom. That's about as unguessable as you're going to get.
It's not a question of whether UUIDs can be generated unguessably. They can be, as you point out.

It's whether the UUIDs in your system can be reliably presumed to be unguessable - including the UUIDs that were generated by code which was written after you wrote your query that assumes unguessability.

Today you might say "Oh, this SecureRandom-based UUID generator is unguessable and meets all of our requirements". Tomorrow you might say "Ah, this SecureRandom-based UUID generator is too slow, let's generate them in our Android app instead of on the server". But now the UUIDs stored in your database aren't reliably unguessable, because you accept whatever your client API tells you without verification. How plausible is it, within the timeframes you actually get, to review every query for whether it assumes the trustworthiness of the UUID generation? Better to assume UUIDs have some convenient properties, than to assume that they're unguessable just because the API is cryptographically secure today.
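To make that concrete, a small Python sketch. The version field of a UUID is just a few bits the sender sets, so a check like the hypothetical `claims_random_version` below only tells you what the client claims, not how the value was generated:

```python
import uuid


def claims_random_version(text: str) -> bool:
    """True if the text parses as a UUID whose version field says 4."""
    try:
        return uuid.UUID(text).version == 4
    except ValueError:
        return False


# A hand-written, trivially guessable value that still claims to be version 4.
crafted = "00000000-0000-4000-8000-000000000001"
looks_random = claims_random_version(crafted)
```

Here `looks_random` ends up `True`, even though nothing about `crafted` is random. Any query written on the assumption that stored UUIDs are unguessable inherits that gap once clients supply them.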