by ww520 3729 days ago
UUID as PK is perfectly fine; however, as with any design decision it really depends on your needs and weighing alternatives.

Some popular choices for a PK: a natural key, a sequentially generated key, or a UUID. Personally I prefer a natural key if I can find an immutable one for the table. However, natural keys are hard to find. The food example in this case doesn't have a natural key. Also, if the natural key has to be a compound key, it's just not worth the pain.

A sequential auto-generated key is good when you need to hand it off to users, like an order ID or ticket number. It's short and simple, and it's generated for you. The downside is that when you migrate databases, you need to seed the new database's sequence carefully or it will start from the beginning again. Also, on record creation you need an extra read to get back the newly generated key.
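To illustrate the last point, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table and its columns are made up for the example. The key does not exist until the insert has run, so the application has to read it back; how that happens depends on the database and driver (sqlite3 exposes it as `cursor.lastrowid`).

```python
import sqlite3

# In-memory database; the "orders" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT)"
)

# The application cannot know the key before the insert has happened.
cur = conn.execute("INSERT INTO orders (item) VALUES (?)", ("coffee",))
order_id = cur.lastrowid  # read the generated key back from the driver

cur = conn.execute("INSERT INTO orders (item) VALUES (?)", ("bagel",))
next_order_id = cur.lastrowid

print(order_id, next_order_id)
```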

UUID is mostly worry free. It can be generated anywhere and doesn't need to be in the database. For a setup with distributed databases, I would use UUID just to have global uniqueness. For an offline app, I would use UUID to create the data records locally and later sync them with the main database. UUID is good when it's used internally and not exposed to the users.
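A rough sketch of the offline case, assuming Python's standard uuid module. The record layout, the two "devices", and the `main_store` dict are all invented for illustration; a real sync would also have to deal with conflicting edits, which this ignores.

```python
import uuid

def new_record(name):
    # The key is generated locally, with no database round trip.
    return {"id": str(uuid.uuid4()), "name": name}

# Two hypothetical offline devices create records independently.
device_a = [new_record("apple"), new_record("bread")]
device_b = [new_record("cheese")]

# Later sync: merge everything into one store keyed by the UUID.
main_store = {}
for record in device_a + device_b:
    main_store[record["id"]] = record

print(len(main_store))  # 3, since the locally generated keys do not collide
```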

2 comments

Sequential IDs can also leak business information when exposed to customers, such as how many orders you're taking. An interested party can place one order at 10am and another at 11am, then compare the two IDs to learn how many orders were taken in that hour.
Why should UUIDs not be exposed to users? Because of their unwieldy look, or because of security concerns?
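The arithmetic behind that, as a small Python sketch. The two order IDs are made-up values, and the function assumes the sequence has no gaps; real sequences can skip values, so treat the result as an estimate.

```python
def orders_between(first_id, second_id):
    # With gap-free sequential IDs, every order placed in between
    # the two probe orders consumes exactly one ID.
    return second_id - first_id - 1

# Hypothetical probe orders: one placed at 10am, one at 11am.
print(orders_between(7356, 7458))  # 101
```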

I ask because I have an app where previously I was accessing a particular data point via pk, and the user saw the pk in the url bar. But it could expose user/site data as any user could access that pk, or just guess at the next sequentially generated pk. I switched to uuid and now it's a little ugly in the url bar, but no data is exposed.

I meant the key handed off to the user for human interaction. As a user I would rather deal with order number 7356 than 7458e3a9-716b-4352-b2e4-b5b67d0c089b.

If there's no human interaction with the key, UUID is perfectly fine.

I've seen scenarios where UUIDs were used intentionally for this: to discourage users who get a full export of table data from referencing the key. If it's just a sequence of small integers, they might take the keys to be more permanent and intrinsically tied to the data, whereas UUIDs look more like computing artifacts.

Yeah, I know, this could easily be fixed by not friggin' showing surrogate keys to those users in the first place, but, well, data integration is ugly business.

Gotcha, just wanted to make sure I wasn't missing something. Makes total sense, thanks!
> But it could expose user/site data as any user could access that pk, or just guess at the next sequentially generated pk.

Using a UUID instead of a sequentially rolling integer ID isn't solving your problem, you're just doing security through obscurity. The real solution is implementing real authentication & authorization - not making the key harder to guess.
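A minimal sketch of what an authorization check looks like, in Python. The `RECORDS` store, the `owner` field, and the `Forbidden` exception are hypothetical; the point is only that access is decided by who is asking, not by whether they know the key.

```python
class Forbidden(Exception):
    """Raised when the caller may not see the requested record."""

# Hypothetical data store: sequential keys, each record has an owner.
RECORDS = {
    1: {"owner": "alice", "data": "alice's order"},
    2: {"owner": "bob", "data": "bob's order"},
}

def get_record(record_id, current_user):
    record = RECORDS.get(record_id)
    # Same error for "missing" and "not yours", so probing IDs reveals nothing.
    if record is None or record["owner"] != current_user:
        raise Forbidden(f"no access to record {record_id}")
    return record

print(get_record(1, "alice")["data"])
try:
    get_record(2, "alice")  # guessing the next key does not help
except Forbidden as exc:
    print("denied:", exc)
```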

> Using a UUID instead of a sequentially rolling integer ID isn't solving your problem, you're just doing security through obscurity.

A common sentiment, but not true if your ID is cryptographically random. It amounts to capability security, which is the right approach to security if used comprehensively.
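A sketch of the capability idea, assuming Python's secrets module as the source of randomness. The `SHARED` store and the function names are invented for the example. Note that this only holds for identifiers drawn from a cryptographic random source; not every UUID version is random (version 1, for instance, is built from a timestamp and node ID).

```python
import secrets

# Hypothetical store mapping unguessable tokens to resources.
SHARED = {}

def share(resource):
    # 16 random bytes (128 bits) from the OS's cryptographic RNG.
    token = secrets.token_urlsafe(16)
    SHARED[token] = resource
    return token

def open_shared(token):
    # Possessing the token is the authorization; unknown tokens get nothing.
    return SHARED.get(token)

token = share("order details")
print(open_shared(token))
print(open_shared("some-guess"))
```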