Hacker News new | ask | show | jobs
by foxylion 2953 days ago
The article contains a footnote about UUIDs as primary keys.

> UUID as a primary key is a terrible idea, by the way — cryptographic randomness is utterly designed to kill locality of reference, hence the performance penalty

Is there anyone who can go a little bit more in detail?

We planned to migrate our database to use UUIDs as primary keys. This will allow creating new rows on clients knowing the new primary key before sending them to the server (simplifying client and server code).

10 comments

(The author is assuming that the primary key controls disk layout, which is usually true.) One advantage of using an incrementing integer is that rows will be ordered on disk based on when they were created. This often helps performance. If a query asks for 25 consecutive rows, there is a good chance they will all be on the same page. If you use UUIDs, then they could be on 25 different pages and you will have to do 25x the disk IO to handle the query.
> One advantage of using an incrementing integer is that rows will be ordered on disk based on when they were created.

Well, kind of. A lot of people think the auto incrementing integer function in many RDBMSs will always increase, or will never have gaps. It's likely but not guaranteed that n+k was created after n. If you really need to store the creation date, then you should store that in a datetime/timestamp column.

> If a query asks for 25 consecutive rows, there is a good chance they will all be on the same page. If you use UUIDs, then they could be on 25 different pages and you will have to do 25x the disk IO to handle the query.

This is true, but it also means that if you need to write 25 different rows, it will be in 25 different pages. That sounds bad because non-sequential writes are slower, but you have to remember that it could be 25 different connections trying to write! In other words, you create a hot spot with sequential inserts. If that's the end of the table, you'll have threads constantly waiting for other processes to do inserts since inserts lock the page being inserted.

So, yes, clustering on a UUID can cause problems (fragmented indexes, inefficient reads), but clustering on an autoincrement can also cause issues depending on your work load.

In reality, what you need to do (in the general case) is cluster on your business key even if it's not the primary key for your table.

> It's likely but not guaranteed that n+k was created after n.

This is true in mysql if you rollback a transaction, or use a INSERT INTO ... ON DUPLICATE KEY UPDATE.

In the first, the rollback doesn't revert the sequence, in the second the "insert part" will always increase the number, even if there is a duplicate to update.

My point is that nothing stops you from modifying the value of an auto increment column, nor from inserting directly with a specific value. Yes, rollbacks don't roll back consumed values, but an auto increment column isn't immutable and the table isn't required to use the next value.

I've seen an application do things like an INSERT ROLLBACK SELECT LAST_INSERT_ID() to "reserve" IDs... or even perfectly acceptable things like reserving IDs out of a SEQUENCE. Those weren't all MySQL systems, but it did lead to confusion sometimes why gaps might appear or why timestamps might be "inconsistent".

The above one was a potential problem through 5.7 though, as it was possible to reuse some values since MySQL kept the auto increment value in memory only. INSERT followed by a ROLLBACK, then restart the server and you could get reused IDs. It's rare, but I've seen it. However, but it looks like they changed it with 8.0 to save the auto increment value to a system table now. That's a good thing.

> One advantage of using an incrementing integer is that rows will be ordered on disk based on when they were created

If each identifier starts with a logical time, say lamport timestamp, you can still get the same ordering effect without incrementing integers in a centralized place somewhere.

I've written my own UUID generation function which uses the current timestamp with microsecond precision for the first 64 bits and a random value for the last 64 bits. So far it's been a great success.

Collisions are extremely unlikely unless you have Google scale and generated UUIDs are mostly ordered.

I've also done this at a medium ish scale. We had to be careful with how the uuids were generated, though. Specifically which portion of the bytes contained the timestamp (and in what order) since different databases store uuids differently.
Have used this to great results in postgres, mysql, redis, etc.
> Is there anyone who can go a little bit more in detail?

* UUIDs are way more painful than serials to recognise, remember, input or transmit especially if you're not dealing with huge tables. "18574" is easy to read/grok, "21caeffa-0fca-4f4e-b845-46ef0576e42a" is not.

* UUID are 128 bit instead of 32 for most serial PKs by default, this may or may not matter. Note that this doesn't just impact the table itself (lowering data locality and thus performanes: less stuff fits into caches) but also any FK as well as wire transmission (where the size explodes as you're going to transmit a hex version of the UUID so always at least 32 bytes, a decimalised u32 maxes out at 10 bytes).

* because UUID are random data they're intrinsically non-ordered (as opposed to serial ids which… are), this means your writes are all over the place and clustering is defeated, this adversely impacts your reads and writes in SQL dbs (some other DB techs especially distributed ones prefer the scattering: https://news.ycombinator.com/item?id=14524174)

The latter can he mitigated by using "ordered UUIDs": you can generate UUID1 (nominally time-based) such that the final value has a sequential "head" and a random "tail", either by taking control over the process or by generating a regular UUID1 and rejiggering it a bit: https://stackoverflow.com/questions/412341/how-should-i-stor...

> UUIDs are way more painful than serials to recognise, remember, input or transmit...

Completely agree, but I've found this can be a non-technical "feature" too. Serial integer primary keys are much more susceptible to human error when doing any sort of direct database manipulation.

Make a typo on a integer PK? Wrong user gets deleted. UUID typo? Row not found (almost certainly).

Another source of error I've seen is when someone in sales asks "Hey, can you remove User #1234?", but they really meant Customer #1234." With UUIDs, there's no "collision" between the tables.

Clearly there are better process/tool-based ways to prevent these types of mistakes, but it's a useful side effect of UUIDs.

You can somewhat mitigate the typo problem with integers by encoding them to the outside with parity included.

An 8 bit parity should be able to easily tell any possible typo in a 32 or 64bit integer and even correct errors. You could even put the parity into the lowest 8 bit of the integer.

I'm currently working on this for a project of mine to not only prevent typos but tolerate them by using the parity and using letters that are far apart on my QWERTZ keyboard (bit local but it should work for most keyboard sets and I have parity to fall back on)

Yes but that's a full UUID, using a 64bit integer with some encoding is far shorter.
> UUIDs are way more painful than serials to recognise, remember, input or transmit especially if you're not dealing with huge tables. "18574" is easy to read/grok, "21caeffa-0fca-4f4e-b845-46ef0576e42a" is not.

One of the reasons we use them is because theyre not easily recognized or sequential.

> The latter can he mitigated by using "ordered UUIDs": you can generate UUID1 (nominally time-based) such that the final value has a sequential "head" and a random "tail"

Specifically, in PostgreSQL such UUIDs can be generated using uuid_generate_v1mc()

you can use uuid_to_bin for this, check the docs: https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functi...
I would suggest to write your own UUID generator. Instagram published their own approach [0], we are using something very similar to it. It seems to originate at Facebook, as the IDs generated from both instagram and FB are very similar. It can be completely automated in Postgres (probably in Mysql too).

On the plus side, because it's timestamp based, you can use the generated IDs in sorting and paging as there is a guarantee that each passing second will yield larger bigint. One caveat. If you are going to use it in Javascript, make sure you send it as string as Javascript only supports 53bit integers (due the fact that all integers in Javascript are floating points).

[0] https://instagram-engineering.com/sharding-ids-at-instagram-...

Or you can shuffle it and convert to binary using uuid_to_bin like in this example uuid_to_bin(uuid(), true): https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functi...
The one time I used UUIDs as primary keys I quickly regretted it. They're huge, they're not possible to manually enter ("which row ID was causing that bug?") and they make foreign key relationships ugly and large too.

If you want to generate IDs independently of the database you can do so using a "ID generator" mechanism.

Set up three redis instances (or MySQL or anything else that can increment a counter). Have each one increment by three each time. Start then at 0, 1 and 2.

Now you can ask any of those theee instances for a new ID and you'll get one that has not been used before, thanks to them being offset from each other.

I first saw this technique used by Flickr when they switched to a Shaffer database.

I believe one of the sought after features when people want to generate ids outside the database is for offline apps, so there is no way to connect to outside counters. The client should have a way to locally generate an id that is final and will be used for the rest of the entity's life. This simplifies the synchronization with the backend because otherwise there must be a (temporary) client id and a final true id. This leads to very convoluted code that deals with two logically different entities, the "pending" ones and the "synced" ones.
UUID (or GUID on SqlServer) are essentially 128-bit integers. The fact they don't have inherent meaning beyond uniquely identifying a row and establishing referential integrity is a plus. In my opinion the primary key is for the database's needs; a human-friendly number or code for lookup is both optional and trivial to implement so when there's a requirement for an easy reference number or code for a record I use both a GUID for the primary key and a unique (or autonumber) field for human reference.
Is "Shaffer" a typo of "sharded"? If not I'm curious what the term means (all that google turns up are a bunch of papers by a guy with that name)?
There are functional benefits for having UUIDs as a primary key, but yes there are performance impacts on writes and ORDER BY. The best way to find out how it will impact your application is to have performance tests in place, and test out the primary key change in a development environment. I do not think you'll able to determine the impact on performance/scalability based on "pure thought".
Why would you want to ORDER BY on a UUID field? Not trolling, I honestly can't think of a reason why you would want to do this. Secondly, aren't UUIDs treated by the database engine as a 128-bit integer? If they are being treated as varchar fields then I can see how this would affect performance negatively but again, I question if this is really the case.
What if you just want a stable row ordering, and don't care what that ordering is?
For random (and not timespace prefixed) uuids, you can end up hitting more blocks because if locality of reference, if you are using b+trees. If you are using an lsm index, you get blocks of data written at the same time in the index anyway, so your "slow" disk isn't so bad, because that is in your cache already. For b+trees and random uuids, data in blocks are basically scattered everywhere. So your index lookup of 1 billion items could hit 1 billion leaf blocks, instead of 1 billion / entries per leaf.
Why not generate an UUID field that’s unique, but keep a surrogate integer primary key on the server? You can join on either one on the server, but keep track of the rows using the UUID. Asking as someone with no SQL expertise (relative to the HN crowd).
That's generally the route I've gone in the past, particularly for complex data models. The entity has a uuid field with a unique constraint along with an auto-incrementing int/bigint for a primary key. The UUID is what gets exposed for any external/API usage, but all internal join logic leverages the int primary key.

That way you mitigate the potential performance impact of doing complex joins on UUID fields. While also gaining a bit of flexibility in any future change management process by decoupling your internal database ids from any publicly exposed id. So instead of e.g. having to coordinate with external applications to ensure things don't break when you switch your id from an int to a bigint, you keep the uuid consistent and internal database optimizations and changes stay transparent to stakeholders of the database.

I use UUIDs in mysql as primary keys in tables with a few million records, and I'm about to migrate a few tables with billions of records from bigint to uuid.

Never saw any real problem. I use char(36), as it's easier to query manually when needed, but I' looking into binary(16) for those billion row tables.

I think most issues with fragmented tables are old problems since ssd, and the overhead is something you will only notice in benchmarks.

If your keys ate supposed to be uuids, the just use them and get the hardware to handle it. In reality, you're most likely to be affected by a zillion other things before an uuid as pk.

> I use char(36), as it's easier to query manually when needed, but I' looking into binary(16) for those billion row tables.

I would use whatever data type your RBDMS's UUID generator returns or the programming language your application is written in. If your RDBMS supports a UUID or GUID data type, however, I would 100% use that because you'll invariably have functions which help you deal with it.

Remember, however, that many (most?) RDBMSs store records in pages (or blocks) of a fixed size typically between 4KB or 8 KB, and they won't allow a record to span a page (usually when a record is too long for one page, non-key data will be moved to non-paged storage which is slower). In other words, if you reduce your record size by 20 bytes you might not actually see as big a change as you'd expect. You'd be storing less data per record, but you're maybe not changing the records per page. You're not increasing the efficiency of your data store at all because of how the data is physically stored. It also means that the answer might be different for each table since each table has a different row size.

Bottom line, however, is that I would favor storing UUIDs the way your particular RDBMS vendor tells you they should be stored. If your application has particular problems with storing UUIDs that way I would look at alternatives, but generally the RDBMS vendors have thought about this a little bit at least.

MySql doesn't have a UUID data type, the UUID() function returns a varchar. The way you store it is mostly preference and driver defaults. The C# driver used to handle binary(16) as Guids, then they deprecated that in favor of CHAR(36). But when dealing with a bilion rows, each byte counts and I'll favor binary(16) because it's smaller and that helps with the index sizes and memory usage.
but it supports conversion to binary (+shuffle to keep order) using "uuid_to_bin"
Yes, but this is new to mysql 8, which is pretty recent. Also, I always use uuid v4 to explicitly prevent any ordering.
I can't comment on uuid as I never used that for ids. But, I can tell you that asking for a next ID from a postresql primary key sequence will make sure it is never used again because it is an atomic operation. Once a next value is requested and you keep it, you can be sure it will not be generated again. I've used this for many things.
we use both uuid's created in our application layer and stored as char(35) and also uuid's produced in the database stored as uuid data types. We have tens of millions of records and have no issues. Once we get upgraded to PG10 and can use hash indexes for them we expect a further speedup.

I don't know anything about the author, but every time in the past ive heard someone say that uuid's as keys are a problem it turns out theyve never actually tried it, theyre just saying what theyve heard in the past. Theoretically they should be worse than they are - and theres no doubt that a regular int/bigint would be faster - but the truth is there are so many other things that are going to slow you down before that.

uuid's aren't guaranteed to be unique at generation, there is still a non-zero chance of it having a collision. using it as a primary key to be generated by the database helps mitigate that, as there will normally be a uniqueness clause on the index.

creating that uuid on the client likely will not accomplish what you're hoping.

It's pretty unlikely to get a collision[1], and the client should be able to handle the gracefully (regenerate the UUID and retry).

[1] https://www.quora.com/Has-there-ever-been-a-UUID-collision

The rarity of UUID collisions depends on the quality of implementation. The Quora answers assume UUIDv4 with a high-quality RNG.

I was able to reproduce UUIDv1 collisions at will when the timestamp had microsecond resolution and the clock sequence had to be generated randomly each time. That is, I simply had to get two processes to generate the same fourteen bits within a microsecond.

That only seems possible if both processes were running on the same physical machine (and hence share the same input MAC address).

It's a nasty corner case, but I'm sure the UUID designers considered it a valid tradeoff, since I believe the algorithm was designed for a distributed scenario. In the single-machine case an atomic counter would be a much easier solution with very reasonable efficiency anyway. Still, it might have been clever to also include the local process id in the UUID, I wonder why they didn't to that.

At any rate the problem is easily worked around by running the two processes on different machines, i.e. ensuring you have at most one UUID generating process per host (with respect to the database table in question).

> It's pretty unlikely to get a collision

unlikely is not zero, which was why I commented. I'd hate to rely on uniqueness of something that has a chance of not being unique.

If you generate 1 billion UUIDv4s a second, it would take, on average, 85 years for you to produce a duplicate, and the resulting list of UUIDs would take up ~45 exabytes. And keep in mind that even if inserting a row fails because you've somehow managed to generate a duplicate UUID, it is trivial to make a new UUID and retry. Since the database enforces the uniqueness constraints of primary keys, I'm hard pressed to come up with a scenario in which generating a duplicate UUID would actually do anything serious.
GP pointed out the special version of the algorithm that handles collisions gracefully. Note that this algorithm ("open addressing") is so frequently used that you will probably find a variation of it in pretty much any piece of software you have ever used. It's a well-understood method, not only from a practical perspective but also in terms of theory; check out e.g.: https://en.wikipedia.org/wiki/Linear_probing
Primary keys on postgresql are unique (and non null). It’s not possible to change that behavior as it’s the definition of a PK field.

You can thus create more UUID fields that are unique and non-null by specifying that instead of “primary key” on column creation.

https://www.postgresql.org/docs/8.1/static/ddl-constraints.h...

Can't you just enforce uniqueness anyways for that field? And then have the client retry?
Yes, we are planning to just fail, as the chance of a collision is really low and the user will then be able to just retry creating the object.
You can easily guarantee uniqueness within the system without any coordination.
> uuid's aren't guaranteed to be unique at generation

If they are not unique, they are not really UUIDs. In which case you should tweak the algorithm to make uniqueness guaranteed. Like add a client id and its logical time in there.

The difference here seems to be the exact definition of "unique".

A UUID has a finite length, so if we generate N new UUIDs in a finite time-interval it seems clear that - in theory - we must have collisions for large values of N (or even infinite N), regardless of how clever we are in seeding our random number generator. But I don't think this can be fixed without using a variable length value or refusing to mint new UUIDs once the available bits are used up.

Of course in practice that should not really happen; at least if we only run on hardware from vendors where we can assume that every MAC address will be unique, the only way to actually get a colission - in practice - would be by generating more than 2^B UUIDs on the same machine within a fairly short timeframe and fairly large B.

EDIT: In v1 of the algorithm it seems that B=14 and the "short timeframe" is the smallest resolution increment your system clock supports. That is if we assume the "uniquifying" clock sequence is produced by incrementing a counter. So we can say that for practical purposes, a collision is impossible on a modern system unless we have invalid MAC assignments to our hardware, unreasonable transaction rate, or an incorrect implementation of the UUID algorithm.

Am I missing something?

> by generating more than 2^B UUIDs on the same machine within a fairly short timeframe and fairly large B

Which can be made impossible.

IDs are part of the finite system and don't exist on their own. And the system can perform a finite number of operations both concurrently and in a finite time-interval with some amount of uniqueness distinguishing nodes and other useful properties. Making it possible to always find an id generating algorithm for this system that can never have collisions.

I agree, if we are allowed to make the UUID arbitrarily large (and not be limited to the 128 bits from the "official" UUID algorithm), it should always be possible to set B to a large enough value so that our scheme assigns a unique and finite-sized UUID to every possible state of the system, since all the system's parameters (number of servers, time, etc) should be discrete and finite. I.e. basically make B large enough that 2^B becomes greater than the number of discrete timesteps in our experiment. That's an interesting observation.
You can get away with a lot less than 128 bits for most systems. It's only that big because it tries to be universal.