|
|
|
|
|
by jandrewrogers
1777 days ago
|
|
Definitely, I've run into this problem many times. UUIDs are pretty bulky data types in database terms, usually larger than most other types in a typical table on average and possibly larger than rest of the row after including the index overhead, which creates cache pressure. Logical columns in a table are commonly stored physically as a vector, of UUIDs in this case, expressly for the purpose of compressing the representation. This saves both disk cache and CPU cache. For queries, the benefit of scanning compressed vectors is that it reduces the average number of page faults per query, which is one of the major bottlenecks in databases. Also, some data models (think graphs) tend to be not much more than giant collections of primary keys. Using the smallest primary key data type that will satisfy the requirements of the data model is a standard performance optimization in databases. It is not uncommon for the UUID that the user sees to be derived from a stored primary key that is a 32-bit or 64-bit integer. |
|
The main feature of a uuid is it allows distributed generation. 32-bit or 64-bit integers are almost always sequential numbers. The sequential nature allows efficient page filling and index creation, but the contention involved in creating a sequence grows rapidly with scale.
So while a 128-bit uuid is larger than a 64 bit integer, this version allows for the bulk of the benefits of sequential integers while reducing the biggest drawback of contention at the point of creation.