Hacker News
by WorldMaker 1257 days ago
ULIDs sort semantically differently from UUIDs. UUIDs have 6+ different sort orders depending on who you ask and which library call you make. All of those sort orders are different from what you'd get by just sorting the string forms of UUIDs. There are cross-platform endianness issues when sorting because of the struct field order that GUIDs were originally designed to be grouped by.
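To make one of those mismatches concrete, here is a small sketch using Python's standard `uuid` module. It compares only two of the possible orders (the canonical string form and the little-endian GUID struct layout exposed as `bytes_le`); the two example values are made up for illustration.

```python
import uuid

# Two illustrative UUIDs that differ only in the first (time_low) field.
a = uuid.UUID("00000001-0000-0000-0000-000000000000")
b = uuid.UUID("01000000-0000-0000-0000-000000000000")

# Order by canonical string form.
by_string = sorted([b, a], key=str)

# Order by the little-endian GUID struct layout, where the first three
# fields are byte-swapped relative to the string form.
by_guid_struct = sorted([a, b], key=lambda u: u.bytes_le)

string_order_is_a_then_b = by_string == [a, b]
struct_order_is_b_then_a = by_guid_struct == [b, a]
```

The same two IDs come out in opposite orders depending on which representation does the comparing.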

ULIDs have a single, consistent sort that matches both byte patterns and string representation. That's a huge semantic difference.
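A minimal sketch of why that holds, assuming the layout from the ULID spec (48-bit big-endian millisecond timestamp, 80 bits of randomness, 26 characters of Crockford base32). The helper names `ulid_bytes` and `encode` are mine, not from any particular library.

```python
import os

# Crockford base32 alphabet; its characters are already in ascending ASCII order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def ulid_bytes(timestamp_ms, randomness=None):
    """Build the 16-byte form: 6 timestamp bytes followed by 10 random bytes."""
    if randomness is None:
        randomness = os.urandom(10)
    return timestamp_ms.to_bytes(6, "big") + randomness

def encode(raw):
    """Encode 16 bytes as a 26-character Crockford base32 string."""
    n = int.from_bytes(raw, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[n & 31])
        n >>= 5
    return "".join(reversed(chars))

# A handful of IDs with fixed inputs so the comparison is reproducible.
raw_ids = [
    ulid_bytes(1_700_000_000_500, bytes([9]) * 10),
    ulid_bytes(1_700_000_000_000, bytes([255]) * 10),
    ulid_bytes(1_700_000_000_000, bytes([1]) * 10),
    ulid_bytes(1_600_000_000_000, bytes([7]) * 10),
]

sorted_by_bytes = [encode(r) for r in sorted(raw_ids)]
sorted_by_string = sorted(encode(r) for r in raw_ids)
orders_match = sorted_by_bytes == sorted_by_string
```

Because the encoding is fixed-width, big-endian, and uses an alphabet in ASCII order, comparing the bytes and comparing the strings give the same answer.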

Sure, ULIDs make no claims to accurate sorting or total ordering or monotonicity beyond a single machine, but ULIDs aren't designed to be a Snowflake/Thrift replacement; they are designed to be a UUID replacement. You are correct that they make no more guarantees than UUIDs, but they don't have to: that was out of scope of their design. I can understand how that makes it less useful for some of your applications, but that doesn't make it not useful for all sorts of applications. (Including many applications that once used UUIDs successfully but want something with a cleaner string representation and fewer cross-platform sorting headaches.)


ulid1 < ulid2 does not reliably tell you anything about the ordering of events one and two (without careful architecture that most systems don’t have), but people read the spec and want to believe that it does. I think this makes the provisions for sorting more misleading than helpful.

It would help if the spec talked about clock skew and other issues to design and test for; ordering has a heavy cost.
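A sketch of the clock skew problem, looking only at the 48-bit timestamp prefix of the ID. The two machines, their skew values, and the event times are hypothetical numbers chosen for illustration.

```python
def ulid_time_prefix(timestamp_ms):
    """First 6 bytes of the ID: the big-endian millisecond timestamp."""
    return timestamp_ms.to_bytes(6, "big")

# Hypothetical "true" times: event two really happens 50 ms after event one.
true_time_event_one = 1_700_000_000_000
true_time_event_two = true_time_event_one + 50

# Hypothetical clock errors: machine A is accurate, machine B runs 200 ms slow.
skew_machine_a = 0
skew_machine_b = -200

id_one = ulid_time_prefix(true_time_event_one + skew_machine_a)
id_two = ulid_time_prefix(true_time_event_two + skew_machine_b)

events_in_order = true_time_event_one < true_time_event_two
ids_in_order = id_one < id_two
```

Here the events happened in order but the IDs sort the other way, since each ID only reflects the clock of the machine that generated it.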

You still seem to be assuming that the only reason to sort IDs is a total ordering of events. There are plenty of other reasons someone may want a good enough partial order of events or a sortable ID of things that aren't events in a system.

It might help if the spec mentioned things like clock skew and distributed time keeping, but mostly just to state that those problems are out of scope of the spec itself. You are right, ULID itself wasn't designed for those specific classes of problems. I think you are just too easily dismissing the classes of problems that ULID does solve reliably well.

I admit I don’t see the use case for IDs that are mostly ascending.
If you are curious, just some of them:

- UIs that want minimal visual thrashing in "user time" (wall clock time)

- Databases and B-Trees and other storage with primary key indexes/clusters that offer slightly better writes/clustering for "roughly but not necessarily exactly in order insertions": key/value stores, document databases, SQL databases

Wall clock time is plenty of time for a UI to rely on a partial ordering.

Databases are generally designed for arbitrary order insertions; it's just common that they are most optimized/efficient for "roughly in-order". A partial order is generally good enough to opt in to most of the optimizations/efficiencies and reduce worst cases, especially if the "out-of-order" insertions occur below wall clock time and fall within things like journaling-based transaction semantics, rather than competing with clustered inserts, reindexing scanners/services, and so forth.
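A rough way to see the "roughly in-order" effect, using a plain sorted Python list as a stand-in for a clustered index (an assumption for illustration; a real B-tree behaves differently in detail). The function name, jitter size, and key counts are all made up.

```python
import bisect
import random

def mean_distance_from_tail(keys):
    """Insert keys one at a time into a sorted list and return the average
    number of existing entries that sit after each insertion point."""
    sorted_keys = []
    total = 0
    for key in keys:
        pos = bisect.bisect_right(sorted_keys, key)
        total += len(sorted_keys) - pos
        sorted_keys.insert(pos, key)
    return total / len(keys)

rng = random.Random(0)
n = 2000

# Roughly time-ordered keys: position i plus a small jitter of up to 5 ticks.
roughly_ordered = [i + rng.randint(-5, 5) for i in range(n)]

# Keys with no relationship to insertion time, as with random UUIDs.
fully_random = [rng.randrange(n) for _ in range(n)]

near_tail = mean_distance_from_tail(roughly_ordered)
scattered = mean_distance_from_tail(fully_random)
```

With the jittered keys every insert lands within a few entries of the end, while the random keys land all over the structure, which is the pattern that tends to cost more in page splits and cache misses.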