by staticassertion 1848 days ago
Another benefit of using sequential integers is that you can leverage a number of optimizations.

For one thing, you can represent a range of data more efficiently by just storing offsets. Instead of storing a 'start' and an 'end' at 8 + 8 bytes, you can store a 'start' and an 'offset', where the offset is sized to your window, e.g. 2 bytes, for 8 + 2 bytes total.
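To make the byte counts concrete, here's a rough sketch in Python. The layouts and the names `pack_range` / `unpack_range` are made up for illustration, and it assumes unsigned 64-bit ids with a window that fits in 16 bits.

```python
import struct
from typing import Tuple

# Hypothetical layouts: two 64-bit ids vs. one 64-bit id plus a 16-bit offset.
START_END = struct.Struct("<QQ")     # 8 + 8 = 16 bytes
START_OFFSET = struct.Struct("<QH")  # 8 + 2 = 10 bytes


def pack_range(start: int, end: int) -> bytes:
    """Encode the inclusive id range [start, end] as start + offset."""
    offset = end - start
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("range is wider than the 2-byte window")
    return START_OFFSET.pack(start, offset)


def unpack_range(blob: bytes) -> Tuple[int, int]:
    """Decode a packed range back into (start, end)."""
    start, offset = START_OFFSET.unpack(blob)
    return start, start + offset
```

The saving only holds as long as no range exceeds the window; wider ranges would have to be split into several entries.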

You can leverage those offsets in metadata too. For example, I could cache something like 'rows (N..N+Offset) all have field X set to null' or some such thing. Now I can query my cache for a given value and avoid the db lookup, but I can also store way more data in the cache since I can encode ranges. Obviously which things you cache are going to be data dependent.
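A minimal sketch of that kind of range cache, assuming the cached ranges never overlap and that both ends of N..N+Offset are inclusive. `NullRangeCache` and its methods are hypothetical names, not anything from a real library.

```python
import bisect


class NullRangeCache:
    """Hypothetical cache of id ranges whose field X is known to be null."""

    def __init__(self):
        self._starts = []   # sorted range starts (N)
        self._offsets = []  # offsets, parallel to _starts

    def add(self, start, offset):
        # Assumes the new range does not overlap an existing one.
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._offsets.insert(i, offset)

    def is_known_null(self, row_id):
        # Find the last range starting at or before row_id, then check its end.
        i = bisect.bisect_right(self._starts, row_id) - 1
        return i >= 0 and row_id <= self._starts[i] + self._offsets[i]
```

One entry covers the whole range, so a miss here just means "not cached, go ask the db", not "field X is non-null".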

Sequential ints make great external indexes for this reason. Say I tombstone rows in big chunks and move them to some other datastore: again, I can just encode that as a range, and then any lookup that falls within that range knows to go to the other datastore. With a UUID approach I'd have to tombstone each row individually.
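Roughly what the routing could look like, as a sketch. The `ARCHIVED` ranges and the store names are invented for the example, and it assumes the tombstoned chunks are sorted, non-overlapping, and inclusive on both ends.

```python
import bisect

# Hypothetical tombstoned chunks as inclusive (start, end) id ranges.
ARCHIVED = [(1, 50_000), (80_000, 120_000)]
_STARTS = [start for start, _ in ARCHIVED]


def store_for(row_id):
    """Return which store a lookup for row_id should be sent to."""
    i = bisect.bisect_right(_STARTS, row_id) - 1
    if i >= 0 and row_id <= ARCHIVED[i][1]:
        return "archive"
    return "primary"
```

Two tuples stand in for roughly 90k tombstoned rows here; with random UUIDs there is no ordering to exploit, so each row would need its own marker.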

These aren't universal optimizations but if you can leverage them they can be significant.

1 comment

Doesn't the offset approach run into trouble when sequence values get skipped due to rollbacks?
It's going to be an optimization that assumes some constraints on how you interact with your database.