by kijin 989 days ago
If your database simply shards keys sequentially, it's going to get hotspots in a lot of use cases, like plain old integer keys and timestamps, not just UUIDv7. In that case it would be fair to say that your database is doing it wrong.

Fortunately, there's no rule that says you should shard your keys using the sequential part up front.

One of the rules for generating randomness from environmental sources is to throw away the high bits and only use the low bits. Distributed databases should do the same if they want a good distribution.
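
To make the point concrete, here is a rough sketch (not any particular database's logic) comparing shard assignment by the high bits versus the low bits of a UUIDv7-style key. The helper `uuid7_like`, the shard count, and the starting timestamp are all made up for illustration, and "shard by high bits" is only a simplified stand-in for sequential (range) sharding.

```python
import random
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count; a power of two so a bit mask works
SHARD_BITS = 3  # log2(NUM_SHARDS)


def uuid7_like(ts_ms: int, rng: random.Random) -> int:
    """Build a 128-bit UUIDv7-style integer (illustrative helper).

    Layout: 48-bit millisecond timestamp up front, then version,
    variant, and random bits.
    """
    value = ((ts_ms & ((1 << 48) - 1)) << 80) | rng.getrandbits(80)
    value = (value & ~(0xF << 76)) | (0x7 << 76)  # version field = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)  # variant field = 0b10
    return value


def shard_by_high_bits(key: int) -> int:
    # Uses the most significant bits, which come from the timestamp.
    return key >> (128 - SHARD_BITS)


def shard_by_low_bits(key: int) -> int:
    # Uses the least significant bits, which come from the random tail.
    return key & (NUM_SHARDS - 1)


def distribution(shard_fn, n=10_000, start_ms=1_700_000_000_000, seed=0):
    """Count how many of n keys, one per millisecond, land on each shard."""
    rng = random.Random(seed)
    return Counter(shard_fn(uuid7_like(start_ms + i, rng)) for i in range(n))


if __name__ == "__main__":
    print("high bits:", dict(distribution(shard_by_high_bits)))
    print("low bits: ", dict(distribution(shard_by_low_bits)))
```

With present-day timestamps the top three bits are all zero, so the high-bit version puts every key on shard 0, while the low-bit version spreads keys roughly evenly. The cost is that neighbouring keys no longer live on the same shard.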

2 comments

What distributed databases shard on the low bits? How do they do something like a range query?

The closest I’ve ever heard of is sharding based on a hash (e.g. CockroachDB can do this on request[1]) but most distributed databases with strong consistency (Spanner descendants in particular) default to “doing it wrong”.

[1]: https://www.cockroachlabs.com/docs/stable/hash-sharded-index...
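
For the range query question, a toy sketch of the trade-off with hash-based sharding might look like this. It is not CockroachDB's implementation; `HashShardedIndex`, `bucket_for`, and the bucket count are hypothetical names chosen for illustration. The idea is that writes spread across buckets, but a range query has to ask every bucket and merge the results.

```python
import hashlib
from bisect import bisect_left, bisect_right, insort

NUM_BUCKETS = 4  # hypothetical bucket count


def bucket_for(key: int) -> int:
    """Pick a bucket from a hash of the key, not from its leading bits."""
    digest = hashlib.sha256(key.to_bytes(16, "big")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


class HashShardedIndex:
    """Toy index: each bucket keeps its own keys in sorted order."""

    def __init__(self) -> None:
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, key: int) -> None:
        # Sequential keys scatter across buckets, avoiding a single hot one.
        insort(self.buckets[bucket_for(key)], key)

    def range_query(self, lo: int, hi: int) -> list:
        """Return all keys in [lo, hi], sorted.

        Every bucket must be consulted, because adjacent keys can
        live in different buckets.
        """
        hits = []
        for bucket in self.buckets:
            start = bisect_left(bucket, lo)
            end = bisect_right(bucket, hi)
            hits.extend(bucket[start:end])
        return sorted(hits)


if __name__ == "__main__":
    index = HashShardedIndex()
    for key in range(1000):
        index.insert(key)
    print(index.range_query(100, 110))
```

So range queries still work, but each one becomes a scatter-gather across all buckets rather than a scan of one contiguous range, which is the cost paid for avoiding the write hotspot.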

As I understood it, a big part of the premise of the post was that they see sequential storage (either in the DB or the cache layer) as desirable.