by drob
3574 days ago
The data is sharded by customer and then sub-sharded by end user within the customer. For all but the tiny customers, 100% of the data on a logical shard will belong to the same customer. That means our subqueries will never touch data from more than one customer unless the customer is very small. (And, if the customer is that small, it should be easy to make the query fast anyway.)
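To make the layout concrete, here is a minimal sketch of a two-level shard key, not the actual implementation. The function names, the SHA-256 hash, and the fixed sub-shard count are all assumptions for illustration, and the case of tiny customers sharing a logical shard is left out.

```python
import hashlib

# Hypothetical number of sub-shards given to one customer.
SUB_SHARDS_PER_CUSTOMER = 8


def stable_hash(value: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def logical_shard(customer_id: str, user_id: str,
                  sub_shards: int = SUB_SHARDS_PER_CUSTOMER) -> tuple[str, int]:
    """Return a (customer, sub-shard) key.

    The first component pins the shard to a single customer; the second
    spreads that customer's end users across its sub-shards.
    """
    return (customer_id, stable_hash(user_id) % sub_shards)


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, logical_shard("acme", user))
```

Because the customer is part of the key, every row on a given logical shard belongs to one customer, and all of a single end user's rows land on the same sub-shard.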
May I ask how many logical shards you have per physical shard/machine? And what is the average size of a logical shard on disk?
You wrote "the data is sharded by customer and then sub-sharded by end user within the customer", but malisper wrote above that "clustering by time winds up being a much bigger win". Isn't that contradictory?