Naturally it depends on the business use case or product situation, but a lot of XYZ per client architectures fail because some things don't "scale down" enough while others don't "scale up" enough.

Warning: Broad generalizations ahead.

Most successful shard strategies work because each division is hopefully roughly uniform. It's kinda like with binary trees: they work best when balanced. Clients are often more of a long-tail, skewed distribution. You often have tons and tons of small clients where the per-client overhead could be painful, while at the same time your biggest clients might outgrow what you can support in a shard.
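
A rough sketch of that squeeze, with made-up numbers: the shard capacity, the per-client overhead, and the Zipf-like size curve are all assumptions picked for illustration, not measurements from any real system.

```python
# Hypothetical numbers, chosen only to show the shape of the problem.
SHARD_CAPACITY_GB = 500       # assumed max data a single shard can hold
PER_CLIENT_OVERHEAD_GB = 0.5  # assumed fixed cost of a dedicated deployment


def client_sizes(n_clients, largest_gb):
    """Zipf-like sizes: the client at rank r holds largest_gb / r."""
    return [largest_gb / rank for rank in range(1, n_clients + 1)]


def summarize(sizes):
    """Count clients at both painful ends of the distribution."""
    too_big = [s for s in sizes if s > SHARD_CAPACITY_GB]
    overhead_dominated = [s for s in sizes if s < PER_CLIENT_OVERHEAD_GB]
    return {
        "clients": len(sizes),
        "outgrow_a_shard": len(too_big),
        "overhead_exceeds_data": len(overhead_dominated),
    }


sizes = client_sizes(10_000, largest_gb=2_000)
report = summarize(sizes)
print(report)
```

Under these assumptions a handful of clients at the head don't fit in a shard at all, while more than half of the tail costs more in fixed overhead than it holds in data. Both failure modes show up in the same client list.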

To strawman your pods/client, dealing with 1k vs 1mm individual deployments is way different than dealing with a clientId column where the unique elements go from 1k to 1mm. Good indexing might be cheaper. But if you have different regulatory domains (HIPAA, GDPR, China, etc.) it can be easier to just run whole different data centers.
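
For the clientId-column side of that comparison, here is a minimal sketch using SQLite from the Python standard library. The `events` table, the `idx_events_client` index, and the row counts are hypothetical, and a real multi-tenant setup would also need access controls that this skips.

```python
import sqlite3

# One shared table for every tenant, keyed by a client_id column
# (hypothetical schema, in-memory database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (client_id INTEGER NOT NULL, payload TEXT)")
conn.execute("CREATE INDEX idx_events_client ON events (client_id)")

# 1,000 tenants with 5 rows each; adding a tenant is just more rows,
# not another deployment.
conn.executemany(
    "INSERT INTO events (client_id, payload) VALUES (?, ?)",
    [(cid, f"event-{n}") for cid in range(1000) for n in range(5)],
)


def rows_for_client(client_id):
    """Count one tenant's rows; the index keeps this from scanning everyone."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM events WHERE client_id = ?", (client_id,)
    )
    return cur.fetchone()[0]


# Ask SQLite how it would run the per-tenant query.
plan = " ".join(
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM events WHERE client_id = ?",
        (42,),
    )
)
print(rows_for_client(42), plan)
```

The query plan should mention `idx_events_client`, meaning a per-tenant lookup goes through the index instead of reading every tenant's rows. It says nothing about the regulatory case, where physical separation is the point.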

These balancing acts are what make data infra problems fun to work on.