by Ozzie_osman
816 days ago
Curious why you needed to shard at 7TB? I can imagine for some workloads, especially if it's write-heavy, you might start hitting constraints around vacuuming and things like that? But 7TB should be manageable on a (somewhat large and beefy) single machine.

First, the data size is growing and we didn't really know the growth rate in advance. Sharding gives you some flexibility in the infrastructure sizing. And yes, you don't want to wait until the last minute.
Second, it helps us spread the disk I/O. That's possible on a single machine too, if you're a little bit careful with disk types and sizes. But again, the overall load still grows.
Third, all the bulk operations take a long time on a single server. Each of the distributed servers takes about an hour to back up and 2-3 hours to restore. I'd feel uneasy if it was much longer.
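
To put rough numbers on that third point, here's a back-of-the-envelope sketch. It assumes backup and restore time scale linearly with data size, which is a simplification, and the shard count of 8 is purely hypothetical since the real number isn't stated here. The function name is made up for illustration.

```python
def single_server_hours(per_shard_hours: float, shard_count: int) -> float:
    """Estimate how long a bulk operation would take if one server held
    the data of all shards, ASSUMING time grows linearly with data size."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return per_shard_hours * shard_count


if __name__ == "__main__":
    shards = 8  # hypothetical; the actual shard count is not given above

    # Per-shard figures from the comment: ~1 h backup, 2-3 h restore.
    backup = single_server_hours(1.0, shards)
    restore_low = single_server_hours(2.0, shards)
    restore_high = single_server_hours(3.0, shards)

    print(f"backup on one server:  ~{backup:.0f} h")
    print(f"restore on one server: ~{restore_low:.0f}-{restore_high:.0f} h")
```

Real timings depend on disk and network throughput, compression, and how the backup tool parallelizes, so treat this only as an illustration of why the per-server numbers matter.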