by MBkkt
605 days ago
It's like saying that Postgres was designed for distributed setups just because there are large Postgres installations. We all understand that ClickHouse (and Postgres) are great databases.
But it's strange to call them designed for distributed setups. How about inserts that don't go through a single master? Scalable replication? And a bunch of other important features, not just the ability to keep independent shards that can be queried in a single query.
|
Scaling by the number of replicas of a single shard is less efficient than scaling by the number of shards. For ReplicatedMergeTree tables, because data is physically replicated, the practical limit is typically fewer than 10 replicas per shard: 3 replicas per shard are practical for servers with non-redundant disks (RAID-0 and JBOD), and 2 replicas per shard are practical for servers with more redundant disks. For SharedMergeTree (in ClickHouse Cloud), which uses shared storage and does not physically replicate data (but still has to replicate metadata), the practical number of replicas is up to 300, and inserts scale quite well on these setups.
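
To make the replicas-versus-shards point concrete, here is a rough back-of-the-envelope sketch in Python. It is not ClickHouse code: `Layout` and `footprint` are made-up names, and the model assumes data is spread evenly across shards and that every replica of a shard holds a full physical copy of that shard (the ReplicatedMergeTree case). It ignores compression and does not model SharedMergeTree, where data sits in shared storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical cluster layout, for illustration only."""
    shards: int
    replicas_per_shard: int


def footprint(layout: Layout, dataset_tb: float) -> tuple[int, float, float]:
    """Return (servers, data held per server in TB, total data stored in TB).

    Assumption: uniform sharding and full physical replication, so each
    server holds its whole shard regardless of how many replicas exist.
    """
    servers = layout.shards * layout.replicas_per_shard
    per_server_tb = dataset_tb / layout.shards
    total_stored_tb = per_server_tb * servers
    return servers, per_server_tb, total_stored_tb


if __name__ == "__main__":
    dataset_tb = 60.0
    # Same six servers, arranged two different ways.
    for layout in (Layout(shards=1, replicas_per_shard=6),
                   Layout(shards=3, replicas_per_shard=2)):
        servers, per_server, total = footprint(layout, dataset_tb)
        print(f"{layout}: {servers} servers, "
              f"{per_server:.0f} TB per server, {total:.0f} TB stored in total")
```

With the same six servers, the 1x6 layout leaves every server holding and ingesting the whole dataset, while the 3x2 layout cuts the per-server share to a third. Under these assumptions, extra replicas add copies and read capacity, but only extra shards reduce what each server has to store and write.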