by thinkersilver
2838 days ago
The use cases overlap neatly with Kafka's. Everything from its use of ZooKeeper to its time- and storage-based retention tuning is similar. The announcement does not clarify why they use this over Kafka. Is it because Kafka doesn't scale to millions of logs on a single cluster, or because Kafka is not sympathetic to heterogeneous disk arrays containing SSDs and HDDs? I strongly suspect it may be the latency of writes at scale, but this is pure speculation. I don't know. If I understood why anyone might use this, I'd contribute to building language bindings for the APIs.
- It's designed to work with a large number of logs (roughly equivalent to partitions in Kafka); hundreds of thousands per cluster is common.
- Sequencer failover is very quick: when a sequencer node fails, typical failover time is less than a second.
- It supports location awareness and can place data according to the specified replication constraints (e.g. replicate it in 3 copies across 2 different regions and 3 racks).
- Because of non-deterministic data placement, it is very resilient to failures in terms of write availability.
- If a node/shard fails, it detects the failure and automatically rebuilds the data that was replicated to the failed nodes/shards.
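
To make the placement points above concrete, here is a rough Python sketch of constraint-aware, non-deterministic placement. This is not the project's actual algorithm or API; `Node`, `satisfies` and `pick_copyset` are hypothetical names made up for illustration, and the random retry loop is just the simplest thing that shows the idea: any healthy set of nodes that meets the constraint is acceptable, so a write does not depend on one specific node being up.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A storage node with its location labels (hypothetical model)."""
    name: str
    region: str
    rack: str


def satisfies(copyset, copies, min_regions, min_racks):
    """Check a candidate set of nodes against the replication constraint."""
    regions = {n.region for n in copyset}
    # Rack names are only unique within a region, so key on (region, rack).
    racks = {(n.region, n.rack) for n in copyset}
    return (
        len(copyset) == copies
        and len(regions) >= min_regions
        and len(racks) >= min_racks
    )


def pick_copyset(healthy_nodes, copies=3, min_regions=2, min_racks=3,
                 rng=None, max_tries=1000):
    """Pick a random set of healthy nodes that meets the constraint.

    Returns a list of nodes, or None if no valid set was found
    within max_tries random attempts.
    """
    rng = rng or random.Random()
    nodes = list(healthy_nodes)
    if len(nodes) < copies:
        return None
    for _ in range(max_tries):
        candidate = rng.sample(nodes, copies)
        if satisfies(candidate, copies, min_regions, min_racks):
            return candidate
    return None
```

Because the copyset is chosen per write from whatever nodes are currently healthy, losing a node or a whole rack only shrinks the pool of candidates; writes keep succeeding as long as some valid combination remains. A real system would presumably pick copysets far more cleverly than by random retries, but the availability argument is the same.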