Hacker News
by petermelias 3153 days ago
To clarify the use of the term "rebalancing" here for people who are familiar with Kafka's terminology: this is (I believe) referring to partition _storage_ rebalancing (redistributing partition replicas across brokers). It is _not_ referring to Kafka's notion of rebalancing partition assignments across a consumer group subscribed to a set of topic partitions.

The specific concern here is that Kafka's ISR (in-sync replica) strategy could leave a partition with a corrupt leader and cause messages to be truncated during recovery from a broker machine failure. The unclean leader election setting for Kafka brokers (unclean.leader.election.enable) is relevant here.
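
For anyone who wants to see the unclean leader election trade-off concretely, here is a toy model in Python. It is only a sketch: the broker ids, the log contents, and the helper names `elect_leader` and `messages_lost` are made up for illustration, and real Kafka election involves the controller, leader epochs, and high watermarks that this ignores.

```python
def elect_leader(isr, alive, allow_unclean):
    """Toy election: prefer a live in-sync replica; fall back to any live
    replica only when unclean election is allowed (hypothetical helper)."""
    for broker in sorted(alive):
        if broker in isr:
            return broker
    if allow_unclean:
        for broker in sorted(alive):
            return broker
    return None  # partition stays offline until an ISR member comes back


def messages_lost(logs, old_leader, new_leader):
    """Messages the old leader would have to truncate to match the new leader."""
    return logs[old_leader][len(logs[new_leader]):]


# Assumed state: broker 1 led the partition; broker 2 lagged behind and had
# already dropped out of the ISR when broker 1 failed.
logs = {1: ["m0", "m1", "m2", "m3", "m4"], 2: ["m0", "m1", "m2"]}
isr = {1}
alive = {2}

clean = elect_leader(isr, alive, allow_unclean=False)
unclean = elect_leader(isr, alive, allow_unclean=True)
print(clean)                            # None
print(unclean)                          # 2
print(messages_lost(logs, 1, unclean))  # ['m3', 'm4']
```

With unclean election off, the partition is unavailable but nothing is lost; with it on, the partition stays available and the messages the lagging replica never received are gone.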

Also, "better" is subjective: it depends on your configuration, requirements, and storage backends.

1 comment

Another point on 'rebalancing': when Kafka rebalances partitions, it has to copy all the data for the partitions that are moved. That might not be a big problem when retention is short. It gets much worse when the retention period is longer, because rebalancing can exhaust all the bandwidth (both network and disk I/O) in the cluster. People often don't realize this until they want to grow the cluster (add more brokers) to support increased traffic.
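
To put rough numbers on the retention point, here is a back-of-envelope sketch in Python. Everything in it is an assumption for illustration: the function name `rebalance_estimate`, a perfectly even spread of data before and after, a steady ingest rate, and a fixed copy rate per new broker. Real clusters will differ.

```python
def rebalance_estimate(ingest_mb_s, retention_hours, replication_factor,
                       old_brokers, new_brokers, copy_mb_s_per_new_broker):
    """Estimate data copied (MB) and wall-clock hours when growing a cluster.

    Assumes data is evenly balanced before and after, so the new brokers
    end up holding (new - old) / new of everything stored.
    """
    stored_mb = ingest_mb_s * retention_hours * 3600 * replication_factor
    added = new_brokers - old_brokers
    moved_mb = stored_mb * added / new_brokers
    # Assume each new broker pulls its share in parallel at the given rate.
    hours = moved_mb / (copy_mb_s_per_new_broker * added) / 3600
    return moved_mb, hours


# Hypothetical cluster: 50 MB/s ingest, replication factor 3, growing 6 -> 8.
for retention_hours in (24, 168):
    moved_mb, hours = rebalance_estimate(50, retention_hours, 3, 6, 8, 100)
    print(f"{retention_hours}h retention: {moved_mb / 1e6:.2f} TB, {hours:.1f} h")
# 24h retention: 3.24 TB, 4.5 h
# 168h retention: 22.68 TB, 31.5 h
```

Under these assumptions the copy volume scales linearly with retention, so going from one day to one week of retention turns a few hours of saturated links into more than a day.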