| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by manigandham 2954 days ago

This is what partitions are for. They are used to scale out both publishing and consumption by assigning each consumer a single partition. This logic used to be complicated client-side code but now is included into Kafka itself, along with automatic offset saving on some interval (saved as entries in another topic).

Also consumers can be in a "consumer group" so you can have multiple clusters of consumers each reading the entire log separately, but shared within each cluster.

Also I'd recommend looking at Apache Pulsar for a next-generation architecture that combines Kafka's log semantics with the low-latency routing and individual message acknowledgements of message queues: https://pulsar.incubator.apache.org/