by Xorlev 3365 days ago
> Well, there's knobs you can manually control to throttle the producer, but it's in your own hands. You're dancing at the edge of a cliff if a consumer has died and messages start expiring; there's nothing stopping data loss.

At least in how we run Kafka, our logs expire after 7 days, and our alerts go off pretty quickly if consumers fall behind. Additionally, we archive all our messages to S3 via a process based on Pinterest's Secor [1]. If we ever fell so far behind that we needed to start over, we could just run MapReduce jobs to rebuild datastores and then let consumers catch back up.
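To make the alerting idea concrete, here is a rough sketch of that kind of lag check. It isn't our actual monitoring: `PartitionStatus`, `check`, and both thresholds are made-up names and numbers for illustration, and it assumes you already collect per-partition offsets and the age of the oldest unread message from whatever tooling you run.

```python
from dataclasses import dataclass

# 7-day log retention, matching the setup described above.
RETENTION_SECONDS = 7 * 24 * 60 * 60


@dataclass
class PartitionStatus:
    """Hypothetical snapshot of one partition for one consumer group."""
    partition: int
    log_end_offset: int          # newest offset written to the partition
    committed_offset: int        # last offset the consumer group committed
    oldest_unread_age_s: float   # age (seconds) of the oldest unconsumed message


def lag(status: PartitionStatus) -> int:
    """Number of messages the consumer still has to read (never negative)."""
    return max(status.log_end_offset - status.committed_offset, 0)


def check(statuses, max_lag=10_000, expiry_margin=0.5):
    """Return alert strings for partitions that are falling behind.

    max_lag and expiry_margin are illustrative thresholds, not recommendations.
    An alert fires when the message lag exceeds max_lag, or when the oldest
    unread message has used up more than expiry_margin of the retention window.
    """
    alerts = []
    for s in statuses:
        if lag(s) > max_lag:
            alerts.append(
                f"partition {s.partition}: lag {lag(s)} exceeds {max_lag}"
            )
        if s.oldest_unread_age_s > RETENTION_SECONDS * expiry_margin:
            alerts.append(
                f"partition {s.partition}: unread data is past "
                f"{expiry_margin:.0%} of retention"
            )
    return alerts
```

The second condition is the one that matters for data loss: message-count lag alone says nothing about how close the unread data is to expiring, so the sketch checks age against the retention window separately.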

Since Kafka is explicitly a pub/sub replicated+partitioned log, it doesn't make sense to provide backpressure. A single ailing consumer would cascade failure through your system. If you need synchronous or bounded replication, Kafka isn't for you.

Having run Kafka in production for 2 1/2 years now, I can say with certainty that we've never felt like we were lacking in terms of features from Kafka itself, nor have we ever had a consumer fall so far behind that it could never catch back up. We do leverage our archives for batch jobs though.

[1] https://github.com/pinterest/secor

1 comments

I think it's worth qualifying my criticism more explicitly. I think Kafka doesn't have the right feature set for Kafka Connect. When you try to use it as a data pipe for real-ish time updates between two persistent stores, rather than as a persistent store in itself, it's inadequate.