by lbradstreet 2951 days ago
I love Kafka and the log-oriented streaming model, but I often have to think twice before recommending it to clients who would have to manage the ops themselves. Having a managed service on GCP, alongside Confluent's existing cloud offering on AWS, really lowers the barrier to entry. There aren't really any serverless equivalents on AWS or GCP (Kinesis has a 7-day retention maximum, no key compaction, and less surrounding tooling such as KStreams/KSQL).
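For anyone unfamiliar with key compaction, here is a minimal sketch of the idea in Python: a compacted log keeps only the most recent value for each key, and a record with a null value (a tombstone) marks the key as deleted. This is a toy model, not Kafka's implementation; the `compact` function and the sample keys are hypothetical, and real brokers keep tombstones around for a configurable period before dropping them.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A record is (key, value); a value of None is treated as a tombstone.
Record = Tuple[str, Optional[str]]


def compact(log: Iterable[Record]) -> List[Record]:
    """Keep the latest record per key, in offset order, dropping deleted keys."""
    latest: Dict[str, Tuple[int, Optional[str]]] = {}
    for offset, (key, value) in enumerate(log):
        # A later record for the same key overwrites the earlier one.
        latest[key] = (offset, value)
    # Survivors keep their relative order by original offset.
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in survivors if value is not None]


log = [
    ("user-1", "a"),
    ("user-2", "b"),
    ("user-1", "c"),   # supersedes ("user-1", "a")
    ("user-2", None),  # tombstone: user-2 is deleted
    ("user-3", "d"),
]
print(compact(log))  # [('user-1', 'c'), ('user-3', 'd')]
```

This is what makes a compacted topic usable as a changelog or table snapshot: a consumer reading from the start sees the current state of every key without replaying the full history.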

May I ask why you wouldn't recommend it to teams that have to manage it themselves? I haven't used it myself but my team is currently looking at using it internally. The first project will be integrating it into our log pipeline between nodes and our logstash instances.
There is absolutely no reason other than the overhead: you have to self-manage the service, just as you would any other internal service.

If you are comfortable with operations, you'll be fine. Some teams are not good at ops, so outsourcing the problem and making the ops side someone else's issue can also be useful.

Self-hosting offers far more options when it comes to scaling and tweaking. On bare hardware it is cheaper and faster overall, although the up-front costs will be higher.

Kafka use cases are rarely elastic, so they don't gain that advantage in the cloud. Also, Kafka's lack of tiered storage makes it expensive when storing large volumes of data.
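A back-of-the-envelope sketch of why retention gets expensive without tiered storage: every retained byte sits on broker disks, multiplied by the replication factor. The workload numbers below are hypothetical, and the estimate ignores compression, index files, and free-space headroom.

```python
def broker_disk_bytes(ingest_mb_per_s: float,
                      retention_days: float,
                      replication_factor: int) -> float:
    """Rough total broker disk needed to hold the retained log."""
    seconds_retained = retention_days * 24 * 60 * 60
    bytes_per_second = ingest_mb_per_s * 1_000_000  # decimal megabytes
    return bytes_per_second * seconds_retained * replication_factor


def to_tb(num_bytes: float) -> float:
    """Convert bytes to decimal terabytes."""
    return num_bytes / 1e12


# Hypothetical workload: 10 MB/s ingest, replication factor 3.
for days in (7, 90):
    tb = to_tb(broker_disk_bytes(10, days, 3))
    print(f"{days} days retained: about {tb:.1f} TB of broker disk")
```

Under these assumptions, 7 days of retention needs roughly 18 TB across the cluster and 90 days roughly 233 TB, all on the same class of disk that serves live traffic.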

In addition to what others have stated, it’s also a question of productivity - when your team has to maintain ops, you are not doing something else. Is it your highest and best use to be tuning and maintaining Kafka? It could be, but only you and your team can really answer that.

In practice, it’s better to offload things that are not core to how your company makes money, until you hit constraint points from scaling. As one of the comments mentioned, at PB-scale processing, even bare metal may make sense. (But not always - one of my former employers went down this path early and ended up losing all productivity in R&D because people spent months on end fighting to keep that bare-metal setup alive. This really hurt future revenue growth and distracted engineering leadership from key changes in their industry.)

Like with most complex questions, the immediate answer is “it depends”.

It's not that I wouldn't recommend it; I would just think twice before recommending it to very small ops teams for certain use cases. It's not that hard to manage, but having it managed by Confluent is a great option.