by 549362-30499 3037 days ago
I'm scratching my head about "we dont have money to pay for kafka+zk+monitoring(kafka,zk)". Kafka and Zookeeper are both open source. As are monitoring and alerting tools such as Prometheus. Surely the hosting and storage costs are similar. So what does this project offer its creator, other than a great deal of infrastructural debt and all the latent bugs of a roll-your-own solution that lacks a community?

Just setting that up will cost at least 2-4 GB of RAM, and we are stretched thin as it is; 2 GB of RAM would mean adding one more node to our Kubernetes cluster in gcloud.

My team and I understand the 300 lines of code that go into rochefort and can twist and modify it for our needs.

Performance will make or break our startup, which deals with real-time user behaviour analytics, and having written high-performance Java for a while, I know very well how much time I would have to spend looking at G1 logs to fine-tune it.

I am sure we won't use rochefort after we scale up, but for now I think it gives us greater velocity than Kafka (simply because modifying Kafka means spending a week on a simple change).

I want to be able to add more meta information to the header, read the files from another process, rsync them to my laptop and read them there, add custom reducers, etc. All those things will take me minutes with rochefort and days with Kafka.
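To make that concrete, here is a rough sketch of the kind of file-backed log I mean. The record layout (two 4-byte lengths, then JSON metadata, then the payload) and the names `append`, `scan` and `reduce_log` are made up for illustration; this is not rochefort's actual on-disk format or API.

```python
import json
import os
import struct
import tempfile

# Hypothetical record layout (NOT rochefort's real format):
# 4-byte big-endian metadata length, 4-byte big-endian payload length,
# then the JSON metadata bytes, then the raw payload bytes.
HEADER = struct.Struct(">II")


def append(path, payload, meta=None):
    """Append one record and return the byte offset it starts at."""
    meta_bytes = json.dumps(meta or {}).encode("utf-8")
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)  # make the reported offset explicit
        offset = f.tell()
        f.write(HEADER.pack(len(meta_bytes), len(payload)))
        f.write(meta_bytes)
        f.write(payload)
    return offset


def scan(path):
    """Yield (offset, meta, payload) per record; any process can run this."""
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                return  # end of file, or a partially written header
            meta_len, payload_len = HEADER.unpack(header)
            meta_bytes = f.read(meta_len)
            payload = f.read(payload_len)
            if len(meta_bytes) < meta_len or len(payload) < payload_len:
                return  # truncated tail, stop instead of guessing
            yield offset, json.loads(meta_bytes), payload


def reduce_log(path, fn, initial):
    """Fold a custom reducer fn(acc, meta, payload) over the whole file."""
    acc = initial
    for _offset, meta, payload in scan(path):
        acc = fn(acc, meta, payload)
    return acc


def demo():
    """Write two records to a temp file and sum their payload sizes."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "events.log")
        first = append(path, b"click", {"user": "a"})
        second = append(path, b"view", {"user": "b"})
        total = reduce_log(path, lambda acc, meta, payload: acc + len(payload), 0)
        return first, second, total
```

Adding a field to the header here means touching `HEADER` and two functions, and since the log is just a file, copying it with rsync and running `scan` on a laptop works unchanged. There is no replication, locking or crash recovery in this sketch, which is exactly the trade-off being discussed.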

I'm assuming he means money as in time, to install, configure, optimize, and monitor those distributed systems.
I'm self-taught, have a single dedicated server, and run a single-instance Kafka on top of ZK. Yes, I lose the benefits of replication, failover, etc. I don't need that though. The whole installation took me half an hour, learning Kafka took maybe 3 hours, and as long as my server's been up, Kafka's been up.

Granted, I am not monitoring Kafka, but I do monitor other processes.

The other nice thing is that now that I have ZK, other software that needs it can just reuse the same process.

I think using the maintenance cost as a reason to write your own tool is a short-sighted decision.

Even if that's the case, deploying and scaling a Kafka cluster is something that hundreds of companies have figured out and publicly written about. It's something that you can hire an experienced engineer to fix. When this thing runs into problems, they will be all new ones.
These days you can also find Kubernetes scripts that handle the Kafka installation, setup, routing, etc.

That just leaves the "simple work" of administering and tuning which, as you pointed out, is competence that's steadily growing in the industry.

As opposed to the time required to implement, debug, and support a custom solution?
I love the concepts Kafka defines so clearly, but the software is too complex and has dozens of knobs you have to adjust.

Simple code with "obviously no bugs" vs complex code with "no obvious bugs".

> I love the concepts Kafka defines so clearly, but the software is too complex and has dozens of knobs you have to adjust.

This is one of our biggest headaches, and it's not even that a Kafka server itself is so configurable. We have hundreds of teams writing client applications, and jumping on bridges because Kafka clients have poor configurations is getting old. Too many knobs to twiddle, but I guess that's what happens if you're expecting to be able to tweak for high performance.
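One way to cut down on the knob-twiddling, as a rough sketch: publish a vetted baseline and only let client teams override a short allowlist. The keys below are standard Kafka producer setting names, but the values, the allowlist and the helper `build_producer_config` are illustrative assumptions, not a recommendation and not how we actually do it.

```python
# Baseline producer settings a platform team might publish.
# Key names are standard Kafka producer configs; the values are illustrative.
BASELINE = {
    "acks": "all",
    "enable.idempotence": True,
    "compression.type": "lz4",
    "linger.ms": 5,
    "batch.size": 65536,
}

# Hypothetical policy: the only knobs application teams may change.
TUNABLE = {"linger.ms", "batch.size", "compression.type"}


def build_producer_config(overrides=None):
    """Merge team overrides into the baseline, rejecting non-tunable keys."""
    overrides = overrides or {}
    rejected = sorted(set(overrides) - TUNABLE)
    if rejected:
        raise ValueError(f"overrides not allowed for: {', '.join(rejected)}")
    return {**BASELINE, **overrides}
```

A team that wants bigger batches calls `build_producer_config({"linger.ms": 20})`, while an attempt to set `acks` raises an error before it ever reaches production.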