| HN Mirror



	by htn 3566 days ago
	FWIW, you can get Kafka packaged as a fully managed and HA service from https://aiven.io on AWS and also Azure, GCE and DigitalOcean. But if Auth0 runs its entire operation on AWS, maybe Kinesis would have been a more natural transition.

3 comments

mpd 3566 days ago

Eh, Kinesis has some pretty significant trade-offs to know about if you are comparing it with Kafka (e.g. data retention time and write latency).


janczukt 3566 days ago

We need both an on-premise and a cloud story, so cloud-only solutions did not cut it for us.


PieterH 3566 days ago

The article is a little old. How has the system run since you deployed it? Do you have any interesting figures?


janczukt 3566 days ago

It continues to run beautifully. Since we rolled it out back in 2015 we have had zero issues with real-time logging. I have particularly fond memories of the first week after rollout; it felt like a vacation. I could finally get some sleep.


ZenoArrow 3566 days ago

I'm in a similar boat. I'm hoping to propose Kafka to help with some data replication and consolidation tasks, but it has to be both on-premise and as low maintenance as possible (low maintenance in the sense of the work local developers would do).

To anyone reading this with Kafka experience, do you have any tips/advice when it comes to maintaining a Kafka service?


hashmp 3566 days ago

Use 5 ZooKeeper nodes, on a separate set of servers.

Use configuration management such as Chef to allow you to quickly build new nodes and to roll out changes across the cluster. You will need to make tweaks. The Chef Kafka cookbook that is the top result on Google has a means of coordinating broker restarts across the cluster. Use Consul as a locking mechanism for this. You could use ZooKeeper, but Consul works well for auto DNS registration and auto discovery.

Use Yahoo's kafka-manager app to manage the cluster and to see what is going on.

Don't use the Kafka default of storing data in /tmp/. Your OS will periodically clean it.
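
The /tmp point is easy to check automatically. Below is a minimal sketch, not an official Kafka tool: it assumes a broker config in the usual `key=value` properties form and that Kafka falls back to `/tmp/kafka-logs` when neither `log.dirs` nor `log.dir` is set. The function names `parse_properties` and `risky_log_dirs` are made up for illustration.

```python
from pathlib import PurePosixPath


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def risky_log_dirs(properties_text):
    """Return the configured data directories that live under /tmp."""
    props = parse_properties(properties_text)
    # Assumption: log.dirs wins over log.dir, and the fallback is /tmp/kafka-logs.
    raw = props.get("log.dirs") or props.get("log.dir") or "/tmp/kafka-logs"
    dirs = [d.strip() for d in raw.split(",") if d.strip()]
    # Compare path components so that e.g. /tmpfoo is not flagged.
    return [d for d in dirs if PurePosixPath(d).parts[:2] == ("/", "tmp")]


if __name__ == "__main__":
    # A config that never sets log.dirs inherits the /tmp default.
    print(risky_log_dirs("broker.id=0\n"))
    # Only the directory under /tmp is reported here.
    print(risky_log_dirs("log.dirs=/var/lib/kafka/data,/tmp/extra\n"))
```

Something like this could run as part of whatever config management you use, failing the deploy when the returned list is not empty.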


Gigablah 3566 days ago

> Don't use the Kafka default of storing data in /tmp/.

That seems like a MySQL level of bad defaults.


cbsmith 3565 days ago

Five kind of kills performance compared to three, and doesn't map well onto AWS, where you generally have 3 or 4 AZs. I tend to go with three, but make sure you've got fully automated responses to failures.


ZenoArrow 3566 days ago

Five zookeepers? Seems like a lot. Why five? Is it hard to keep them active?

Thanks for the tips.


darkr 3566 days ago

Zab [the distributed consensus algorithm that powers ZK] shares some similarities with Paxos, and requires a quorum of nodes to be online.

If you want highly available ZK, your choices are 3, 5, 7... nodes, for which you can have 1, 2, or 3 nodes offline at any one time.

If you have one node fully down on a 3 node cluster, and there is even a tiny network blip or partition (as often happens in cloud environments) then you are down.
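
The arithmetic behind those numbers, as a small sketch. It assumes only what is said above, that the ensemble needs a strict majority of nodes online; the helper names `quorum_size` and `tolerated_failures` are illustrative and not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be offline while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(f"{n} nodes: quorum {quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

Note that 4 nodes tolerate only one failure, the same as 3, which is why the choices are odd numbers. With 3 nodes and one already down, there is no slack left for a network blip; with 5 there is still one node to spare.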


abritishguy 3566 days ago

Kinesis is very poor
