
	by simonw 3863 days ago
	The worst thing about Kafka in my experience has been the consumer libraries for languages like Python. That's not to say that they are terrible or unusable, just that they don't have nearly as much polish as the core of Kafka itself. I'm very much looking forward to new client libraries built against the new consumer API.

4 comments

emmett9001 3863 days ago

https://github.com/parsely/pykafka

PyKafka is currently used in production at Parse.ly, and I've gotten feedback from a lot of other folks who are using it in production as well. The big benefit over kafka-python is that PyKafka supports multi-consumer groups that balance consumption via ZooKeeper with its BalancedConsumer interface. See this thread ( https://github.com/Parsely/pykafka/issues/334 ) for more detail on the differences between the two libraries.
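To make "balanced consumption" concrete, here is a minimal sketch of the general idea: the partitions of a topic are split among the members of a consumer group so that each partition is read by exactly one member. `assign_partitions` is a hypothetical helper written for illustration, not part of PyKafka's API, and it is not claimed to be the algorithm BalancedConsumer uses; the real consumer also tracks group membership through ZooKeeper, which this sketch leaves out.

```python
def assign_partitions(partitions, consumers):
    """Split partitions across consumers in contiguous, near-equal ranges.

    Hypothetical illustration only: both inputs are sorted so that every
    group member computing this independently reaches the same answer.
    """
    parts = sorted(partitions)
    members = sorted(consumers)
    if not members:
        return {}

    # Each member gets `base` partitions; the first `extra` members get one more.
    base, extra = divmod(len(parts), len(members))

    assignment = {}
    start = 0
    for index, member in enumerate(members):
        count = base + (1 if index < extra else 0)
        assignment[member] = parts[start:start + count]
        start += count
    return assignment


# Five partitions shared by two consumers in the same group.
two_members = assign_partitions(range(5), ["consumer-b", "consumer-a"])

# If one consumer leaves, rerunning the assignment hands everything to the other.
one_member = assign_partitions(range(5), ["consumer-a"])
```

Rerunning the assignment whenever a member joins or leaves is what keeps reads balanced across processes in the group.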

The PyKafka project is prioritizing support for Kafka 0.9 in the next few weeks/months. This includes ensuring that the existing consumers work against the updates to the 0.8.2 consumer API as well as implementing support for the new consumer API introduced in 0.9. Roadmap information can be found here ( https://github.com/Parsely/pykafka/blob/master/doc/roadmap.r... ).

czinck 3863 days ago

I'd say the Python library I used was borderline unusable. We stopped using Kafka (it was just a trial period and hadn't been rolled out to production yet) because of limitations in one of the most popular Python interfaces. The interface worked well enough and the API was good, but they didn't (and the bug tracker seemed to imply they wouldn't) support synchronizing reads across processes for the same group. What's the point of a distributed synchronized log if you can't do synchronized distributed reads of the log?

emmett9001 3863 days ago

Sounds like old news, but if this is still an issue, PyKafka does allow balanced reads across a consumer group. https://github.com/parsely/pykafka

czinck 3863 days ago

Yeah, it's no longer relevant for that project, but I like the ideas behind Kafka and will probably use it again, so I'll look at PyKafka before I look at kafka-python in the future.

vdnkh 3863 days ago

Same problem for .NET/C#. Nothing is established or mature enough for me to feel comfortable using it in production.

kppullin 3863 days ago

While it feels a bit hacky and unclean, you may want to try using IKVM (http://www.ikvm.net/) to translate and import the Java client into your .NET project.

Given the difficulty of building a client at all (distributed systems, race conditions, etc.), being able to rely on the widely adopted and supported official client is quite attractive.

In my test cases the performance is on par with running natively on the JVM, except when compression is enabled.

Another option is using the REST proxy and accepting the trade-offs that imposes.

felipesabino 3863 days ago

Same here with node.js.

All options are too painful: either use the buggy packages available or mix Java into the stack just for the Kafka bit. :(

joshbaptiste 3863 days ago

Hence why I use Groovy for any Kafka endeavors.

vorg 3863 days ago

The same problem exists with Python, C#, node.js, and Groovy.
