Hacker News
by dpkp 3658 days ago
kafka-python maintainer here. Our library is designed to be correct first, easy to use second, and fast third. It should not surprise anyone that using C extensions improves Python performance. I have avoided requiring C compilation in kafka-python primarily because I've found that very few Python users care about processing >10K messages per second per core. (Remember that in Python without C extensions you are generally bound to a single CPU, so spinning up multiple processes usually improves performance. See the multiprocessing module.) I've also found the Python infrastructure for distributing C extensions to be not easy (see goal #2 above).

But that is changing! I would definitely consider leveraging C extensions for wire protocol decoding, given the recent improvements to wheel distribution on Linux. I'm not sure whether I would go so far as to delegate the entire client to a C extension. Part of the fun of Python is that you can play with all of the guts at runtime. I've found users are very willing to hack up kafka-python internals to help debug issues. I don't think I could expect the same community involvement if it were all distributed as a compiled C extension. But I could be wrong.

Anyways, always fun to read benchmarks. I hope kafka-python makes someone out there smile. That's the best benchmark in my book.
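
On the multiprocessing point above, here is a rough sketch of what "spin up multiple processes" can look like. It uses only the standard library and does not touch kafka-python at all. The names `decode_batch`, `make_batch`, and `decode_in_parallel`, and the 4-byte length-prefixed framing, are made up for illustration; they stand in for whatever CPU-bound per-message work you actually have.

```python
import multiprocessing as mp
import struct


def make_batch(n: int) -> bytes:
    """Build a fake buffer of n messages, each prefixed by a 4-byte big-endian length."""
    body = b"x" * 100
    return b"".join(struct.pack(">i", len(body)) + body for _ in range(n))


def decode_batch(payload: bytes) -> int:
    """Hypothetical CPU-bound work: walk the buffer and count the messages in it."""
    count = 0
    offset = 0
    while offset < len(payload):
        (size,) = struct.unpack_from(">i", payload, offset)
        offset += 4 + size  # skip the length prefix and the message body
        count += 1
    return count


def decode_in_parallel(batches, workers=4):
    """Fan the batches out to a pool of worker processes, one batch per task."""
    with mp.Pool(processes=workers) as pool:
        return pool.map(decode_batch, batches)


if __name__ == "__main__":
    batches = [make_batch(1000) for _ in range(8)]
    print(sum(decode_in_parallel(batches)))
```

Each worker is a separate interpreter with its own GIL, so the pure-Python decoding can run on several cores at once. Whether that beats a single process depends on how expensive the per-batch work is compared with the cost of pickling batches across process boundaries.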

1 comment

Distributing Python + C extensions is easy with Conda.

https://conda-forge.github.io/