Hacker News
by _dark_matter_ 3710 days ago
https://hadoop.apache.org/docs/r1.2.1/streaming.html
1 comment

This is likely the best answer for those who want to write map/reduce jobs by hand and would prefer to use Python.
BUT WHY

Your performance is going to be complete and utter crap: Hadoop Streaming pipes every record to your script as a line of text over stdin/stdout, so you're paying for serialization on every single data element.
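To make the per-record cost concrete, here's a rough sketch of what a streaming-style word count looks like in Python. The function names (`mapper`, `reducer`, `run_local`) are made up for illustration, and `run_local` only simulates the map, sort, reduce pipeline in-process with string buffers; it is not how Hadoop actually schedules anything. The thing to notice is that every record gets parsed out of text on the way in and formatted back into text on the way out.

```python
import io
from itertools import groupby


def mapper(stdin, stdout):
    # Every input record arrives as a line of text that has to be split.
    for line in stdin:
        for word in line.split():
            # Every output record is serialized back to "key<TAB>value\n".
            stdout.write(f"{word}\t1\n")


def reducer(stdin, stdout):
    # Assumes the input is sorted by key, as it is between map and reduce.
    pairs = (line.rstrip("\n").split("\t", 1) for line in stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        # Counts come in as strings and must be parsed again before summing.
        total = sum(int(count) for _, count in group)
        stdout.write(f"{word}\t{total}\n")


def run_local(text):
    """Hypothetical helper: simulate map -> sort -> reduce with string buffers."""
    mapped = io.StringIO()
    mapper(io.StringIO(text), mapped)
    shuffled = "".join(sorted(mapped.getvalue().splitlines(keepends=True)))
    reduced = io.StringIO()
    reducer(io.StringIO(shuffled), reduced)
    return reduced.getvalue()
```

In a real streaming job the two functions would live in separate scripts that call `mapper(sys.stdin, sys.stdout)` and `reducer(sys.stdin, sys.stdout)`. Either way, the string formatting and `int()` parsing happen once per element, which is where the overhead comes from.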

Dask is higher performance and more pythonic: http://matthewrocklin.com/blog/work/2016/02/22/dask-distribu...