by srean
5152 days ago
Hahaha, I hear you. "Me too"s are frowned upon here, but I could not resist. There are so many things I want to learn that I wish sleep were optional. I know Hadoop is the poster child of all things good, but its API really makes me fall asleep. Not a big fan. Add to that the fact that Google's implementation is (or at least used to be) 4 to 6 times faster for similar-sized processing jobs and more fun to code in (the latter may be entirely subjective). The funny part is that Google's clusters back then were made of weaker machines! I don't know how it is now.

EDIT: It is indeed in C++, but that alone cannot fully explain the discrepancy. Java can be slower, but it shouldn't be that much slower. I am sure caching, memory footprint and, as you said, data access latencies and overall design play a big role. If I remember correctly, UIUC has an open source MapReduce framework written in C++, and they claim a similar speedup over Hadoop.