by x0x0
4163 days ago
I've used Hadoop at petabyte scale (2+ PB input; 10+ PB sorted for the job) for machine learning tasks. If you have such a thing on your resume, you will be inundated with employers who have "big data", and at least half will be under 50 GB, with a good chunk of those under 10 GB. You'll also see multiple (shitty) 16-machine clusters, any of which, for any task, could be destroyed by code running on a single decent server with SSDs. Let alone Hadoop jobs running in EMR, which is glacially slow (slow disk, slow network, slow everything). Also, Hadoop is so painfully slow to develop in that it's practically a full employment act for software engineers. I imagine it's similar to early EJB coding.
It's comical how bad Hadoop is compared even to the CM Lisp described in Daniel Hillis' PhD dissertation. How do you devolve all the way from that down to "It's like map/reduce. You get one map and one reduce!"