by _huayra_
1771 days ago
The scale factor was really overblown and presented without caveats. If you have Google-scale computations where it is not unlikely that at least 1 of the 10k machines serving a request will bite the dust during the query, then of course MapReduce makes sense. However, in most other cases there are now far better alternatives (although tbh I'm not sure how many were around when MapReduce was introduced). The main limitation of MapReduce is the barriers it imposes: one at the shuffle stage, and another after the end of the reduce when chaining together multiple MapReduce operations. Dataflow frameworks remove these barriers to various degrees, which often lowers latency and can improve resource utilization.
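
To make the barrier point concrete, here is a toy single-process sketch in Python. It is not how any real framework is implemented, and every name in it (`shuffle`, `reduce_counts`, `downstream`, `run`) is made up for illustration. The shuffle has to see every mapper output before it can group by key, so that barrier stays. The barrier after the reduce is the optional one: with it, the whole reduce output is materialized before the next stage starts; without it, the next stage consumes each reduced record as soon as it is produced.

```python
from collections import defaultdict


def shuffle(pairs):
    """Group values by key. Needs every mapper output, so it is a real barrier."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups, log):
    """Lazily yield (word, count), logging each key as it is reduced."""
    for word, ones in groups.items():
        log.append(("reduce", word))
        yield word, sum(ones)


def downstream(word_counts, log):
    """Hypothetical next stage: keep only words seen more than once."""
    for word, n in word_counts:
        log.append(("next", word))
        if n > 1:
            yield word


def run(lines, barrier_after_reduce):
    log = []
    # Map phase: emit (word, 1) for every word.
    pairs = ((word, 1) for line in lines for word in line.split())
    reduced = reduce_counts(shuffle(pairs), log)
    if barrier_after_reduce:
        # MapReduce-style chaining: the full reduce output is materialized
        # before the next job is allowed to start.
        reduced = list(reduced)
    result = list(downstream(reduced, log))
    return result, log


lines = ["a b a", "b c"]
barrier_result, barrier_log = run(lines, barrier_after_reduce=True)
stream_result, stream_log = run(lines, barrier_after_reduce=False)
```

Both runs give the same answer; only the order of events in the log differs. With the barrier, every `reduce` event comes before the first `next` event. Without it, the two interleave, which is roughly where the latency win comes from when stages are chained.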