The scale factor was really overblown and presented without caveats. If you have Google-scale computations where it's quite likely that at least one out of the 10k machines serving a request will bite the dust during the query, then of course MapReduce makes sense.
However, in most other cases there are now far better alternatives (although tbh I'm not sure how many were around when MapReduce was introduced).
The main limitation of MapReduce is the barriers it imposes: one at the shuffle stage, and another after the end of the reduce when chaining multiple MapReduce operations together. Dataflow frameworks remove these barriers to various degrees, which often lowers latency and can improve resource utilization.
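To make the barrier point concrete, here is a toy single-process sketch in Python. It is not the API of Hadoop or of any dataflow framework; `staged`, `pipelined`, and the stage names are made up for illustration. It only shows the ordering difference between "finish the whole stage, then start the next one" and "push each record through as soon as it is ready".

```python
def staged(records, stages, log):
    """Barrier style: every stage sees all records before the next stage starts."""
    for name, fn in stages:
        out = []
        for rec in records:
            log.append((name, rec))
            out.append(fn(rec))
        records = out  # fully materialized before the next stage begins
    return records


def pipelined(records, stages, log):
    """Streaming style: each record flows through all stages before the next record."""
    def run(name, fn, upstream):
        for rec in upstream:
            log.append((name, rec))
            yield fn(rec)

    stream = iter(records)
    for name, fn in stages:
        stream = run(name, fn, stream)  # lazily chain generators
    return list(stream)


# Two hypothetical per-record stages.
stages = [("parse", int), ("double", lambda x: x * 2)]

staged_log, pipelined_log = [], []
staged_result = staged(["1", "2"], stages, staged_log)
pipelined_result = pipelined(["1", "2"], stages, pipelined_log)

print(staged_log)     # all "parse" entries come before any "double" entry
print(pipelined_log)  # "parse" and "double" entries alternate per record
```

Both versions return the same result; only the schedule differs. A real grouping step (the shuffle) still has to see every value for a key before that key can be reduced, so this sketch covers only the per-record stages where the barrier is avoidable, which is roughly what "to various degrees" means above.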
Exactly this. I remember in 2007 being able to process TBs of data on commodity hardware with Hadoop. You got decent throughput and decent fault tolerance out of the box, wrapped in Java that many average software developers (yours truly included) were comfortable with. You could scale data and people.
It dramatically reduced the cost of entry for many ad-tech applications.