| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by amelius 1778 days ago
	It was more about scaling than about relational algebra though.

2 comments

_huayra_ 1778 days ago

The scale factor was really overblown and without caveats. If you have Google-scale computations where it is not unlikely that at least 1 / 10k machines serving that request will bite the dust during the query, then of course MapReduce makes sense.

However, in most other cases there are now far better alternatives (although tbh I'm not sure how many were around when MapReduce was introduced).

The main limitation around mapreduce is the barries imposed by the shuffle stage and after the end of the reduce if chaining together multiple mapreduce operations. Dataflow frameworks remove these barriers to various degrees, which often lowers latency and can improve resource utilization.

link

prasadjoglekar 1778 days ago

Exactly this. I remember in 2007 being able to process TBs of data on commodity hardware with Hadoop. You got decent throughput, decent fault tolerance out of the box wrapped in Java that many average software developers (yours truly included) were comfortable with. You could scale data and people.

It dramatically reduced the cost of entry for many ad-tech applications.

link

CRConrad 1778 days ago

> It dramatically reduced the cost of entry for many ad-tech applications.

You say that as if it was unequivocally a point in its favour.

link