| HN Mirror


	by dikei 876 days ago
	MapReduce was basically a very verbose, imperative way to perform scalable, larger-than-memory aggregate-by-key operations. It was necessary as a first step, but as soon as we had better abstractions, everyone stopped using it directly, except for legacy maintenance of course.
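
To make "aggregate-by-key" concrete, here is a minimal single-process sketch of the three phases. The function names and the sample records are made up for illustration; a real MapReduce job spreads each phase across many machines and spills to disk, which this toy version does not attempt.

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit one (key, value) pair per input record.
    for customer, amount in records:
        yield customer, amount

def shuffle_phase(pairs):
    # Shuffle: bring all values that share a key together.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: collapse each key's values into a single aggregate.
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical input: (customer, purchase amount) records.
records = [("alice", 3), ("bob", 5), ("alice", 4)]
totals = reduce_phase(shuffle_phase(map_phase(records)))
print(totals)
```

Even for a plain sum per key, you end up writing three explicit steps by hand, which is the verbosity being described.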

2 comments

lupire 876 days ago

The abstraction came first. MapReduce was quickly used as a basis for larger-than-machine SQL (Google Dremel and Hadoop Pig). MapReduce was separately useful when the processing pieces required a lot of custom code that doesn't fit well into SQL (for example, because you have hierarchical records rather than purely relational ones).

DeathArrow 876 days ago

Can you point, please, to the better abstractions?

willvarfar 876 days ago

SQL comes to mind.

Every time you run an SQL query on BigQuery, for example, you are executing those same fundamental map and shuffle primitives on the underlying data; it's just that the interface is very different.
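
As a rough illustration of the interface difference, the sketch below runs a `GROUP BY` and a hand-written group-then-sum over the same rows and gets the same answer. It uses Python's built-in `sqlite3` purely as a stand-in: SQLite is a single-node engine and says nothing about how BigQuery actually plans or executes a query. The table and column names are invented for the example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical input: (customer, purchase amount) rows.
rows = [("alice", 3), ("bob", 5), ("alice", 4)]

# Declarative interface: say what you want, the engine decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
sql_totals = dict(
    conn.execute("SELECT customer, SUM(amount) FROM purchases GROUP BY customer")
)
conn.close()

# Imperative interface: spell out the grouping and the reduction yourself.
grouped = defaultdict(list)
for customer, amount in rows:      # map each row to a key, group by key
    grouped[customer].append(amount)
manual_totals = {c: sum(v) for c, v in grouped.items()}  # reduce per key

print(sql_totals == manual_totals)
```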