

by coffeemug 4278 days ago

Not the author of the project but I can think of two reasons.

Firstly, you can think of map/reduce as the infrastructure for higher-level operations: a sort of assembly language for large-scale data processing that higher-level systems compile to. A breakthrough in the quality of the operational engine significantly improves the experience of doing higher-level work, so if someone finds a better way to run map/reduce jobs, it's a win for everyone. Shipping jars instead of Docker containers and not having snapshots are serious drawbacks of the existing map/reduce infrastructure, and they hurt users.

Secondly, specifying map/reduce jobs via a simple web server that exposes API endpoints for data grouping, mapping, and reduction is a dramatically simpler, more composable way to expose them. Building higher-level infrastructure on top of this abstraction is an order of magnitude easier than doing it on top of Hadoop, so it could be a better underlying platform for the generalization work being done in the community.
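
To make the second point concrete, here is a rough sketch of what I mean, not the project's actual API. The handler names (`map_handler`, `reduce_handler`) and the toy `run_job` engine are made up for illustration; in a real deployment each handler would presumably sit behind an HTTP endpoint inside the job's container, and the engine would call it over the network instead of in-process.

```python
from collections import defaultdict

# Hypothetical handlers: stand-ins for the endpoints a job's web server
# would expose. Names and signatures are illustrative assumptions only.
def map_handler(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_handler(key, values):
    """Reduce step: combine all values that were emitted for one key."""
    return key, sum(values)

def run_job(lines, mapper, reducer):
    """Toy engine: map each record, group the pairs by key, reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())

# Word count over two input records.
counts = run_job(["a b a", "b c"], map_handler, reduce_handler)
```

The engine only needs to know how to call the two handlers, so swapping in a different job means swapping two functions (or two endpoints), which is where the composability comes from.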

1 comment

jdoliner 4278 days ago

This says what I was trying to say a lot better than I did. +1 coffeemug.
