|
(2nd user & developer of spark here). It depends on what you ask. MapReduce the framework is proprietary to Google, and some pipelines are still running inside google. MapReduce as a concept is very much in use. Hadoop was inspired by MapReduce. Spark was originally built around the primitives of MapReduce, and you see still see that in the description of its operations (exchange, collect). However, spark and all the other modern frameworks realized that: - users did not care mapping and reducing, they wanted higher level primitives (filtering, joins, ...) - mapreduce was great for one-shot batch processing of data, but struggled to accomodate other very common use cases at scale (low latency, graph processing, streaming, distributed machine learning, ...). You can do it on top of mapreduce, but if you really start tuning for the specific case, you end up with something rather different. For example, kafka (scalable streaming engine) is inspired by the general principles of MR but the use cases and APIs are now quite different. |