by nchammas 4276 days ago
> I would love to see MR and Spark compete on the exact same hardware configuration.

You may find this benchmark [1] interesting to read.

It needs some updating (a lot has changed since February 2014), but it compares Shark (which uses Spark as its execution engine) to Hive (using Hadoop 1 MapReduce as its execution engine) and a number of other systems.

The benchmark runs on EC2 and is documented in enough detail that it should be independently reproducible. Hive and Shark run on identically sized clusters, though I don't know whether the rest of the configuration was identical.

[1] https://amplab.cs.berkeley.edu/benchmark/