| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by discardorama 4276 days ago

It's interesting, but not earth-shattering. The "10x fewer nodes" means nothing; how powerful are the new nodes? What's the network? Do you use SSDs? etc. etc.

They also tuned their code to this specific problem:

"Exploiting Cache Locality: In the sort benchmark, each record is 100 bytes, where the sort key is the first 10 bytes. As we were profiling our sort program, we noticed the cache miss rate was high, because each comparison required an object pointer lookup that was random..... Combining TimSort with our new layout to exploit cache locality, the CPU time for sorting was reduced by a factor of 5."

I would love to see MR and Spark compete on the exact same hardware configuration.

2 comments

saryant 4276 days ago

The article says exactly what they ran on. EC2 i2.8xlarge instances which have 32 cores, 800GB SSD and 244GB RAM.

link

grier 4276 days ago

Also used the "Enhanced Networking" option on the instances which means single root I/O virtualization underneath.

link

discardorama 4276 days ago

I read that. But how does that compare with the nodes they're comparing against ("10x fewer nodes")?

link

rxin 4276 days ago

The old entry had 10Gb/s <full-duplex> (40 nodes/rack 160Gbps rack to spine. 2.5:1 subscription), 64GB of RAM, and 12 x 3TB SATA.

The network part is probably the most important one here, and both have comparable network.

link

discardorama 4276 days ago

Since each node was handling 500GB of data (roughly), I think the disk speed may have been a more critical factor since each node had 244GB of memory. Their nodes used SSDs; the older nodes used spinning rust. The seek times alone will be a killer.

link

rxin 4276 days ago

Not sure why you mentioned seek time. In large scale, distributed sorting, I/O is mostly sequential.

link

tlipcon 4276 days ago

If only that were true -- the shuffle is typically seek-bound when the intermediate data doesn't fit into cache (plenty of papers show this pretty conclusively).

link

Twirrim 4276 days ago

3TB SATA would indicate spinning rust too, so slower storage. It's far from an apples to apples comparison.

link

discardorama 4276 days ago

Not just 800GB of SSD; 8x 800GB of SSD!

link

trhway 4276 days ago

yes, that's the key. The local IO on previous reported systems was HDDs, thus 300-600Mb/s at best with 6 drives per machine, while this guys are getting 3.3Gb/s. Getting full performance - 1.1Gb/s - of 10Gb network - were they lucky or AWS now is that good ? (couple years ago i was able to get only 400Mb/s node-to-node on cluster compute nodes there)

link

nchammas 4276 days ago

> I would love to see MR and Spark compete on the exact same hardware configuration.

You may find this benchmark [1] interesting to read.

It needs some updating (a lot has changed since February 2014), but it compares Shark (which uses Spark as its execution engine) to Hive (using Hadoop 1 MapReduce as its execution engine) and a number of other systems.

The benchmark is run on EC2 and is detailed in such a way that it should be independently verifiable. Hive and Shark are run on identically sized clusters, though I don't know if the other details of the configuration were identical.

[1] https://amplab.cs.berkeley.edu/benchmark/

link