Hacker News new | ask | show | jobs
by saryant 4276 days ago
The article says exactly what they ran on. EC2 i2.8xlarge instances which have 32 cores, 800GB SSD and 244GB RAM.
3 comments

Also used the "Enhanced Networking" option on the instances which means single root I/O virtualization underneath.
I read that. But how does that compare with the nodes they're comparing against ("10x fewer nodes")?
The old entry had 10Gb/s <full-duplex> (40 nodes/rack 160Gbps rack to spine. 2.5:1 subscription), 64GB of RAM, and 12 x 3TB SATA.

The network part is probably the most important one here, and both have comparable network.

Since each node was handling 500GB of data (roughly), I think the disk speed may have been a more critical factor since each node had 244GB of memory. Their nodes used SSDs; the older nodes used spinning rust. The seek times alone will be a killer.
Not sure why you mentioned seek time. In large scale, distributed sorting, I/O is mostly sequential.
If only that were true -- the shuffle is typically seek-bound when the intermediate data doesn't fit into cache (plenty of papers show this pretty conclusively).
Hi Todd,

Except in the case of MR 2100 nodes the entire dataset fit in memory :)

3TB SATA would indicate spinning rust too, so slower storage. It's far from an apples to apples comparison.
Not just 800GB of SSD; 8x 800GB of SSD!
yes, that's the key. The local IO on previous reported systems was HDDs, thus 300-600Mb/s at best with 6 drives per machine, while this guys are getting 3.3Gb/s. Getting full performance - 1.1Gb/s - of 10Gb network - were they lucky or AWS now is that good ? (couple years ago i was able to get only 400Mb/s node-to-node on cluster compute nodes there)