I think I'm talking about cluster-scale network bisection bandwidth vs attached storage bandwidth. With replication/erasure coding overhead and the economics, the order of magnitude difference still prevails.
I think your point is a good one in that it is more economics than systems physics. We size clusters to have more compute/network than storage because it is the design point that maximizes overall utility.
I think it also raises an interesting question in that let's say we get to a point where the disparity really no longer holds: that would justify a complete rethinking of many Spark-like applications that are designed to exploit this asymmetry.
And that's for one SSD. If you're running on a server rather than a laptop, aggregate storage bandwidth will almost certainly be higher than any single network link.
I think your point is a good one in that it is more economics than systems physics. We size clusters to have more compute/network than storage because it is the design point that maximizes overall utility.
I think it also raises an interesting question in that let's say we get to a point where the disparity really no longer holds: that would justify a complete rethinking of many Spark-like applications that are designed to exploit this asymmetry.