|
It's probably worth extending "Your data fits in RAM" to "Your data doesn't fit in RAM, but it does fit on an SSD". So many problems will still work with quite reasonable performance when using an SSD instead. By using a single machine with an array of SSDs, you also avoid the complexity and overhead of distributed systems. My favourite realization of this: Frank McSherry shows how simplicity and a few optimisations can win out on graph analysis in his COST work. In his first post[1], he shows how to beat a cluster of machines with a laptop. In his second post[2], he applies even more optimizations, both space and speed, to process the largest publicly available graph dataset - terabyte sized with over a hundred billion edges - all on his laptop's SSD. [1]: http://www.frankmcsherry.org/graph/scalability/cost/2015/01/... [2]: http://www.frankmcsherry.org/graph/scalability/cost/2015/02/... |