This is Stonebraker's argument for shared-nothing architecture, and it applies well to interactive ad-hoc analytics on well-structured data.

Many orgs these days keep all their data in a shared-disk data lake and pull down only the subsets they need. The performance hit of pulling data over a high-bandwidth channel such as S3 to EC2 is much easier for companies to accept than the cost of storing everything on expensive compute instances just so that the "data would be there", ready for querying if somebody ever needs it.
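To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (data volume, per-TB prices, link bandwidth) is a made-up placeholder and not a real AWS price or measurement, and the names `Scenario`, `monthly_storage_cost` and `pull_seconds` are hypothetical. It only shows the shape of the comparison: a recurring storage bill on one side, a one-off transfer delay on the other.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    total_tb: float            # all data the org keeps around
    queried_tb: float          # subset actually pulled down for one analysis
    lake_price_per_tb: float   # placeholder: object-store price per TB-month
    local_price_per_tb: float  # placeholder: price per TB-month on compute instances
    link_gbit_per_s: float     # placeholder: effective lake -> compute bandwidth


def monthly_storage_cost(s: Scenario) -> tuple[float, float]:
    """Return (data-lake cost, keep-it-all-on-compute cost) per month."""
    lake = s.total_tb * s.lake_price_per_tb
    local = s.total_tb * s.local_price_per_tb
    return lake, local


def pull_seconds(s: Scenario) -> float:
    """Time to transfer the queried subset, ignoring latency and overhead."""
    bits = s.queried_tb * 8e12  # 1 TB = 1e12 bytes = 8e12 bits (decimal units)
    return bits / (s.link_gbit_per_s * 1e9)


if __name__ == "__main__":
    # Purely illustrative inputs; substitute your own measurements and prices.
    s = Scenario(total_tb=500, queried_tb=1,
                 lake_price_per_tb=20, local_price_per_tb=100,
                 link_gbit_per_s=10)
    lake, local = monthly_storage_cost(s)
    print(f"lake: {lake:.0f}/month, local: {local:.0f}/month")
    print(f"pulling {s.queried_tb} TB takes about {pull_seconds(s):.0f} s")
```

With these placeholder inputs the gap in the monthly bill is large while the pull is a matter of minutes, which is the comment's point. With different prices, or a query pattern that touches most of the data most of the time, the balance could tip the other way.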