by kevinemoore 2455 days ago
I just want to give a plug for sharing data in the public cloud and S3 in particular. Jed Sundwall (AWS Global Open Data Lead) sums it up really well: "The cloud completely changes the dynamic for sharing data. When data is shared in the cloud, researchers no longer have to worry about downloading or copying data before getting to work. Instead, they can deploy compute resources on-demand in the cloud, where a single copy of the data is made available. It is much more efficient to move algorithms to where the data is, than to move the data to where the algorithms are, and this makes it cheaper for researchers to ask more questions and experiment often." See the full whitepaper here: https://s3-us-west-2.amazonaws.com/opendata.aws/AWS_Sharing_...

> "It is much more efficient to move algorithms to where the data is, than to move the data to where the algorithms are"

I love this quote, thanks. I do try to do things in the cloud as much as possible, but oftentimes it's more practical for TCO (total cost of ownership) reasons to do things locally.

This quote makes me wonder if in the future we'll see some sort of external SSD with a Raspberry Pi-like portable GPU hooked up. Some sort of dedicated storage+compute USB hybrid.

What I like about our schema/anonymization solution is that you can put fake data and real code online, people can make changes to the real code in the cloud, and you can then run those changes reliably on the real data locally.

There's no doubt that local processing is a lot cheaper than the cloud for a lot of workloads.

That's a very interesting pattern: publishing "fake" (perhaps safe or anonymized) data online along with code to spur research and development, then running the enhanced code on private data (e.g., PII) using local compute resources.

We hope Quilt packages can play a role to make that easier. The package serves as an interface and layer of abstraction between the code and the data so the same code can be run against the safe or private data.
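
To make the idea concrete, here is a minimal Python sketch of that kind of abstraction layer. It is not Quilt's actual API: the names `SOURCES`, `load_rows`, `mean_age`, and `run`, and the inline CSV data, are all hypothetical and only illustrate analysis code that asks for a dataset by name and never knows whether it received the safe or the private copy.

```python
import csv
import io

# Hypothetical stand-ins: a public "fake" dataset and a private one
# that share the same schema (user_id, age).
FAKE_CSV = "user_id,age\nu1,30\nu2,40\n"
PRIVATE_CSV = "user_id,age\nalice,20\nbob,30\ncarol,70\n"


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


# The abstraction layer: a dataset name maps to a loader. In a real setup
# the "private" loader would read from local storage instead of a string.
SOURCES = {
    "safe": lambda: load_rows(FAKE_CSV),
    "private": lambda: load_rows(PRIVATE_CSV),
}


def mean_age(rows):
    """The shared analysis code. It depends only on the schema."""
    ages = [int(row["age"]) for row in rows]
    return sum(ages) / len(ages)


def run(source_name):
    """Run the same analysis against whichever dataset is selected."""
    return mean_age(SOURCES[source_name]())
```

Contributors could develop and test `mean_age` against `run("safe")` online, and the data owner could later call `run("private")` locally with no code changes, as long as both datasets keep the same schema.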