|
|
|
|
|
by kevinemoore
2455 days ago
|
|
I just want to give a plug for sharing data in the public cloud and S3 in particular. Jed Sundwall (AWS Global Open Data Lead) sums it up really well: "The cloud completely changes the dynamic for sharing data. When data is shared in the cloud, researchers no longer have to worry about downloading or copying data before getting to work. Instead, they can deploy compute resources on-demand in the cloud, where a single copy of the data is made available. It is much more efficient to move algorithms to where the data is, than to move the data to where the algorithms are, and this makes it cheaper for researchers to ask more questions and experiment often." See the full whitepaper here: https://s3-us-west-2.amazonaws.com/opendata.aws/AWS_Sharing_... |
|
I love this quote, thanks. I do try to do things in the cloud as much as possible, but often times it's more practical for TCO reasons to do things locally.
This quote makes me wonder if in the future we'll see some sort of external SSDs with a RasberyPi-like portable GPU hooked up. Some sort of dedicated Storage+Computer USB hybrid.
What I like about our schema/anonymization solution, is you can put fake data and real code online, and then people can make changes to the real code on the cloud, and you can run those reliably on data locally.