Hacker News
by chrisbaglieri 3014 days ago
A lot of these datasets have been available for some time: https://aws.amazon.com/blogs/aws/new-aws-public-data-sets-tc.... Perhaps what is most surprising is that the list hasn't grown a ton since then.
2 comments

As someone who has spent a considerable amount of time on data that has ended up on this page, I think the fact that the list hasn't grown says more about the priorities of other companies than of AWS. Amazon doesn't (yet) have time to build and maintain these datasets themselves: they work with others who build and maintain them, and then fund the storage and transmission fees.

I helped build the Terrain Tiles dataset as part of Mapzen, which recently shut down. The OpenStreetMap data exists on the AWS Public Datasets page because it's useful to the Humanitarian OpenStreetMap Team. If you're able to convince your company to generate and work with a public dataset, consider reaching out to the AWS and Google public datasets teams to get it hosted and publicized.

In my experience, the list that AWS keeps on their public page isn't completely up to date. There are two major datasets in the neuroimaging community that are hosted in S3 (see my comment elsewhere on this page), but AWS hasn't widely publicized this fact for some reason.