Hacker News
Yahoo Webscope: a reference library of interesting and useful datasets (webscope.sandbox.yahoo.com)
76 points by skreuzer 3769 days ago
4 comments

Archive Team will have their hands full if and when Yahoo finally shuts down. Yahoo's footprint on the web is still enormous, and quite a bit of that stuff is valuable.
I didn't look at all of them, but the size of these sets seems pretty trivial, like you could put them all in S3 for a few dollars a month.
Thanks for pointing them out. Here's a link to the page where they track endangered datasets: http://archiveteam.org/index.php?title=Alive..._OR_ARE_THEY
Thanks for the link. It was a fun read.

Google wants you to think they will be here forever

I'd like to contribute to the dissemination of this stuff by turning it into nice public APIs.

Which ones should I start with? Where should I look in the docs to get started?

These data are for academics only.
I applied a while ago (Jan 14) for their news feed dataset and have not heard back. Does anyone here know the average turnaround time? Also, does it make a difference if you are a professor/PhD student?

Thank you