by martinkallstrom
4735 days ago
My former startup Twingly (http://twingly.com) has hundreds of millions of blog posts stored (everything collected since 2006) in 128 MySQL shards with a unified query interface. The last few months of data are indexed and searchable for free from their website, but the entire archive is kept forever.

However, ArchiveTeam has uploaded all the data they found (at least 46.23M feeds) to the Internet Archive. That means it's public for anyone to mine or otherwise use.
I'm not trying to belittle Twingly here, but their "last few months of data" is not really comparable to completely free, public data that is kept forever.