by 1337shadow 2170 days ago
They've just archived the HEAD of the 6000 most popular repos

> We’ve archived 6,000 of the world’s most popular repositories as a proof of concept for future archives.

> The snapshot will consist of the HEAD of the default branch of each repository, minus any binaries larger than 100KB in size.

1 comments

Archive Program director here - the 6,000 repos were on the single proof-of-concept reel we archived last autumn. The full archive consists of millions of repos, including all repos with at least one star with any commits in the year leading up to 02/02/2020.
Hello Jon, the info page mentions that binaries larger than 100KB are not archived. What about images of 500KB? I am really curious what these archived tar.xz files look like. It would have been nice if the project site included an example of what retrieved data will look like. A lot of readme.md files have illustrations. Either way, it's a cool project and I like what your team did.
Where can we find the full list of the 6,000 archived repos?