Hacker News
by throwaway83242 1489 days ago
Thanks for making this open source. What about the index? Are you open sourcing that also?

If I can solve the logistics of publishing that data, then sure. In its most compressed form it's still on the order of 100 GB.

The intermediate goal is to have some standardized testing dataset of a couple of hundred megabytes to a gigabyte or so.

Like another commenter suggested, torrents might be a good solution once it's seeded.

Cool. Looking forward to seeing the intermediate dataset.

I think you should post a ToDo list on the git repo. People can then contribute their skills.

Yeah, that's a good idea. I'm looking at a bunch of ideas for reducing the friction of contributing; there's still a bit of work that needs doing in that area.