by josefcullhed 1481 days ago
Great job with Marginalia. Do you plan to open source your data as well or only the code?

If I can solve the logistics, maybe. I don't have the needed bandwidth and off-site storage at this point.
What sort of sizes are we talking about? I'm wondering whether it would be possible to "crowdfund" the storage costs for a requester-pays S3 bucket for it.
I'm probably producing around 250-500 GB of data per month at this point.
What's the cumulative size for the index to date? I'm not rich by any measure, but if it's within reach I'd probably fund the storage costs.
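For a rough feel of what funding this would mean, here is a back-of-the-envelope sketch. It reads the figure above as gigabytes and assumes data accrues at a constant rate with nothing deleted. The price of 0.023 USD per GB-month is purely an assumption for illustration, not a number from this thread, and the function name is made up.

```python
def monthly_storage_bill(gb_per_month, months, usd_per_gb_month):
    """Storage bill in a given month, assuming constant growth and no deletions."""
    stored_gb = gb_per_month * months  # total data accumulated so far
    return stored_gb * usd_per_gb_month


# Assumed price per GB-month (hypothetical; check the provider's current pricing).
ASSUMED_USD_PER_GB_MONTH = 0.023

# Low and high ends of the 250-500 per month estimate, after one year.
for rate in (250, 500):
    stored = rate * 12
    bill = monthly_storage_bill(rate, 12, ASSUMED_USD_PER_GB_MONTH)
    print(f"{rate} GB/month: {stored} GB stored after 12 months, about ${bill:.2f}/month")
```

Because the stored total keeps growing, the bill grows with it every month, and download bandwidth is not covered here. That is the part a requester-pays bucket would shift to whoever downloads.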
Reach out to Jason Scott -- textfiles.com -- and see if he knows anyone who would be interested.

He might know some folks.

Any idea how compressible that is?

If it's something that compresses really well (e.g. text data in a database), then filesystems with live compression (e.g. ZFS, likely others) could potentially help make that workable.

The data is either already compressed or dense binary soup, so no luck.
Idea: Host datasets as one or more torrents. Thoughts?