by eva1984 3670 days ago
I bet the author didn't account for the time it takes to download the data to a single box. Sometimes scalability is not a choice.
1 comments

I really want a content-addressed storage system with differential compression to solve this problem.

I was browsing through dat for a while, but haven't kept up with it lately:

https://github.com/maxogden/dat

Basically, disk is so cheap that you should just keep 2 or 3 copies of your data around. Then you can sync them really quickly and do the processing on any one of N machines.
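
To make the idea concrete, here is a rough sketch of what I mean by content-addressed storage with cheap syncing. It is a toy, not how dat works: the names (ChunkStore, sync, CHUNK_SIZE) are made up for illustration, the store lives in memory, and it uses fixed-size chunks with deduplication as a stand-in for real differential compression.

```python
import hashlib

# Hypothetical chunk size, kept tiny so the behavior is easy to see.
# A real system would use far larger (and probably content-defined) chunks.
CHUNK_SIZE = 4


class ChunkStore:
    """Toy in-memory content-addressed store: chunks are keyed by SHA-256."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def put(self, data):
        """Split data into fixed-size chunks, store each unique chunk once,
        and return the manifest (ordered list of chunk digests)."""
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def get(self, manifest):
        """Reassemble the original bytes from a manifest."""
        return b"".join(self.chunks[d] for d in manifest)


def sync(src, dst, manifest):
    """Copy only the chunks that dst is missing; return how many were sent."""
    missing = set(manifest).difference(dst.chunks)
    for digest in missing:
        dst.chunks[digest] = src.chunks[digest]
    return len(missing)
```

If you put a 12-byte dataset into one store and sync it to an empty one, all three chunks get transferred. Put a second version that differs only in the last 4 bytes and sync again, and only one chunk moves. That is why keeping 2 or 3 copies in sync can be fast: the cost scales with what changed, not with the size of the dataset.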