by htrap 2028 days ago
I feel that there is a real need for permanent storage for open source.

Code breaks when old packages are unpublished or repositories are deleted. A "push once, fetch forever" model solves this.

Also, centralized solutions can provide open source collaboration tools and storage for free because of their revenue from enterprise customers.

What happens when they decide to shut down, change their policies, or just comply with wrongful takedown notices?

3 comments

Would IPFS or DAT not suffice there? How is "another permanent storage" that is one piece of a larger project better than one that has storage as its sole purpose and focus?
The challenge with IPFS and DAT is that you have no guarantees about data reliability. The DHT style of p2p sharing pretty much only works for popular content. Incentivized storage networks can onboard any type of data and guarantee high uptime.

It's also been my experience that IPFS has significant performance issues. If you use a professional gateway like ipfs.io or Cloudflare, it runs at good speeds, but as soon as you switch to being fully peer-to-peer it's almost unusable.

I don't have much experience with DAT, it may not have the same performance issues.

disclaimer: I work on an incentivized storage network called Skynet

Those are just some of the problems that IPFS, Dat (or Storj, and probably Skynet too) face or will face.

That doesn't make "rolling your own" any more reliable or performant. So the question becomes even more apt: why would building your own decentralised storage as a requirement of a much larger project work, if even the focused, dedicated decentralised storage projects cannot solve some of these problems?

With git, it's rare for a project that's actually in use to be completely memory-holed, since every contributor effectively has a local copy of the resource.

Using git (generally GitHub) repositories for dependency management is, IMO, a hack, so it's not surprising that it often breaks. I like the way buildroot handles it (I'm sure they're not the only ones, but that's the one project I'm most familiar with):

- The buildroot buildbot fetches third-party package dependencies and archives them.

- When you build your buildroot image locally, it attempts to fetch from the third party directly. If the file doesn't exist anymore, it falls back onto the buildroot cache instead.

You could also easily add your own caching layer in there if you wanted to. I think that's distributed computing at its best: simple and robust, with a clear and easily understandable architecture. No blockchain-based proof-of-stake distributed storage, just a series of wget calls. And of course, since everything is authenticated with a strong hash, it's always perfectly safe.
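
For anyone who wants to see the shape of that idea, here is a rough sketch of "try upstream, fall back to a cache, verify the hash". It is not buildroot's actual code: the names `fetch_verified` and `http_get` are made up for illustration, and SHA-256 is my assumption for the "strong hash".

```python
import hashlib
import urllib.request
from typing import Callable, Iterable


def http_get(url: str) -> bytes:
    """Default fetcher: download a URL and return its body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_verified(
    sources: Iterable[str],
    expected_sha256: str,
    fetch: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try each source in order; return the first body whose SHA-256 matches.

    `sources` is ordered by preference, e.g. upstream first, then caches.
    """
    errors = []
    for url in sources:
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError/HTTPError subclass OSError
            errors.append(f"{url}: {exc}")
            continue
        digest = hashlib.sha256(data).hexdigest()
        if digest == expected_sha256:
            return data
        # A wrong hash is treated like a missing file: move on to the next source.
        errors.append(f"{url}: hash mismatch ({digest})")
    raise RuntimeError("no source produced a matching file:\n" + "\n".join(errors))
```

You would call it with the upstream URL first and one or more cache URLs after it, plus the hash recorded when the package was first added. Because the hash is checked on every path, it doesn't matter who runs the cache, and adding your own caching layer is just one more entry in the list.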

Buy hard drives, do frequent backups, store redundant backups remotely, use RAID. At the end of the day, someone must pay for the hard drives, and they could just stop paying for them one day. There is no such thing as permanent and free.