by lenish 3896 days ago
Well, it doesn't matter so much how many hosts join the network. You still need to convince some members of the network to view your content at least once in order to distribute the data.

I suppose you could argue that it's justifiable for non-valuable content to vanish over time, but then it's not really a "Permanent Web."

Edit: Apparently I can't reply to your reply to this comment, but thanks for the link. I hadn't seen that.


I believe IPFS was partially intended to help the Internet Archive in that regard. They'll be the consumer of last resort for all objects, thereby bringing about the Permanent Web.

https://ipfs.io/ipfs/QmNhFJjGcMPqpuYfxL62VVB9528NXqDNMFXiqN5...

It'd be interesting to see a browser implement caching using something like IPFS. When a regular HTTP GET (or whatever, really) request is sent, the IPFS-enabled browser could look for a `Link` header with the `ipfs:` scheme and `rel="alternate"` in the response, and use that as an alternate place to look for the content. The `ETag` header could carry the hash, so the browser could tell on subsequent requests which hash it associates with the mutable URI. In the event of a 304 it'd look up the data in the IPFS cache, which may or may not actually be on disk. If not, it might still be a more efficient fetch than HTTP since there may be many peers; worst case scenario, the only peer is the HTTP server you made the request to in the first place.
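
To make the header idea concrete, here's a rough sketch of the client-side lookup in Python. Nothing here is an existing browser or IPFS API: `find_ipfs_alternate` and `hash_from_etag` are hypothetical helpers, `QmExampleHash` is a placeholder, and the `Link` parsing is deliberately simplified (it would trip on quoted parameter values containing commas).

```python
import re
from typing import Optional

# Matches one link-value: <uri> followed by its ";name=value" parameters.
# Simplified: does not handle commas or semicolons inside quoted values.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def find_ipfs_alternate(link_header: str) -> Optional[str]:
    """Return the first ipfs: URI marked rel="alternate", or None."""
    for uri, params in _LINK_RE.findall(link_header):
        if not uri.lower().startswith("ipfs:"):
            continue
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() != "rel":
                continue
            # rel may hold several space-separated relation types.
            rels = value.strip().strip('"').lower().split()
            if "alternate" in rels:
                return uri
    return None


def hash_from_etag(etag: str) -> str:
    """Strip the weak-validator prefix and quotes from an ETag value.

    Assumes the server chose to put the content hash in the ETag.
    """
    if etag.startswith("W/"):
        etag = etag[2:]
    return etag.strip('"')
```

A client would call `find_ipfs_alternate(response_headers["Link"])` on the first response and remember `hash_from_etag(...)` against the URL, so that a later 304 can be served from whatever holds that hash.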

I suppose `Content-Location` could be used as well, but I don't know how well browsers that don't understand the `ipfs:` scheme would react to that, although the spec doesn't preclude non-http schemes used in the URI.

It'd be an interesting experiment anyway, and could be a boon to adoption of a distributed system like IPFS.

Come to think of it, `Content-Location` is much more semantically appropriate than `Link <ipfs:hash>; rel="alternate"`; the latter is just a link, but the `Content-Location` header would tell you the canonical location of the requested content. For an IPFS enabled client, this would mean that if they want that specific content, they'd never even hit the HTTP server on subsequent requests, but dive straight into IPFS. That said, existing clients may get very confused by an unsupported scheme in that header. Presumably, that client should go `IDK lol` and go use the not-so-canonical URL instead, but I'd be surprised if they'd actually work like that.
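
The "go use the not-so-canonical URL instead" fallback could look something like this. It's only a sketch of how I'd hope a client behaves, not how any real browser handles `Content-Location`; `choose_fetch_target` and the scheme sets are made up for illustration, and relative `Content-Location` values are simply ignored here.

```python
from urllib.parse import urlsplit

# What a legacy client understands; an IPFS-aware client would add "ipfs".
LEGACY_SCHEMES = frozenset({"http", "https"})


def choose_fetch_target(request_url, content_location, supported=LEGACY_SCHEMES):
    """Pick where to fetch from on subsequent requests.

    Prefer the Content-Location URI when its scheme is one the client
    understands; otherwise fall back to the URL originally requested.
    """
    if content_location:
        scheme = urlsplit(content_location).scheme.lower()
        if scheme in supported:
            return content_location
    return request_url
```

With `supported={"http", "https", "ipfs"}` the client goes straight to `ipfs:...`; with the default set it quietly keeps using the HTTP URL.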
Can't you just make the initial request for your document yourself? (I know this works to seed Freenet; not as sure about IPFS.)
It caches things locally (in ~/.ipfs/blocks), so you'd have to request it from a secondary system to get it even on another node. However, my understanding is that if that second system left the network and you left the network the data would still be lost.

You need a third party to request the data and not leave the network to keep the data around.

The data will remain in the network provided that either the third party reliably stays in the network (e.g. the Internet Archive) or you can consistently get new third parties to request and cache it. The latter does not seem particularly reliable to me, however.
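
As a toy model of that argument (my own simplification, not how IPFS actually tracks providers): content stays retrievable only while at least one node that has cached it is still online. The node names below are invented.

```python
def is_retrievable(holders, online):
    """True if at least one node holding the content is currently online."""
    return bool(set(holders) & set(online))


# You add the content, then fetch it once from a second machine of yours.
holders = {"my-laptop", "my-second-box"}

# Both of your machines leave the network: nothing left to serve it.
lost = is_retrievable(holders, online={"some-reader"})

# A third party requests it and stays online: the content survives.
holders.add("archive-node")
kept = is_retrievable(holders, online={"archive-node", "some-reader"})
```

Here `lost` comes out false and `kept` true, which is the whole point: persistence depends on who keeps a copy and stays around, not on how many hosts have joined.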

I think it's more of an issue with "marketing" or how IPFS is (was?) presented: It's not a magic web-in-the-sky for all the things -- but it does make it really easy to a) host stuff redundantly, and b) scale out distribution. So you could edit a static web page on your laptop, have a "post commit hook" (or other automagic system) that pulls/pushes published posts to two or three permanent servers -- these could be backed up as normal, or you could just spin up some VMs and have them "restore" from some "master list" of your own content (hashes).

Now as long as at least one device is up (and has the content), you can bring backups on-line easily. And as long as at least one server is connected to IPFS other nodes can get the content, and in theory, any spike in popularity will get distributed and cached "suitably".

An added bonus is that if you publish something like a controversial, but popular, political blog post/exposé, and some government throws you in a hole that officially doesn't exist -- your readers, if they're on IPFS, will maintain an active backup of your content by virtue of reading it.

This is a lot more convenient than someone having to explicitly spider it etc. (although a combination would probably work/be a good idea -- e.g. an IPFS "dmoz.org" where authors could register content index-pointers for others to spider/download into their IPFS nodes -- and index for search).

I don't disagree on any particular points. When I first read about it and started playing with it I definitely felt like my expectations were set to something other than what IPFS actually provides.

That said, I think systems of this nature are worth pursuing and perhaps IPFS itself can be improved for more general purpose use cases. For my part, I think it'd be awesome to be able to write some HTML and CSS, make some images, `ipfs add ~/website` and then be able to link anyone to my content and have reasonable guarantees of its existence for the rest of my life. I can host my own websites, but it's not a particularly enjoyable experience.

> This is a lot more convenient than someone having to explicitly spider it etc. (although a combination would probably work/be a good idea -- e.g. an IPFS "dmoz.org" where authors could register content index-pointers for others to spider/download into their IPFS nodes -- and index for search).

IIRC it's possible to follow announcements of new hashes on the network and retrieve them automatically. I picked this up from #ipfs on FN, I believe, so I'm not 100% sure about it. Doing that would make an IPFS search engine fairly robust (and interesting to build, actually).

ipfs dev here! This is indeed possible: you will be able to listen for announcements (provider messages) of hashes that are near your node's peerID within the Kademlia metric space. To get a complete picture of all hashes on the network, you would need to ensure your nodes had reasonable coverage over a good portion of the keyspace (enough that a K-closest-peers call for any hash would return at least one of your nodes).
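
Roughly, the coverage condition can be sketched like this in Python. It's a toy: IDs are small integers rather than real peerIDs or content hashes, and `k_closest` / `would_see_announcement` are names made up for illustration, not anything from the IPFS codebase. The only real ingredient is the Kademlia metric itself, which is XOR of the two IDs compared as an integer.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: bitwise XOR of two IDs, compared as an integer."""
    return a ^ b


def k_closest(key: int, peers, k: int):
    """Return the k peers whose IDs are nearest to key in XOR distance."""
    return sorted(peers, key=lambda p: xor_distance(p, key))[:k]


def would_see_announcement(key: int, all_peers, my_nodes, k: int) -> bool:
    """True if at least one of my nodes is among the k closest peers to key.

    Assumes provider messages for key go to exactly those k peers.
    """
    mine = set(my_nodes)
    return any(p in mine for p in k_closest(key, all_peers, k))


# Toy network with 4-bit IDs.
peers = [0b0001, 0b0100, 0b1000, 0b1100]
key = 0b1001
nearest = k_closest(key, peers, k=2)
```

In this toy network `nearest` is `[0b1000, 0b1100]`, so a crawler running only node `0b0100` would miss the announcement for `key`, while one running `0b1100` would see it. Covering the whole keyspace means making that check pass for every possible key.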

I really want to build something like this, just haven't had the time to do so.

You don't need to do this with Freenet. When you insert data it is pushed to other nodes - a completed upload means other nodes have the data. You can turn off your node and the data is still available.