Hacker News
by btrettel 2296 days ago
What are the available solutions to the problem of locating other copies of webpages or documents online? (Let's assume that the page of interest is not on the Internet Archive.) This article mentions a few:

> I was able to find these pages through Google, which has functionally made page titles the URN of today.

> Given the power of search engines, it’s possible the best URN format today would be a simple way for files to point to their former URLs.

Daniel Bernstein proposes a document ID that can be found in search engines: https://cr.yp.to/bib/documentid.html

I actually started using this before, but found it to be clumsy and stopped using it.

Someone else has suggested a UUID instead: https://lobste.rs/s/xltmol/this_page_is_designed_last#c_nis6...

But that's still clumsy. I'd prefer something shorter.
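
For a sense of the length trade-off, here is a minimal sketch comparing a standard UUID with shorter random tokens. The token sizes are my own assumptions, not anything proposed in the linked pages; shorter IDs are easier to type and search for, but they carry fewer random bits and so collide more easily.

```python
import secrets
import uuid

# A version-4 UUID: 122 random bits, 36 characters with hyphens.
page_uuid = str(uuid.uuid4())

# Hypothetical shorter alternatives (sizes chosen only for illustration).
hex_id = secrets.token_hex(8)      # 64 random bits, 16 hex characters
short_id = secrets.token_urlsafe(9)  # 72 random bits, 12 URL-safe characters

for label, value in [("uuid4", page_uuid), ("hex", hex_id), ("urlsafe", short_id)]:
    print(f"{label:8} {len(value):2} chars  {value}")
```

Whichever form is used, it only helps if the ID is unusual enough that a search engine query for it returns the page and little else.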

Perhaps the title of the page is the best option, as people are more likely to have that saved than the UUID: https://lobste.rs/s/xltmol/this_page_is_designed_last#c_0snr...

> I imagine it’d be uncommon for someone has the UUID but not the website saved.


If a page is available on IPFS, then the page would be accessible under a URL based on the page's hash, and anyone with a cached/saved copy of the page would be able to help host it at that URL. (IPFS works very similarly to torrents with magnet links.)
Interesting. How would IPFS handle updates to a page? I assume that the updated page would have to be hosted separately.
Yeah, that generally applies to content-addressable systems like it (and torrents). An updated version of the page would have a new hash and a separate IPFS link.
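
A rough sketch of why that happens, using a plain SHA-256 hex digest as a stand-in for the address. This is only an illustration: real IPFS links are CIDs, which chunk the content and encode the hash differently, and the `content_address` helper and sample pages here are made up.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in for an IPFS link."""
    return hashlib.sha256(data).hexdigest()


original = b"<html><body>Hello</body></html>"
updated = b"<html><body>Hello, world</body></html>"

addr_v1 = content_address(original)
addr_v2 = content_address(updated)

# The same bytes always map to the same address, so anyone holding a saved
# copy can re-host it under the original link.
assert content_address(original) == addr_v1

# Any edit produces a different address, so the update needs a new link.
assert addr_v1 != addr_v2

print(addr_v1)
print(addr_v2)
```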

To make a URL for content that can change, you would need to use a system that lets you create a mutable URL, and then you can make that point to an IPFS link. DNS works well for that. You can add a "_dnslink" TXT DNS record to a domain, and then when anyone with an IPFS-supporting browser (or browser add-on) accesses the domain, their browser can fetch the content over IPFS from anyone seeding it, and help seed it if the user wants. (Yes, this wouldn't work well at all for domains with dynamic content. Works great for domains that have static content, including sites made by static site generators, etc.)
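
As a sketch of what a client does with that record: as far as I understand the DNSLink convention, the TXT record lives at `_dnslink.<domain>` and its value looks like `dnslink=/ipfs/<cid>`. The `parse_dnslink` helper below is hypothetical, the CID is a placeholder, and the actual DNS lookup is left out so the example runs offline.

```python
from typing import Iterable, Optional


def parse_dnslink(txt_values: Iterable[str]) -> Optional[str]:
    """Return the first content path found in a set of TXT record values."""
    prefix = "dnslink="
    for value in txt_values:
        # TXT values are often shown quoted by DNS tools; strip that first.
        value = value.strip().strip('"')
        if value.startswith(prefix):
            path = value[len(prefix):]
            # Only accept paths that point into IPFS or IPNS.
            if path.startswith(("/ipfs/", "/ipns/")):
                return path
    return None


# Example TXT values as they might come back for _dnslink.example.com.
# "<cid-of-current-site>" is a placeholder, not a real CID.
records = ['"v=spf1 -all"', '"dnslink=/ipfs/<cid-of-current-site>"']
print(parse_dnslink(records))
```

Updating the site then means publishing the new files, getting their new link, and changing only this one TXT value; the domain that people bookmark stays the same.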

I personally serve my blog with IPFS by making its files accessible over IPFS, putting a _dnslink TXT record on my domain pointing to the directory's current IPFS link, and then my domain's A record is pointing to a service (cloudflare-ipfs.com) that responds to HTTP requests by serving contents from the IPFS link that my _dnslink record points to. I'm using multiple free IPFS pinning services to keep my blog's files seeded on IPFS. I like that I'm not tied down to any of them, and I could easily replace them with other services or my own server without changing any of the rest of the setup. Additionally, anyone that liked my blog could help seed it themselves, so it could outlast the pinning services and me.

Assuming a world where IPFS was commonly supported and my content was well-liked enough to get seeded by others, the only points of failure for keeping my site up would be the domain name staying renewed and the DNS service I'm using staying up. Even if those went down, anyone who still had the last IPFS link to my site could access it through that link for as long as people seeded it.

I believe the Ethereum Name Service would also be a good decentralized alternative to using DNS for keeping updatable URLs pointing to IPFS content, but I haven't personally used it and I don't know if there are good integrations that make it usable for that today. IPFS also has a feature called IPNS for creating mutable links to content that can be updated by whoever owns the private key. That sounds perfect on paper, but it doesn't work well in my experience for a few reasons (latency, timeouts, etc.), so I wouldn't recommend it.

Thanks for all this information! I think I might do the _dnslink TXT record approach you mention for my own website (just a static site), though I'll first need to learn more about DNS and IPFS.