Hacker News
by horsawlarway 2548 days ago
Your vision sounds pretty terrible. At a very cursory glance it seems fine - I don't want my information web to degrade.

Once you dig even a little bit into this, though... you get a quagmire of old and useless information. Outdated sites never go away. They just become a constant burden on the whole system. Information that's been refuted and invalidated years back is still alive and kicking because a domain exists and can be indexed, and there's no chance to ever update it, since the owner is long since dead.

The longer this lasts, the worse the ratio of noise to information gets. It's like having an incredible information web and then giving it dementia.


I think it would be absolutely incredible to peruse old blogs and websites from a century ago. Clippings from 75-year-old newspapers or letters are interesting for plenty of reasons: the cost of things, the news of the day, the writing style. Imagine direct access to this sort of content from 750 years ago!
Someone's going to have to pay for hosting that data, though. The domain name is only a small part of the picture.
That's why we have the Wayback Machine. Or we use another way to archive it.
The Wayback Machine is great, but it's basically a hack. Archiving shouldn't depend on a single centralized entity occasionally crawling the web and saving chunks of it to its archive (but only what it finds during the crawl, and excluding content with large file sizes, such as videos).

It ought to be built into the architecture of the Web, decentralized, immediate, and (at least for small file sizes) on by default. Oh, and censorship-resistant. Even for large file sizes, I think there ought to be some very easy-to-use mechanism to donate either hard disk space or money to publicly archive content of your choice.

Those are lofty goals, of course, but the current web is quite vulnerable to bitrot as it is, and there's no guarantee the Internet Archive will continue to operate indefinitely.

> Archiving shouldn't depend on a single centralized entity

It doesn't. It's decentralized, with lots of archivists; it's just not federated.

Are there any other Web archives with scope comparable to the Wayback Machine? I have not heard of any. I guess there may be private archives which are not publicly known or accessible.
As the Wayback Machine currently operates, the present owner of a domain name can make the archives go away.