Hacker News new | ask | show | jobs
by enobrev 1595 days ago
I was thinking of this generally. I've built hundreds of websites and while many had plenty of similarities, there would still be a plethora of databases and languages to support. And that's just my own history. There are quite a few developers, languages, data formats, services, etc.

And so the system that can run any "boxed" website - as in a working archive - would need to support all of these things, as they are today, in some form or would need to emulate / translate these things.

But instead of normalizing and archiving, my off-the-cuff suggestion is basically to freeze it in time. OS, languages, databases, services, data, code, everything is exactly as it was last built. Ideally, provided the storage medium held all its bits and the power source were compatible, it should boot up in 200 years without issue.

In a sense this is what we've accomplished with VMs and containers. But those still require an underlying system. If the system is part of the medium, you only need power and a console.

1 comments

> there would still be a plethora of databases and languages to support

I don't think you absorbed the point I was going for there.

It sounds like you're prioritizing the website as an entity unto itself above the content. You're not thinking of the content as individual entities, and the website merely being a place where these entities are collected—like books on a bookshelf. That first false step in the conceptualization of this stuff is I believe to be _the_ reason that websites are "hard" to make manageable for a non-technical audience. All the implementations of "Information Management: A Proposal" have been executed poorly because the thing that TBL was outlining was obviously going to require computer systems, and when computer systems are involved you get the kinds of people who work on computer systems and their (largely self-serving) tendencies—which a lot of the time involves making everything in that system another system.

Do you need to support a plethora of databases and languages to maintain the other books on your bookshelf—that is, the dead-tree ones? Even for digital artifacts... the music you listen to, do you encode those as programs with unbounded power that you execute every time you want to hear a song? Or do you restrict yourself to relying on "dead" formats for what is sufficiently "dead" content, and then offload the need for computation to a separate machine that operates across a well-defined interface and that exists outside the content itself—a music player app? When it comes to backups, do you exclusively use self-extracting archives instead of something reliable like a tarball or a ZIP? Why not?

At the end of the day, websites are really just about pieces of content and the names we assign to them. It's the entanglement of all the unnecessary violations of the Principle of Least Power in the form of live systems and running programs that has made this stuff hard to manage (often not just for non-technical people, but programmers and sysadmins, too).

Thanks for your thorough response. You've made your point well.

As I gather it, and I realize I'm simplifying, you're suggesting that websites need to be compiled for consumption just as a studio recording would be. I'm suggesting otherwise: That it's too varied of an experience to be compiled sufficiently across all the different ways in which the web is composed and consumed.

The content is not alone in what needs to be archived in order for a website (or any software) to live beyond just today. We don't _only_ read websites. We experience them in different ways depending on the purpose.

I have 30 tabs open right now: HN, google music, wordle, twitter, github, docker, nytimes, aws console, a couple blog posts, some software documentation, a couple pricing pages, and google maps open. Most of these things are not even remotely related in functionality, and we're supposed to be able to archive them into some similar format? For what purpose?

Much like a game, the experience of the content can be inherent in the content itself. But games are written to the same system. NES emulators and PS emulators and so on are required to enjoy game archives for systems that are now gone. My suggestion is to skip the emulator and include the environment with the content - which allows every piece of software to exist as it were originally conceived.

I could export a list of places I've been from google, but that's not useful to me in the same way without an interactive map to see the places therein. Same goes for foursquare and yelp and anything else that I allow to track my location for some purpose.

It would be incredible to be able to click around a static localized mostly-functional read only version of slashdot or fark or digg or prodigy or compuserve or geocities, or more recently HN or reddit. The UI for these applications is a _part_ of the content. Because the features are included in the content. Upvotes and Downvotes on HN are not the same as they are on slashdot, because slashdot's came with reasons. How does one experience that without the actual software? And how do we compile those software onto a single system that we'll be able to just load from nothing?

Do we _need_ to support a plethora of databases for books? No. Because they're books. Static, unsearchable, non-functional. Amazing in their simplicity. Perfect for what they are.

Software isn't that, nor should it be.

This isn't even a new suggestion. We used to have arcades - which were filled with separate systems built for single games, and each one was the size of a closet, including a screen and a UI. You can buy a 50 year old arcade game and, provided it was taken care of, it'll still boot up and work as it did in 1972.

Now we can get an entire system, with any software, and any database, onto a computer small enough that you could put hundreds of them on a single bookshelf. We could have terminals at libraries that you could plug these cartridges into and consume the information therein, and interact with them, in the original format.

I'm definitely not saying its the answer and should be done - only that it would be reasonable to accomplish as the world is today.