Hacker News
by themadprogramer 1430 days ago
You also have to consider how many domains YouTube videos span. Yeah, sure, there are billions of hours of clips that no one person could watch in a single lifetime. But unlike your 2,000-year-old Roman shopping lists, we have footage of events that is anchored to a particular time period. Or location.

One of the most impressive things you can do is search for a landmark. My personal favorite is the [Jumping Stone](https://www.youtube.com/watch?v=u1TtMN8nXTM) on Nias Island in Indonesia. What would otherwise have remained just a novel tourist attraction, forgotten by the modernity of the 21st century, is now essentially a "tag" with hours of footage associated with it: thousands of tourists travelling back and forth, locals growing old, new people being born, buildings being built and demolished around it. You can even study how video quality improved in that particular region. That right there IS something wholly unique to this era and definitely worth preserving. YouTube as a company has figured out the logistics of storing it, but the question of how humans can hope to read such data remains unanswered.

1 comments

Like so many questions of this type, I think the Internet Archive is the answer. They are not a warehouse of stuff stored on media/formats that are forever becoming obsolete -- they store data, keyed by timestamp, sometimes with metadata. How they store it and serve it up is irrelevant, and they will upgrade as needed (I assume this is a continuous process).
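To make the "data, keyed by timestamp, sometimes with metadata" idea concrete, here is a minimal sketch in Python. It is an illustration only, not how the Internet Archive actually stores anything: the `Snapshot` and `TimestampKeyedStore` names, the in-memory dict, and the fixed-width `YYYYMMDDhhmmss` timestamp strings are all assumptions made for the example.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    timestamp: str   # assumed fixed-width "YYYYMMDDhhmmss", so strings sort by time
    payload: bytes   # the captured bytes; the store does not care about the format
    metadata: dict = field(default_factory=dict)  # optional, may stay empty


class TimestampKeyedStore:
    """Hypothetical in-memory store; not the Internet Archive's real design."""

    def __init__(self):
        # key (e.g. a URL) -> list of Snapshot objects sorted by timestamp
        self._snapshots = {}

    def put(self, key, snapshot):
        """Insert a snapshot, keeping the per-key list sorted by timestamp."""
        snaps = self._snapshots.setdefault(key, [])
        stamps = [s.timestamp for s in snaps]
        snaps.insert(bisect.bisect_right(stamps, snapshot.timestamp), snapshot)

    def get_at(self, key, timestamp):
        """Return the latest snapshot at or before `timestamp`, or None."""
        snaps = self._snapshots.get(key, [])
        stamps = [s.timestamp for s in snaps]
        i = bisect.bisect_right(stamps, timestamp)
        return snaps[i - 1] if i else None
```

The point of the sketch is that callers only ever ask for "this key, as of this time"; how the bytes are physically stored can change underneath without the callers noticing.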

If the IA didn't exist, we'd have to invent it.

I suspect if the IA ever decides to properly archive YouTube, they'll interface with the folks that run it directly. Archive Team is, to put it as diplomatically as I can, not a good organization.

I don't think that's a complete answer though. The article lists ContentID as the major reason videos get deleted, meaning that copyright trolls could go after IA if they want to. What we desperately need is copyright law reform.

I'm not 100% positive, but I recall a conversation we had on TheEye a while back with some IA reps, and they simply can't archive YouTube. I recall that a private project to archive just the video metadata ended up in the hundreds of terabytes. The videos themselves must be a gargantuan collection. Most YouTube archiving thus far has been done and maintained by private individuals.

The Internet Archive has a lot of stuff they don’t make available over the internet.

A couple of months ago, they sent me a thumb drive of some stuff I requested (for a nominal processing fee).

I'm surprised to hear that. What's the point of keeping stuff they don't make available? What sort of stuff is this?

Well, they do make it available, just not over the internet, I assume for legal reasons.

In my case it was some TV footage broadcast on the evening of September 10, 2001.

> If the IA didn't exist, we'd have to invent it.

You know, part of the original company vision for YouTube, prior to the Google acquisition, was really something akin to the IA, in that they prided themselves on hosting footage of the Indian Ocean Earthquake:

https://www.youtube.com/results?search_query=Indian+Ocean+ea...

Now, to put it as diplomatically as I possibly can: the kind of direct interfacing between the IA and YouTube that you suggest was, at one point, an ongoing process. But a number of factors have brought direct cooperation between the IA and Google (and thereby YouTube) to a screeching halt.

Right now, we stand at a historical crossroads. And I'm only here to act as a humble messenger ;)