Hacker News
by niftich 3306 days ago
The problem is that content delivered over the web "leaks out" in ordinary usage and interfaces with the outside world unless DRM prevents it. People save files, sometimes using contrived workflows, and content that was transferred to a browser for display (such as WebP images) will eventually be downloaded, shared, and reposted somewhere else. Facebook once famously switched to serving most images as WebP and had to back out because people were downloading images only to end up with .webp files that other programs did not support [1].

I wrote about this earlier, in the thread 'Ask HN: Software for writing a diary that will be around in 20 years from now' [2]:

I'd be wary of the archival potential of formats that are solely used on the Web with little usage on tangible physical hardware by major commercial publishers -- the Web of today moves very fast and technologies come and go. Google is pushing WebP and WebM, but work is already under way on a big consensus format called AV1. When AV1 comes out, new VP8/VP9 content will likely no longer be produced. Browsers periodically prune older features; 20 years is almost as long as the web has been around, and given enough time, support for the format may only be available in software that makes format coverage an explicit goal (ffmpeg, libav, VLC). Opus is being made a mandatory audio codec for WebRTC, but teleconferencing is usually ephemeral -- will there be lots of .opus files sitting on disks in the future? Too early to tell, not worth gambling on.

That comment was about choosing formats expressly for long-term archival, but our incidental use of formats today will shape which files are still naturally around in the future.

[1] https://news.ycombinator.com/item?id=5589206
[2] https://news.ycombinator.com/item?id=12979854#12980359

1 comment

This is why many people consider only uncompressed or lossless formats viable for archival use. Doubling down on those doesn't really address what you were talking about, which is something made today remaining viewable in the browsers of the future. But if the original content is preserved at full quality, the 'deliverable' formats can be regenerated periodically from the original masters, so the content stays viewable with software 20 years from now without degrading in quality each time.

We had similar problems in the past with physical media, and still do to some extent, but in the purely digital domain the problem is somewhat more tractable. Content negotiation, for example, helps with these transitions for those willing to work it into their build pipelines and maintain it over time.
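To make the content negotiation point concrete, here is a rough sketch of how a server might pick a deliverable image format from the request's Accept header. The function names, the format list, and the JPEG fallback are all assumptions for illustration, not anything from a particular server or framework. It deliberately ignores wildcards such as image/* and only serves a newer format when the client names it explicitly, since a wildcard says nothing about whether the client can decode WebP or AVIF.

```python
def parse_accept(header: str) -> dict[str, float]:
    """Parse an HTTP Accept header into {media_type: q_value}."""
    prefs: dict[str, float] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # q defaults to 1 when the client does not specify it
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type.lower()] = q
    return prefs


def choose_format(accept_header: str, available: list[str], fallback: str) -> str:
    """Return the first format in `available` (server preference order)
    that the client lists explicitly with q > 0, else `fallback`.

    Wildcards (image/*, */*) are ignored on purpose: they do not tell us
    the client can actually decode a newer format.
    """
    prefs = parse_accept(accept_header)
    for media_type in available:
        if prefs.get(media_type, 0.0) > 0:
            return media_type
    return fallback


# Hypothetical deliverables, all generated from the same lossless master.
NEWER_FORMATS = ["image/avif", "image/webp"]
print(choose_format("image/avif,image/webp,image/*,*/*;q=0.8", NEWER_FORMATS, "image/jpeg"))
print(choose_format("image/png,image/*;q=0.8", NEWER_FORMATS, "image/jpeg"))
```

A server doing this should also send `Vary: Accept` so caches keep the variants apart. When a format falls out of favor, only the list of deliverables and the build step that produces them from the masters need to change; the URLs people have already shared stay the same.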