by amelius 3301 days ago
Why don't we store a url to the decoder inside images and other compressed files for maximum flexibility? I mean, looking at WASM, sandbox technology is sufficiently strong. And performance and availability aren't really an issue either, because for specific cases we can fall back to what we are doing now.
3 comments

> Why don't we store a url to the decoder inside images and other compressed files for maximum flexibility?

Image formats are already some of the biggest attack surfaces of modern systems, and "executable" image formats (hello PDF) have an absolutely ghastly track record there.

And not being able to open an image if you don't have an internet connection sounds dreadful.

Re: attack surface—PDF is complex because it requires a ton of interaction (you can fill PDF forms, etc.) Image decoders should just be pure functions—binary stream in, matrix of pixel structs out. Easily sandboxed—you shouldn't need access to any system calls while doing that decoding.
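
To illustrate the "pure function" shape described above, here is a minimal sketch in Python. The format is made up for the example (one byte of width, one byte of height, then raw RGB triples), and `decode_toy_image` is a hypothetical name, not part of any real library. The point is only that the decoder takes bytes, returns a matrix of pixels, and touches nothing else.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def decode_toy_image(data: bytes) -> List[List[Pixel]]:
    """Decode a hypothetical toy format: width byte, height byte, RGB triples.

    No file, network, or clock access: the output depends only on `data`.
    """
    if len(data) < 2:
        raise ValueError("truncated header")
    width, height = data[0], data[1]
    # Reject input whose length disagrees with the header before reading pixels.
    if len(data) != 2 + 3 * width * height:
        raise ValueError("payload length does not match header")

    rows: List[List[Pixel]] = []
    offset = 2
    for _ in range(height):
        row: List[Pixel] = []
        for _ in range(width):
            row.append((data[offset], data[offset + 1], data[offset + 2]))
            offset += 3
        rows.append(row)
    return rows
```

A real codec is far more involved, but its interface could in principle stay this narrow, which is what would make it a candidate for running without system calls.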

Re: requiring an Internet connection—you still do require an Internet connection to display an image, if it's in a format whose decoder you don't currently have installed; you just currently have to manually 1. figure out what the format even is, 2. select a library package for a decoder for that format, and 3. install that package.

Presumably such libraries would be able to register decoder-URLs they are (non-reference) implementations of—like they register MIME types today—so your system would be able to display standard formats no problem; it'd just be the weird rare or new ones that would trigger the zero-install process for a decoder.

> not being able to open an image if you don't have an internet connection sounds dreadful

Caching the decoder smells like having already downloaded the image library. Mostly just different in that "first run time" is now always the same thing as "install time".

But, as I already implied, if we can safely run arbitrary "binary" code off the internet (see WASM), why can't we run arbitrary code of a decoder?

The things that make it safe to run that code will significantly negatively impact image decoding speed right now. That's not necessarily a fundamental problem with the universe, but it's a true statement at the moment.

And that's accepting that WASM is safe, which I do not axiomatically accept. History suggests that I am very safe in claiming that most implementations will end up with some catastrophic-level security vulnerabilities in them before all's said and done.

> The things that make it safe to run that code will significantly negatively impact image decoding speed right now.

Okay, but as I said before, for specific cases we can still do things the old way. I.e., upon decoding we detect that the url points to a known format, and we run the fast+safe decoder. For unknown formats, we download the slow decoder and use that. This way, we have more than what we would otherwise have (flexibility). And in the future, WASM will be faster anyway. I only see benefits. Of course, the sandbox should be formally proved correct first.
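
A rough sketch of that dispatch, in Python, might look like the following. Everything here is hypothetical: `DecoderResolver`, `register_native`, and the `fetch_sandboxed` callback are names invented for the illustration, and the actual downloading and sandboxing are left to whatever the caller passes in. It only shows the decision: a known decoder URL maps to a locally installed implementation, while an unknown one is fetched once and cached.

```python
from typing import Callable, Dict

# A decoder here is just "compressed bytes in, decoded bytes out".
Decoder = Callable[[bytes], bytes]


class DecoderResolver:
    """Pick a native decoder for known URLs, else a fetched (sandboxed) one."""

    def __init__(self, fetch_sandboxed: Callable[[str], Decoder]) -> None:
        # fetch_sandboxed is an assumed hook: it downloads the decoder at the
        # given URL and returns it wrapped in some sandbox.
        self._fetch_sandboxed = fetch_sandboxed
        self._native: Dict[str, Decoder] = {}
        self._cache: Dict[str, Decoder] = {}

    def register_native(self, url: str, decoder: Decoder) -> None:
        """Declare a local implementation of the format named by `url`."""
        self._native[url] = decoder

    def resolve(self, url: str) -> Decoder:
        # Fast path: a trusted local decoder already claims this URL.
        if url in self._native:
            return self._native[url]
        # Slow path: download once, then reuse the cached copy (also offline).
        if url not in self._cache:
            self._cache[url] = self._fetch_sandboxed(url)
        return self._cache[url]
```

With this shape, the network is only needed the first time an unfamiliar format shows up, and adding a native decoder later simply moves that URL onto the fast path.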

What's your threshold for significance on image decoding speed? I bet making it 4x slower would be negligible in the vast majority of cases.

The computational power you need for image decoding is extremely narrow and easy to make safe. You need some mathematical operations and some loops. You don't need any APIs or data structures. Mask off all the pointers and you can have a provably safe interpreter/compiler that runs pretty fast.
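
As a small illustration of what "mask off all the pointers" could mean, here is a sketch in Python. `MaskedMemory` is a made-up class for the example, and a real sandbox would apply the same idea in a compiler or interpreter rather than in Python. The assumption is a memory region whose size is a power of two, so that AND-ing any address with `size - 1` always yields an in-bounds index.

```python
class MaskedMemory:
    """Fixed-size byte memory where every address is masked into range."""

    def __init__(self, size_log2: int) -> None:
        self._size = 1 << size_log2      # power of two, e.g. 16 for size_log2=4
        self._mask = self._size - 1      # e.g. 0b1111
        self._buf = bytearray(self._size)

    def load(self, addr: int) -> int:
        # The mask makes an out-of-range address wrap around instead of
        # reaching anything outside the buffer.
        return self._buf[addr & self._mask]

    def store(self, addr: int, value: int) -> None:
        # Values are truncated to one byte; addresses are masked as in load().
        self._buf[addr & self._mask] = value & 0xFF
```

Untrusted decoder code running against such a memory can corrupt its own output, but it has no address it can form that lands outside the buffer.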

Remember that 4x slower means (at least) 4x worse battery drain on a phone. In the modern internet there basically is never any excuse to waste resources.

But it's also on a phone where saving data transfer can give you huge battery benefits. And it's not intentional waste; having this fallback doesn't stop browsers from adding native decoders.

We've already implemented this as part of a web-based AV1 analyzer tool. You provide the URL to both the image and the decoder, which is very convenient because the decoder is constantly changing [1].

Wikipedia also does this to enable video playback on Edge and Safari [2].

The idea to do this within the format itself is quite old. The very early versions of what became Vorbis had a very similar concept, with programmable passes. Matroska once had a field for a link to download the VfW decoder dll (the horror!). It's never really caught on, due to reasons already listed by sibling comments.

[1] https://arewecompressedyet.com/analyzer/?maxFrames=4&decoder...

[2] https://github.com/brion/ogv.js/

That's actually not a bad idea!

The HEIF examples on the Nokia HEIF site do something similar; they implement an HEVC decoder (and do the HEIF container processing) all in JavaScript using http://www.libde265.org. The images you see are decoded into an HTML5 canvas.