by 885895 3882 days ago
Interesting write-up. Still, why even support BMP?
9 comments

From the article:

> On the web it’s mostly used on the web for favicons though it can be used for normal images.

Basically, the .ico format often used for favicons comes from Windows icons. The first favicon implementation originated in Internet Explorer, and MS supported .ico so people could reuse existing application icon files (though IIRC .gif files and other formats supported in web pages could be used from the start too). Windows icons are a variant of the .bmp format with a few extra features, such as containing different images for different sizes, so a single file can contain separately optimised 16x16 and 32x32 bitmaps.
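
To make the "several sizes in one file" point concrete, here is a rough Python sketch that lists the images inside an .ico file by reading its directory. The function name `list_ico_images` is made up for illustration, and it only reads the directory entries, not the bitmap data itself.

```python
import struct

def list_ico_images(data: bytes):
    """Return (width, height, bit_count, size, offset) for each image in an ICO file."""
    # 6-byte ICONDIR header: reserved (must be 0), type (1 = icon), image count.
    reserved, kind, count = struct.unpack_from("<HHH", data, 0)
    if reserved != 0 or kind != 1:
        raise ValueError("not an ICO file")
    entries = []
    for i in range(count):
        # Each 16-byte ICONDIRENTRY follows the header: width, height,
        # palette size, reserved, planes, bits per pixel, data size, data offset.
        w, h, _colors, _res, _planes, bpp, size, offset = struct.unpack_from(
            "<BBBBHHII", data, 6 + 16 * i)
        # A stored width or height of 0 means 256 pixels.
        entries.append((w or 256, h or 256, bpp, size, offset))
    return entries
```

Calling it on the bytes of a typical favicon.ico should give one tuple per embedded size, e.g. one for 16x16 and one for 32x32.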

I assume .ico and .bmp files are fairly rare now, presumably completely unheard of for new sites/apps, but you can't drop support because there are enough of them out there on legacy sites that it would break things people care about.

IE only supported .ico favicons for a long time.
I've just looked it up (assuming https://en.wikipedia.org/wiki/Favicon#File_format_support is an accurate reference here) and was surprised that it seems IE11 was the first to support other formats.

I have kept our icons in .ico format but I thought that was to support proper legacy IE (5 and below). Where I have used other formats I presumably didn't notice the icon not working in IE8/9/10 because I generally only care about Firefox & Chrome and worry about IE10/11 breaking when someone reports a problem [if someone reports a problem in IE prior to 10 outside of my day job (where I have to support IE8 for some rather backward clients) I consider that their problem because of their browser choice, and even 10 is rapidly dropping off my care radar].

Why not? It's an old, well-supported, relatively-well-understood, simple to emit file format that Just Works. It's the lowest common denominator for 24-bit graphics interchange.

The only real alternative to it (== simple uncompressed bitmaps) that I can think of is .tga, but that hasn't penetrated the Internet as well as BMP has.

Why would you ever need uncompressed bitmaps?

Also - all code exposes security risks. Especially image decoders written in C/C++.

> Why would you ever need uncompressed bitmaps?

Because they're incredibly easy to generate programmatically. For example, here's a bunch of experiments I did to test the code and image embedding process for my site:

http://chriswarbo.net/essays/procedural

The snippets of Haskell code on those pages are executed when the page's Markdown is rendered, and the images are the results. To make this work, each function maps x and y pixel coordinates to either a Bool (for b/w), an Int (for greyscale) or an (Int, Int, Int) tuple (for RGB). These are trivially converted into strings of PPM image data (via [1]), rendered to PNG (via [2]) and embedded into the page as a data URI (via [3]). The "view source" links show all the code, although it's a bit convoluted ;)

Whilst this example is just for experimentation, I could imagine someone generating raw bitmaps in a monitoring situation, for example.

[1] http://chriswarbo.net/git/chriswarbo-net/branches/master/sta... [2] http://chriswarbo.net/git/chriswarbo-net/branches/master/sta... [3] http://chriswarbo.net/git/chriswarbo-net/branches/master/sta...
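
For anyone curious how little code the "function from (x, y) to a colour" approach needs, here is a minimal sketch in Python rather than the Haskell used on the site. It is not the linked code: `render_ppm` and `xor_pattern` are names invented for this example, and it stops at the PPM text, leaving out the PNG conversion and data URI steps.

```python
def render_ppm(pixel, width, height):
    """Render pixel(x, y) -> (r, g, b), each 0..255, as plain-text (P3) PPM."""
    # Header: magic number, dimensions, maximum channel value.
    lines = ["P3", f"{width} {height}", "255"]
    for y in range(height):
        row = (pixel(x, y) for x in range(width))
        # One line of text per scanline, three numbers per pixel.
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"

def xor_pattern(x, y):
    """Example pixel function: greyscale XOR texture."""
    v = (x ^ y) & 0xFF
    return (v, v, v)

if __name__ == "__main__":
    with open("xor.ppm", "w") as f:
        f.write(render_ppm(xor_pattern, 256, 256))
```

The whole "encoder" is string formatting, which is the appeal: there is no compression state to manage, and any function of the coordinates can be dropped in.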

Off-topic: I produced some similar-ish visualisations (also with Haskell).

http://paquari.com/qsort2000.png

Quicksort of 2000 random elements, (x,y) is black iff the algorithm compares x and y. You can clearly see how the pivots get compared to all the other elements.

http://paquari.com/msortOrig2000.png

A similar picture for merge-sort, but here (x,y) is black iff the algorithm compares the elements at original positions x and y.

http://paquari.com/msort2000.png

Mergesort, but colouring like in the Quicksort case.
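
A rough idea of how the quicksort picture could be reproduced, sketched in Python. This is a guess at the approach, not the Haskell that produced the images: it assumes distinct integer values, uses the first element as the pivot, and the name `quicksort_comparisons` is invented here.

```python
def quicksort_comparisons(values):
    """Sort with a first-element-pivot quicksort.

    Returns (sorted_list, compared), where compared is the set of
    (a, b) value pairs the algorithm compared, stored symmetrically.
    """
    compared = set()

    def sort(xs):
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        smaller, larger = [], []
        for v in rest:
            # Every remaining element is compared against the pivot once.
            compared.add((pivot, v))
            compared.add((v, pivot))
            (smaller if v < pivot else larger).append(v)
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(values)), compared
```

Given a shuffled list of 0..n-1, pixel (x, y) is painted black when `(x, y)` is in the returned set. Each pivot contributes a full row and column segment, which would explain the cross shapes in the image.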

Some libraries output them, because the code for writing them is very simple, and you don't need any separate area of memory for managing the compression or whatever.
It's much faster to write. So it's useful for e.g. game screenshots.
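
To show how simple the writing side is, here is a sketch of a complete 24-bit uncompressed BMP encoder in Python. `encode_bmp` is a name made up for this example, and the 2835 pixels-per-metre resolution (roughly 72 DPI) is an arbitrary choice.

```python
import struct

def encode_bmp(pixels):
    """Encode rows of (r, g, b) tuples (top row first) as a 24-bit BMP."""
    height = len(pixels)
    width = len(pixels[0])
    # Each row is padded with zero bytes to a multiple of 4 bytes.
    padding = (-width * 3) % 4
    body = bytearray()
    # Rows are stored bottom-up, with channels in B, G, R order.
    for row in reversed(pixels):
        for r, g, b in row:
            body += bytes((b, g, r))
        body += b"\x00" * padding
    offset = 14 + 40  # file header + BITMAPINFOHEADER
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(body), 0, 0, offset)
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40, width, height,  # header size, dimensions
        1, 24,              # colour planes, bits per pixel
        0, len(body),       # compression (0 = none), pixel data size
        2835, 2835,         # horizontal/vertical resolution (assumed)
        0, 0)               # palette size, important colours
    return file_header + info_header + bytes(body)
```

Two fixed-size headers and a loop: no working buffers, no entropy coding, no checksums. That is why a screenshot function can afford to do this between frames.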
https://en.wikipedia.org/wiki/ICO_%28file_format%29 is used for favicons and can contain BMP.
Good question - it's obviously still widely enough used that the need to decode the format lives on.

Many years ago I wrote the code to support PBM/PGM/PPM images for Mozilla (https://bugzilla.mozilla.org/show_bug.cgi?id=117983). They never got that much traction on the web though, and so support for that image format was dropped: <https://bugzilla.mozilla.org/show_bug.cgi?id=197530>.

I think a more pertinent question would be: why have the browser decoding images at all, rather than loading a dedicated library?

The points mentioned in the article about streaming data and untrustworthy images make sense, but would be useful even outside the context of a browser rendering engine (e.g. many local image files will have come from the Web, so are just as untrustworthy).

I remember when Mosaic used to launch an external program like 'xv' to display images. However in answer to your question, it's not clear that having a library would add any security, since libraries run in the same context as the rest of the browser (especially on Firefox where the default is still to have everything running in a single process). You could have an external program to do rendering, but embedding the output of external programs into the rest of the output is very inflexible, hard to do, and (with the design of X) not actually any more secure.
> it's not clear that having a library would add any security

I don't think that's the main benefit of using/developing an external library for this. To me, using an external library sounds like a good idea because:

- You get code separation. A bit less code on the main Firefox repo, and a repo dedicated only to BMP decoding (or image files in general). Someone who wants to contribute to the BMP decoder wouldn't need to download the whole FF repo and understand/configure its build system. Big plus!

- The library can be shared among different applications. It doesn't make sense for each browser to have a different implementation of BMP decoding, each with their own bugs. Sharing a library for this kind of stuff would actually benefit security, as a bug fixed by one browser/app developer would benefit the others.

That last one is the biggest thing for me. The BMP example is a very simple one, and not very important. It is in more complex tasks that I think sharing libraries would be much more beneficial.

For example, wouldn't it be great if, instead of duplicating so much effort in implementing the streaming capabilities of Media Source Extensions [1], the different browsers shared a library dedicated to that complex task? We could have had a more complete and robust implementation in less total time! And that's just one example; there are tons of complex things browsers do that could be extracted to separate shared libraries.

[1]: And so many bugs https://bugzilla.mozilla.org/show_bug.cgi?id=778617

> code separation

This is incredibly important for security. Reducing complexity and possible attack surface even between components is something that has been ignored in software for far too long.

Crypto shouldn't ever be in the same address space as any process that also does parsing or network I/O. That's just asking for the keys to be leaked (or other problems) when the inevitable bug is found.

I completely agree with this reasoning, and those are pretty much the reasons why Servo's developed in such a modular way (in contrast to most other browser engines).
The obvious answer is that the library may not do what you want. Some examples of what browsers want to do with images in different circumstances:

1) Just determine some metadata about the image (size and a few other things) but don't decode the pixel data yet.

2) Asynchronously decode the pixel data on a separate thread, notify when done.

3) Synchronously decode the pixel data on the current thread.

4) Asynchronously decode the pixel data on a separate thread, notify as you go. This can have variants along the lines of "Progressive JPEG" as opposed to just delivering more scanlines.

Now ideally a good library would in fact let you do all of this as needed (e.g. Firefox just uses libpng for PNG decoding, though even there it's patched locally to add APNG support). Not all image formats have good libraries.
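
As an illustration of case 1 above (metadata only, no pixel decode), here is a Python sketch for BMP. It is not how any browser does it; `bmp_dimensions` is a hypothetical helper that needs only the first 26 bytes of the stream.

```python
import struct

def bmp_dimensions(header: bytes):
    """Return (width, height) from the first 26 bytes of a BMP file."""
    if len(header) < 26 or header[:2] != b"BM":
        raise ValueError("not a BMP, or header incomplete")
    # After the 14-byte file header: info header size, width, height.
    header_size, width, height = struct.unpack_from("<Iii", header, 14)
    if header_size < 40:
        # The old 12-byte BITMAPCOREHEADER uses 16-bit fields instead.
        raise ValueError("older header layout not handled in this sketch")
    # A negative height marks a top-down bitmap; the magnitude is the row count.
    return width, abs(height)
```

A layout engine could call something like this as soon as the first network packet arrives and reserve space for the image, deferring the pixel decode (cases 2 to 4) until later. A library that only offers "decode the whole file" makes that impossible.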

Because (a) the browser can't guarantee to support image formats, since it depends on what's available locally and (b) a flaw in an image decoder on one OS is essentially unfixable by the browser creator. Today they can roll out a new version which patches that flaw.
Might be nice if this code was packaged up as a library.
> why have the browser decoding images at all, rather than loading a dedicated library?

Because a dedicated library could change its API in version 2.0, and at the same time fix important security flaws. At that point your only options are: switch to a different library, write your own library, put the old API back in the original library, or ask the maintainers of the library. All of those options are difficult or unreliable, especially given the small timeframe in which it all should happen.

> Because a dedicated library could change its API in version 2.0, and at the same time fix important security flaws.

The correct thing to do is back-port fixes to the 1.x branch, or come up with an alternative fix if the 1.x/2.x transition changes too much (in the latter case, the 1.x and 2.x versions would essentially be different libraries which just-so-happen to share the same name). Anyone can (attempt to) do this patch, including the library authors, the browser authors (who may be the same people), or any other users of the library.

If upstream don't accept such patches, and repeatedly indulge in such uncooperative behaviour, there is always the option to fork (and, in the process, perhaps strip out the parts which the browser doesn't need to make maintenance easier).

As an aside, the situation you describe sounds a lot like the Firefox/Iceweasel drama in Debian!

> Firefox/Iceweasel drama

Was it a drama though? AFAIK it was pretty much drama free. Debian wanted to backport security fixes themselves but it wouldn't have complied with Firefox's trademark policy. So they just changed the name/icon and didn't make a big deal out of it.
The end user doesn't care about the correct thing, they just want their web browser to work and to not break any other applications in their operating system. Working and not breaking other stuff is 'the correct thing', being secure is also 'the correct thing' so the culmination of these two leads to 'the browser contains all the libraries it needs'. This whole forking crap is exactly the last thing the end user wants or needs to care about.
> The end user doesn't care about the correct thing

I think both of our definitions of "correct" coincide.

The reason to use libraries is precisely so that the applications will work, not break each other, be secure, etc.

The article talks about different browsers supporting different sub-sets of possible BMPs; from a user perspective, I wouldn't call that working.

> the culmination of these two leads to 'the browser contains all the libraries it needs'

This doesn't conflict with what I said. Write as many libraries as you like. Grab as many third-party ones as you want. Use OS-provided ones if you prefer that. Bundle them with your browser, I don't care. Statically link the binaries if it makes things easier. Keep copies of the libraries in your browser's source control if you think that's best. I wasn't addressing any of that.

My point was to use libraries; regardless of who writes them, how they're distributed, how they're linked, etc.

> This whole forking crap is exactly the last thing the end user wants or needs to care about.

Erm, exactly? Nobody likes to fork a project. It's a last resort for bad situations. End users certainly do care when devs steer a project off course; hell, just look at the outcry that happens whenever some social media site changes its colour scheme a little!

Fewer dependencies usually means less bloat.
Firefox is still offered as a "portable" archive download.

Meaning you can download said archive, extract it onto some storage media, and run it from there on a compatible platform.

Backwards compatibility. There are too many of them around.
.ico files (as used for favicons) typically contain BMPs.
I'd like to see a breakdown of file types at e.g. imgur.com - I'm guessing more than 0% are bmp images, and there isn't a compelling reason for a browser to not be able to view those.
Because some old websites serve BMP images?