Hacker News new | ask | show | jobs
by burlesona 1680 days ago
The idea that dithering should be used to reduce your image size is a misunderstanding of image compression.

Dithering is a technique which allows you to represent a color image with a very limited palette, in particular a two bit palette. The result is not much like a normal photo, but much better than nothing if you have a machine that can only output a few colors. (You could think of that as a sort of compression where the goal is not to reduce file size, but to reduce the number of colors needed to comprehend the image.)

JPG, WebP etc. are compression techniques designed to reduce the size of a full color images, especially photos. Because they’re specifically designed for photos, they don’t work as well on things that aren’t “photo like.”

Dithered images are very much not like real photos, so it’s not surprising that compression techniques designed for photos don’t work well on dithered images.

(I’m not an expert on image compression, but as an example, I believe JPEG and similar algorithms expect to find large blocks of basically the same color in photos - such as a blue sky - and save space by simplifying that to a few big regions of all one color. The “speckled” appearance caused by dithering actively defeats that particular optimization.)

9 comments

> Dithering is a technique which allows you to represent a color image with a very limited palette, in particular a two bit palette.

To be way more pedantic, dithering is a technique to reduce the quantization error (what happens when you map values from a big, possibly infinite set to a smaller, finite set). This is done whenever a system or algorithm converts data from higher dynamic range representation (more bits per quantum of signal, like a pixel or audio sample) to a lower dynamic range representation, it's called a bit-depth reduction.

And like you alluded to, every compression algorithm that might find it worth it to do this will do it internally. However the benefits are great, since lowering the bit depth has pretty awful results on quality.

Bit-depth reduction is used in practice in a few places, I'm not well versed in image compression but you do see it in telephony.

> This is done whenever a system or algorithm converts data from higher dynamic range representation (more bits per quantum of signal, like a pixel or audio sample) to a lower dynamic range representation, it's called a bit-depth reduction.

You are mixing two things here? Bit depth reduction is a specific term, it refers to the reduction in the amount of information you want to _convey_.

Compression is be reducing the number of bits/pixel averaged across the whole image. Good compression algorithms will not be spatially uniform, it's entirely possible that pixels in some parts of the image are compressed with more bits/pixel than pixels in other parts.

I took care not to conflate the two and to isolate the terms outside the context of compression, and image processing at large.
Dithering can be applied to any palettesd imaged; in the earlier days of the net, restricted color palettes could indeed help reduce size. Dithering however can interfere with run length encoding (RLE) compression used in certain formats (i.e. GIF, or the ancient PCX).

As I understand it, JPEG compression does not play nice with dithering as it is based on a matrix of discrete cosine transformations. Smooth transitions from one color to the next are much easier to compress this way than highly detailed features (i.e. a series of small dots due to dithering). For example, if you blur out parts of a photo, you will likely get a smaller image at the same compression level as the original. In other words, dithering basically creates a much harder image for the JPEG algorithm to compress.

Dithering would, in theory, work well for compression if you use a palette. GIF always does that, PNG also has a palette mode.

JPEG, WebP, and AVIF use various frequency-domain transforms. These work best for smooth color transitions, like in photos. They're generally terrible for sharp edges (as found on screenshots), and especially so when many neighboring pixels have drastically different colors.

> Dithering would, in theory, work well for compression if you use a palette. GIF always does that, PNG also has a palette mode.

Even then that's a bit more complicated because both formats will apply LZ compression (respectively LZW and LZ77 because DEFLATE), so depending on the original source the dithering can work against the compression.

Dithering is a way of mitigating the effects of quantization. Quantization is in fact an effective form of compression, as it reduces bits per pixel. Modern compression techniques for photographic images, however, rely on the coherence of the image (similarity between adjacent pixels) and dithering tends to work against that.

In a purely mathematical sense, though, quantization is very much a form of compression. It's a way of reducing the amount of information in the image. https://twitter.com/gabrielpeyre/status/1326776195107713026?...

Dithering was also a good option for CRT screen as they tended to blend the dithering (more or less depending on the video signal quality). It helped to make transparency or new colors as pointed out.
Exactly. Dithering is not a compression scheme; it's an encoding scheme. It's particularly useful in systems where you have limited sample dynamic range but you can sample much more frequently than the Nyquist criterion. It trades sample range for sample frequency while conveying the same information.

Audio CD players use Sigma-Delta modulation (also called "1-bit D/A converters") which is essentially just dithering in one dimension. But CDs don't contain fewer bits because of this.

Yeah, I thought about this as soon as I saw this article compare dithered JPEGs to non-dithered JPEGs. Dithering is going to amplify the high frequencies in your images. JPEG is not the target type of compression for dithering.
you might get better results with a run-length encoding on a dithered image. I’m not aware of any modern image formats that use RLE though.
I've experimented with that. It only works if you pick you pick a form of ordered dithering, like a Bayer index or Interleaved Gradient Noise[0][1]. Which looks pretty terrible compared to error diffusion dithering approaches like Sierra3. And even if you use ordered dithering it only somewhat helps with compression.

[0]https://bartwronski.com/2016/10/30/dithering-part-three-real...

[1] https://www.iryoku.com/next-generation-post-processing-in-ca...

A dithered image tends to produce very short runs of pixels so is likely to get poor compression with RLE.

Ordered dithering with a compression algorithm that can compress "words" (repeated patterns) (like the LZW in GIF) will perform better if the image has large areas of flat colour (not photos).

PNG uses deflate, which can encode runs of repeating values very efficiently. A run of repeating XYZXYZX… will get encoded as

Literal X, literal Y, literal Z, copy distance=3 length=N

Was going to say, this but tiny correction, its the LZ variant that does this, as will most of them. Deflate is just two pass compression of LZSS followed by Huffman. The LZ variant will pick out repeat patterns, while the second will compress the byte representation of the resulting stream if a particular set of values is over represented. AKA its all variations of X, Y, Z in different orders those might get assigned 2 bit values rather than the 8 they normally might take up.

(so this a bit of a rant I have about people calling deflate a compression algorithm, its an algorithm composed of two other fundamental ones).

What a weird take. Have you written a deflate/inflate implementation? Deflate really isn't just LZSS (I assume you mean something like Haruhiko Okumura's LZSS?), followed by Huffman, it's a very intertwined and sophisticated combination of LZ77 and Huffman. How the two work together is integral to why Deflate works as well as it does.

The optimal parse here isn't always to pick the greediest match from the LZ77 perspective and then "run it through Huffman", you have to know the Huffman cost model when picking your LZ77 matches.

I wrote a zip decompressor many years ago, the compression side wasn't really much of a target, because the focus was on compressing with a more speed focused algorithm. At the time it was a pretty distinct portion of my decompression pass. I didn't know they were picking matches based on the compressed size vs just longest match, but I guess it makes sense, but I don't see why that "intertwines" it anymore than any other adjustment one makes to how one finds matches (which is AFAIK generally the largest change in all the LZ variations).

edit: Just as a note, actually doing it as two distinct passes rather than at the same time would be silly since its going to significantly slow it down. So just because its doing the entire thing as a single "pass" doesn't count IMHO.

I’ve written a deflate decompressor and I’m not sure what correction you are trying to make. It seems like you’re replying to this comment to show off some details you know about deflate.
Maybe just encode it as a bitmap, and then use other types of compression like RLE. Dithering reduces the color palette, so in combination with a bitmap it would reduce the memory footprint of each pixel.
The opposite is likely. RLE works best with long "runs" of the same value. Dithering will tend to break up these regions if it improves quantization error.
PCX was RLE. I remember playing with PCX files in C++ back in the day.
JPEG removes fine detail by transforming 8x8 pixel squares into the frequency domain using a 2d discreet cosine transform and quantizing the coefficients. It also optionally subsamples the chroma. The other things JPEG does are reversible.