Hacker News
by garblegarble 2110 days ago
Would this work for a lossless / near lossless approach by having a final pass storing a delta between the compressed image and the original pixels, or do you think they diverge too much on a purely pixel-for-pixel basis for this to be valuable?
2 comments

The model uses a GAN which does not learn the exact PDF. So not lossless, but as you can see from the images it gets extremely visually accurate results.

From the README

> The generator is trained to achieve realistic and not exact reconstruction. It may synthesize certain portions of a given image to remove artifacts associated with lossy compression. Therefore, in theory images which are compressed and decoded may be arbitrarily different from the input. This precludes usage for sensitive applications. An important caveat from the authors is reproduced here:

> "Therefore, we emphasize that our method is not suitable for sensitive image contents, such as, e.g., storing medical images, or important documents."

As an example of this going wrong before: Xerox once implemented compression that deduplicated similar-looking parts of scanned documents. Numbers obviously contain many repeated symbols (digits). The problem was that the scanner software treated different digits as duplicates of each other and substituted one for another, producing wrong numbers.

http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...

> The model uses a GAN which does not learn the exact PDF. So not lossless, but as you can see from the images it gets extremely visually accurate results.

Yes, I understand this is a lossy compression method - what I was proposing is to have the compressor as a final pass take the predicted output image, and subtract it from the original pixels. This gives you a delta between the predicted image and the original image. You can then compress that delta losslessly, and store it alongside the output of this model - if the predicted image is close enough to the original image then you've significantly reduced the amount of entropy in the delta, making it highly compressible.

This is how some domain-specific lossless compression algorithms work, e.g. DTS-HD Master Audio, which pairs a lossy core stream with a lossless residual.
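
For what it's worth, the residual idea above can be sketched in a few lines. This is only an illustration, not anything from the repo: `encode_residual` and `decode_residual` are hypothetical names, the images are assumed to be flat 8-bit buffers of equal length, and zlib stands in for whatever lossless coder you would actually use on the delta.

```python
import zlib


def encode_residual(original: bytes, predicted: bytes) -> bytes:
    """Losslessly compress the per-sample delta between original and predicted.

    Both buffers are assumed to hold 8-bit samples in the same layout.
    The delta is taken modulo 256 so it fits in one byte per sample.
    """
    if len(original) != len(predicted):
        raise ValueError("original and predicted must have the same length")
    residual = bytes((o - p) % 256 for o, p in zip(original, predicted))
    return zlib.compress(residual, 9)


def decode_residual(predicted: bytes, payload: bytes) -> bytes:
    """Rebuild the exact original from the lossy prediction plus the stored delta."""
    residual = zlib.decompress(payload)
    if len(residual) != len(predicted):
        raise ValueError("payload does not match the predicted image size")
    return bytes((p + r) % 256 for p, r in zip(predicted, residual))
```

If the prediction is close pixel-for-pixel, the residual is mostly zeros and small values and compresses well. If the decoder has synthesized plausible texture that differs pixel-for-pixel from the original, the residual looks like noise and most of the saving disappears.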

Yes, the model is not lossless as this would require learning the PDF in the original input space.

However, the model does learn a conditional probability distribution over a lower-dimensional representation of the original image - this is unavoidable as entropy coding requires a distribution over discrete symbols. The GAN is almost auxiliary and not a central component of the model - in fact, you can get very good results without the GAN, but it does seem to result in visually superior reconstructions.

I suspect if lossless reconstruction was your goal, you would want a different architecture. You would want the model to give you a conditional probability distribution for each pixel, conditioned on all previous pixels, so you could use a regular entropy coder to encode exact data.
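
A toy sketch of why a per-pixel conditional distribution is all an entropy coder needs: the ideal cost of each pixel is -log2 of the probability the model assigned to it. Everything here is an assumption for illustration: `ideal_code_length_bits` is a hypothetical name, and the "model" is a simple adaptive count table conditioned only on the previous pixel, standing in for a network conditioned on all previous pixels.

```python
import math
from collections import defaultdict


def ideal_code_length_bits(pixels, alphabet_size=256):
    """Total bits an ideal entropy coder would spend on `pixels`.

    Toy model: P(x_i | x_{i-1}) from Laplace-smoothed counts that are
    updated only after each pixel is coded, so a decoder could rebuild
    the same probabilities from the pixels it has already decoded.
    """
    counts = defaultdict(lambda: defaultdict(int))  # counts[prev][x]
    totals = defaultdict(int)                       # totals[prev]
    bits = 0.0
    prev = 0  # fixed starting context, known to both encoder and decoder
    for x in pixels:
        p = (counts[prev][x] + 1) / (totals[prev] + alphabet_size)
        bits += -math.log2(p)
        counts[prev][x] += 1
        totals[prev] += 1
        prev = x
    return bits
```

A practical arithmetic coder driven by the same probabilities would land close to this total, so the better the model predicts each pixel, the smaller the exact (lossless) encoding gets.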

As u/londons_explore mentioned, in theory you can train a model for lossless reconstruction - there are several papers about this, e.g. [1] is a good recent example. Lossless compressors need to learn a probability distribution over each input pixel, which amounts to maximum likelihood estimation in the original image space.

The model in the demo is a lossy compression method because it first projects the input to a lower-dimensional space and quantizes this representation to integer values so that the result can ultimately be entropy coded. It uses the mean-scale hyperprior model introduced in [2] to estimate the necessary probability distributions in the lower-dimensional space for entropy coding.
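
A minimal sketch of where the loss comes from, assuming the latent is just a flat list of floats. `quantize` and `empirical_entropy_bits` are hypothetical helpers, and the empirical symbol frequencies below are a crude stand-in for the learned hyperprior, which instead predicts a distribution for each latent element.

```python
import math
from collections import Counter


def quantize(latents):
    """Round each latent value to the nearest integer (the lossy step).

    Note: Python's round() sends exact .5 ties to the nearest even integer.
    """
    return [round(v) for v in latents]


def empirical_entropy_bits(symbols):
    """Average bits per symbol under the empirical symbol frequencies."""
    n = len(symbols)
    if n == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Rounding throws away the fractional part of every latent value, so the decoder can never recover the original exactly. In exchange, the integers form a discrete alphabet that an entropy coder can handle.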

[1]: https://arxiv.org/abs/1811.12817
[2]: https://arxiv.org/abs/1802.01436