Hacker News
by throwawaywego 2496 days ago
> Why Not Lossy Compression?

> Although humans cannot compress losslessly, they are very good at lossy compression: remembering that which is most important and discarding the rest. Lossy compression algorithms like JPEG and MP3 mimic the lossy behavior of the human perceptual system by discarding the same information that we do. For example, JPEG codes the color signal of an image at a lower resolution than brightness because the eye is less sensitive to high spatial frequencies in color. But we clearly have a long way to go. We can now compress speech to about 8000 bits per second with reasonably good quality. In theory, we should be able to compress speech to about 20 bits per second by transcribing it to text and using standard text compression programs like zip.

> Humans do poorly at reading text and recalling it verbatim, but do very well at recalling the important ideas and conveying them in different words. It would be a powerful demonstration of AI if a lossy text compressor could do the same thing. But there are two problems with this approach.

> First, just like JPEG and MP3, it would require human judges to subjectively evaluate the quality of the restored data.

> Second, there is much less noise in text than in images and sound, so the savings would be much smaller. If there are 1000 different ways to write a sentence expressing the same idea, then lossy compression would only save log2 1000 = about 10 bits. Even if the effect was large, requiring compressors to code the explicit representation of ideas would still be fair to all competitors.

http://mattmahoney.net/dc/rationale.html
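
For anyone who wants to check the arithmetic in the quote, here is a rough Python sketch. The log2 part follows directly from the quoted text. The speech part is only my guess at how a figure near 20 bits per second could arise: the speaking rate, characters per word, and bits per character are illustrative assumptions of mine, not numbers from the linked page, and the function names are made up for this example.

```python
import math


def bits_saved_by_paraphrase(num_equivalent_phrasings: int) -> float:
    """Bits needed to pick one phrasing out of n equally likely ones.

    This is the most a lossy compressor could save by not recording
    which of the n equivalent phrasings was actually used.
    """
    return math.log2(num_equivalent_phrasings)


def text_bitrate(words_per_minute: float,
                 chars_per_word: float,
                 bits_per_char: float) -> float:
    """Bits per second needed to transmit speech as compressed text."""
    chars_per_second = words_per_minute * chars_per_word / 60.0
    return chars_per_second * bits_per_char


# From the quote: 1000 ways to write the same sentence.
paraphrase_bits = bits_saved_by_paraphrase(1000)

# Assumed values (not from the source): 150 words per minute,
# 6 characters per word including the space, 1.3 bits per character
# after compression.
speech_bits_per_second = text_bitrate(150, 6, 1.3)

# From the quote: current speech coding at about 8000 bits per second.
ratio_vs_current = 8000 / speech_bits_per_second

print(f"paraphrase savings: {paraphrase_bits:.2f} bits per sentence")
print(f"speech as text: {speech_bits_per_second:.1f} bits per second")
print(f"gap to 8000 bits per second: about {ratio_vs_current:.0f}x")
```

With those assumed inputs the text bitrate lands close to the quoted 20 bits per second, and the paraphrase savings come out just under 10 bits per sentence, which is tiny next to the size of the sentence itself. Change the assumed inputs and the bitrate moves proportionally, so treat it as an order-of-magnitude check only.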