by sgtnoodle 1729 days ago
As a human analog, just think about how you would write down instructions for someone to recreate the image.

For the ordered image: "Make a 16 by 16 grid of boxes, each getting redder from left to right and top to bottom. Inside each box make a 16x16 grid of boxes getting greener, and inside each of those boxes make a 16x16 grid of pixels getting bluer". Done.

Vs. the random image: "Make a blueish red pixel with a bit of green. Then make a reddish blue pixel with a moderate amount of green. Then make a brownish pixel. Then make a greenish pixel..." That will go on for about 16 million sentences!
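
To make the contrast concrete, here is a rough sketch that hands both kinds of data to a general-purpose compressor. Everything in it is an assumption for illustration: a scaled-down stand-in for the ordered image (16 levels per channel, so 4096 pixels instead of 16 million), `os.urandom` as the random image, and zlib rather than a real image codec. zlib only sees a stream of bytes, so it will catch just part of the regularity that the one-sentence description above captures.

```python
import os
import zlib

def ordered_image_bytes(levels=16):
    """Raw RGB bytes for a small ordered image: every color once, in order."""
    step = 256 // levels
    out = bytearray()
    for r in range(levels):
        for g in range(levels):
            for b in range(levels):
                out += bytes((r * step, g * step, b * step))
    return bytes(out)

ordered = ordered_image_bytes()
noise = os.urandom(len(ordered))  # stand-in for the random image

ordered_size = len(zlib.compress(ordered, 9))
noise_size = len(zlib.compress(noise, 9))

print("raw bytes:         ", len(ordered))
print("ordered compressed:", ordered_size)
print("random compressed: ", noise_size)
```

The ordered buffer should come out noticeably smaller than the raw data, while the random one should not shrink at all: there is no shorter way to describe it than listing every pixel.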

Image compression is simply coming up with a language that's good for describing images, and then an algorithm for writing succinct descriptions in that language. Rather than being human-friendly languages (a modest number of complex words made from an alphabet), they are computer-friendly languages (a ton of simple words made from 0s and 1s).

JPEG compression separates the image into "component" images. One component is brightness (luma), and the other two carry the color information (chroma, stored as blue-difference and red-difference signals rather than literal hue and saturation). That split maps nicely onto how human vision works: brightness needs high resolution and precision, while color can get by with much lower resolution. Therefore, each component can be compressed differently and independently of the others. The component images are further broken up into a grid of 8x8 boxes, and each box is approximated by a weighted sum of reference box images (the encoder and decoder share a fixed dictionary of 64 reference boxes, the basis patterns of the discrete cosine transform). For each box, only the weights are saved. The weights themselves can have varying precision, and that's basically what you're controlling when you set the "JPEG quality". Higher-quality JPEGs keep more precision in their weights, and lower-quality JPEGs keep less.
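
The "weighted sum of reference boxes" step can be sketched in a few lines. This is a simplified illustration, not a JPEG encoder: it assumes the reference boxes are the 8x8 discrete cosine transform (DCT) patterns used by baseline JPEG, and it replaces JPEG's per-frequency quantization table with a single hypothetical `step` value that plays the role of the quality setting. The names `reference_box`, `encode`, `decode`, and the sample `box` are made up for the example, and the real format's level shift and entropy coding are left out.

```python
import math

N = 8  # JPEG works on 8x8 boxes

def reference_box(u, v):
    """8x8 cosine pattern with frequency index (u, v): one dictionary entry."""
    cu = math.sqrt((1 if u == 0 else 2) / N)
    cv = math.sqrt((1 if v == 0 else 2) / N)
    return [[cu * cv
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for y in range(N)] for x in range(N)]

# The shared dictionary: 64 reference boxes known to both encoder and decoder.
DICTIONARY = {(u, v): reference_box(u, v) for u in range(N) for v in range(N)}

def encode(box, step):
    """Return one integer weight per reference box; a bigger step means less precision."""
    weights = {}
    for key, ref in DICTIONARY.items():
        w = sum(box[x][y] * ref[x][y] for x in range(N) for y in range(N))
        weights[key] = round(w / step)
    return weights

def decode(weights, step):
    """Rebuild the box as the weighted sum of the reference boxes."""
    box = [[0.0] * N for _ in range(N)]
    for key, q in weights.items():
        ref = DICTIONARY[key]
        for x in range(N):
            for y in range(N):
                box[x][y] += q * step * ref[x][y]
    return box

def max_error(a, b):
    """Largest per-pixel difference between two boxes."""
    return max(abs(a[x][y] - b[x][y]) for x in range(N) for y in range(N))

def nonzero(weights):
    """How many weights actually need to be stored."""
    return sum(1 for q in weights.values() if q != 0)

# A smooth gradient, the kind of content that dominates most photos.
box = [[8 * x + 4 * y for y in range(N)] for x in range(N)]

for step in (1, 40):  # small step ~ high quality, large step ~ low quality
    weights = encode(box, step)
    err = max_error(box, decode(weights, step))
    print(f"step={step}: {nonzero(weights)} nonzero weights, max error {err:.2f}")
```

On a smooth box like this one, the coarse step should leave only a handful of nonzero weights to store, at the cost of a larger reconstruction error. That trade is what the quality slider is adjusting.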